Characterization of Phone Rate as a Vocal Biomarker of Depression
Description
Quantitative approaches to psychiatric assessment that go beyond the qualitative descriptors in the Diagnostic and Statistical Manual of Mental Disorders could transform mental health care. However, objective estimation and tracking of neurocognitive state demand robust, scalable indicators of a disorder. A person's speech is a rich source of neurocognitive information because speech production is a complex sensorimotor task that draws upon many cortical and subcortical regions.
Furthermore, the ease of collection makes speech a practical, scalable candidate for assessment of mental health. One aspect of speech production that has shown sensitivity to neuropsychological disorders is phone rate, the rate at which individual consonants and vowels are spoken. Our aim in this thesis is to characterize phone rate as an indicator of depression and to improve our use of phone rate as a feature through both brain imaging and neurocomputational modeling.
This thesis proposes that psychiatric assessment can be enhanced using a neurocomputational model of speech motor control to estimate unobserved parameters as latent descriptors of a disorder. We use depression as our model disorder and focus on motor control of speech phone rate.
First, we use functional magnetic resonance imaging to investigate the neural basis for phone rate modulation, both in healthy subjects uttering emotional sentences and in subjects with depression. Then, we develop a computational model of phone rate to estimate subject-specific parameters that correlate with individual phone rate variability. Finally, we apply both model-based and model-free features derived from speech to distinguish depressed subjects from healthy controls. The framework we have developed provides future avenues for quantitatively understanding differences in speech production under neuropsychological disorders.
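
To make the definition of phone rate concrete, the following is a minimal sketch of how it might be computed from a phone-level alignment of a recording. It is an illustration only, not the method used in the thesis: the interval format `(label, start, end)`, the pause labels in `SILENCE_LABELS`, the function name `phone_rate`, and the `include_pauses` switch are all assumptions made for the example.

```python
from typing import Iterable, Tuple

# Assumed input format: one (label, start_seconds, end_seconds) tuple per aligned segment.
Interval = Tuple[str, float, float]

# Assumed pause markers; real aligners may use different labels.
SILENCE_LABELS = {"", "sil", "sp"}


def phone_rate(intervals: Iterable[Interval], include_pauses: bool = True) -> float:
    """Return phones per second for one utterance.

    With include_pauses=True the denominator is the time from the first
    phone onset to the last phone offset, so pauses slow the rate.
    With include_pauses=False only time spent on phones is counted.
    """
    phones = [(start, end) for label, start, end in intervals
              if label not in SILENCE_LABELS]
    if not phones:
        return 0.0

    if include_pauses:
        duration = max(end for _, end in phones) - min(start for start, _ in phones)
    else:
        duration = sum(end - start for start, end in phones)

    if duration <= 0:
        raise ValueError("phone intervals must have positive total duration")
    return len(phones) / duration


# Hypothetical alignment of a short utterance with one internal pause.
alignment = [
    ("sil", 0.0, 0.5),
    ("HH", 0.5, 0.6),
    ("AH", 0.6, 0.75),
    ("sp", 0.75, 1.0),
    ("L", 1.0, 1.1),
    ("OW", 1.1, 1.5),
]

rate_with_pauses = phone_rate(alignment)
rate_without_pauses = phone_rate(alignment, include_pauses=False)
```

Whether pauses belong in the denominator is a design choice: including them makes the measure sensitive to hesitation as well as articulation speed, while excluding them isolates how quickly the phones themselves are produced.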