
Cog Lunch: Steven Meisler "Leveraging Big Data to Investigate Neuroanatomical Correlates of Reading Abilities and Disabilities" and Gasser Elbanna "Evaluating Speaker Identity Coding in Self-Supervised Models and Humans"
Description
Speaker: Gasser Elbanna
Title: Evaluating Speaker Identity Coding in Self-Supervised Models and Humans
Abstract:
Speaker identity perception is an essential cognitive phenomenon for human communication that can be broadly reduced to two main tasks: recognizing a voice or discriminating between voices. Several studies have attempted to identify acoustic correlates of identity perception to pinpoint salient parameters for such a task. Unlike for other communicative social signals, most of these efforts have yielded inconclusive results. Furthermore, current neurocognitive models of voice-identity processing consider the bases of perception to be acoustic dimensions such as fundamental frequency (F0), harmonics-to-noise ratio (HNR), and formant dispersion (FD). However, these findings do not account for naturalistic speech and within-speaker variability.
Recently, self-supervised models have shown strong performance in various speech-related tasks. In this talk, we demonstrate that self-supervised representations from different families (e.g., generative, contrastive, and predictive models) are significantly better for speaker identification than acoustic representations. We also show that such a speaker identification task can be used to better understand the nature of acoustic information representation in different layers of these networks. By evaluating speaker identification accuracy across acoustic, phonemic, prosodic, and linguistic variants, we report similarity between model performance and human identity perception. These empirical findings both enhance the interpretability of these representational spaces and support using this family of models as candidates to study speaker identity perception in humans.
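As a rough illustration of what a layer-wise speaker identification probe can look like, the sketch below scores a nearest-centroid classifier on synthetic embeddings, one set per hypothetical "layer". Everything here is an assumption made for illustration: the abstract does not name the classifier, the data are random Gaussian clusters rather than real model activations, and the `separation` parameter simply stands in for how much speaker information a layer carries.

```python
import random
from statistics import fmean


def make_layer(rng, n_speakers, n_utts, dim, separation):
    """Synthetic stand-in for one layer's embeddings (not real model output).

    Each speaker gets a random cluster center; utterances are noisy copies.
    Even-indexed utterances go to the training split, odd-indexed to test.
    """
    train, test = [], []
    for spk in range(n_speakers):
        center = [rng.gauss(0.0, separation) for _ in range(dim)]
        for utt in range(n_utts):
            vec = [c + rng.gauss(0.0, 1.0) for c in center]
            (train if utt % 2 == 0 else test).append((spk, vec))
    return train, test


def nearest_centroid_accuracy(train, test):
    """Fit one centroid per speaker, then classify test utterances by distance."""
    by_speaker = {}
    for spk, vec in train:
        by_speaker.setdefault(spk, []).append(vec)
    centroids = {
        spk: [fmean(column) for column in zip(*vecs)]
        for spk, vecs in by_speaker.items()
    }

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    correct = sum(
        1
        for spk, vec in test
        if min(centroids, key=lambda s: squared_distance(vec, centroids[s])) == spk
    )
    return correct / len(test)


def probe_layers(separations, seed=0):
    """Return one speaker identification accuracy per hypothetical layer."""
    rng = random.Random(seed)
    accuracies = []
    for separation in separations:
        train, test = make_layer(
            rng, n_speakers=5, n_utts=20, dim=8, separation=separation
        )
        accuracies.append(nearest_centroid_accuracy(train, test))
    return accuracies


if __name__ == "__main__":
    # Layers with more speaker separation should be easier to probe.
    for layer, acc in enumerate(probe_layers([0.0, 0.5, 5.0])):
        print(f"layer {layer}: accuracy = {acc:.2f}")
```

With real data, the synthetic generator would be replaced by embeddings extracted from each layer of a model, and the per-layer accuracies could then be compared against human performance on the same stimuli.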
Speaker: Steven Meisler
Title: Leveraging Big Data to Investigate Neuroanatomical Correlates of Reading Abilities and Disabilities
Abstract:
Reading is a critical skill for education and communication that predominantly recruits a left-hemisphere brain network. Several studies have employed diffusion-weighted imaging (DWI) and structural morphometry on T1-weighted images (T1w) to investigate how white matter microstructure and gray matter properties, respectively, relate to reading skills and diagnoses of reading disabilities. Meta-analyses of such studies have identified a lack of consistent findings, which could be due to a variety of factors, including small sample sizes, different age ranges and behavioral outcome measures, and diversity in acquisition and processing techniques. To address this, I analyzed DWI and T1w images from a single large quality-controlled dataset (Healthy Brain Network; n = 983 with DWI, n = 1171 with T1w), using state-of-the-art whole-brain techniques in each modality. In particular, I used fixel-based analyses for white matter microstructure and surface-based morphometry for gray matter analysis, in both cases employing generalized additive modeling to account for non-linear effects of age given the diverse sample (ages 6-18 years). From the fixel-based analysis, I found that intra-axonal volume of left-hemispheric temporoparietal and cerebellar white matter was most strongly related to reading skills, but there was no significant group difference when comparing children with and without reading disabilities. Surface-based morphometry yielded a correlation between total intracranial volume and reading skills, but no region where vertex-wise volume was associated with reading skills (after controlling for total brain volume). Across all analyses, image quality significantly and globally impacted neuroimaging outcome measures. I discuss these findings in relation to the existing literature on reading neuroanatomy and other cognitive domains more broadly.
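To make the idea of modeling a non-linear age effect alongside a linear reading effect concrete, here is a minimal sketch on simulated data. It is not the analysis from the talk: it uses an unpenalized piecewise-linear spline for age fitted by ordinary least squares as a simplified stand-in for a generalized additive model smooth, and the knot positions, the simulated age curve, the effect size, and all function names are assumptions chosen for illustration.

```python
import math
import random

KNOTS = (10.0, 14.0)  # hypothetical knots inside the 6-18 year age range


def design_row(age, reading):
    """Intercept, linear age, one hinge term per knot, and the reading score."""
    hinges = [max(0.0, age - knot) for knot in KNOTS]
    return [1.0, age, *hinges, reading]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * target for r, target in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def simulate(n, reading_effect, seed=0):
    """Simulated imaging metric: saturating age curve plus a linear reading effect."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        age = rng.uniform(6.0, 18.0)
        reading = rng.gauss(0.0, 1.0)  # standardized reading score (made up)
        metric = 3.0 * math.log(age) + reading_effect * reading + rng.gauss(0.0, 0.5)
        rows.append(design_row(age, reading))
        y.append(metric)
    return rows, y


if __name__ == "__main__":
    rows, y = simulate(n=500, reading_effect=0.8, seed=1)
    coefficients = fit_least_squares(rows, y)
    # The last coefficient is the estimated reading effect, adjusted for age.
    print(f"estimated reading effect: {coefficients[-1]:.3f}")
```

In practice, a penalized smooth from a dedicated GAM package would replace the hand-placed hinges, and covariates such as image quality would be added as further columns in the design matrix.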