Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime

Dani Byrd

Professor of Linguistics · Verified

University of Southern California · Linguistics

Active 1990–2026

h-index: 39
Citations: 6.0k
Papers: 160 (17 in the last 5 years)
Funding: $6.5M

About

Dani Byrd is a Professor in the Department of Linguistics at the University of Southern California, within the Dornsife College of Letters, Arts and Sciences. Her research focuses on phonetics and phonology, and she is actively involved in research groups such as the USC Phonetics & Phonology Group and the USC SPAN Research Group. She contributes to the academic community through her work on speech, words, and the mind, and she has authored an introductory textbook titled 'Discovering Speech, Words, and Mind.' Professor Byrd is engaged in teaching courses related to phonetics and phonology and provides resources and support for students and colleagues in her field.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Speech recognition
  • Physics
  • Linguistics
  • Computer vision
  • Mathematics

Selected publications

  • Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

    Open MIND · 2026-01-20

    preprint

    Many spoken languages, including English, exhibit wide variation in dialects and accents, making accent control an important capability for flexible text-to-speech (TTS) models. Current TTS systems typically generate accented speech by conditioning on speaker embeddings associated with specific accents. While effective, this approach offers limited interpretability and controllability, as embeddings also encode traits such as timbre and emotion. In this study, we analyze the interaction between speaker embeddings and linguistically motivated phonological rules in accented speech synthesis. Using American and British English as a case study, we implement rules for flapping, rhoticity, and vowel correspondences. We propose the phoneme shift rate (PSR), a novel metric quantifying how strongly embeddings preserve or override rule-based transformations. Experiments show that combining rules with embeddings yields more authentic accents, while embeddings can attenuate or overwrite rules, revealing entanglement between accent and speaker identity. Our findings highlight rules as a lever for accent control and a framework for evaluating disentanglement in speech generation.

  • An Approach to Simultaneous Acquisition of Real-Time MRI Video, EEG, and Surface EMG for Articulatory, Brain, and Muscle Activity During Speech Production

    Open MIND · 2026-03-05

    preprint

    Speech production is a complex process spanning neural planning, motor control, muscle activation, and articulatory kinematics. While the acoustic speech signal is the most accessible product of the speech production act, it does not directly reveal its causal neurophysiological substrates. We present the first simultaneous acquisition of real-time (dynamic) MRI, EEG, and surface EMG, capturing several key aspects of the speech production chain: brain signals, muscle activations, and articulatory movements. This multimodal acquisition paradigm presents substantial technical challenges, including MRI-induced electromagnetic interference and myogenic artifacts. To mitigate these, we introduce an artifact suppression pipeline tailored to this tri-modal setting. Once fully developed, this framework is poised to offer an unprecedented window into speech neuroscience and insights leading to brain-computer interface advances.

  • Learning-free L2-Accented Speech Generation using Phonological Rules

    ArXiv.org · 2026-03-08

    article · Open access

    Accent plays a crucial role in speaker identity and inclusivity in speech technologies. Existing accented text-to-speech (TTS) systems either require large-scale accented datasets or lack fine-grained phoneme-level controllability. We propose an accented TTS framework that combines phonological rules with a multilingual TTS model. The rules are applied to phoneme sequences to transform accent at the phoneme level while preserving intelligibility. The method requires no accented training data and enables explicit phoneme-level accent manipulation. We design rule sets for Spanish- and Indian-accented English, modeling systematic differences in consonants, vowels, and syllable structure arising from phonotactic constraints. We analyze the trade-off between phoneme-level duration alignment and accent as realized in speech timing. Experimental results demonstrate effective accent shift while maintaining speech quality.

  • Deep learning characterizes depression and suicidal ideation in young adults from eye movements

    npj Digital Medicine · 2026-03-28 · 1 citation

    article · Open access

    Objective biobehavioral markers for mental health conditions remain elusive, with diagnosis typically relying on self-reports and clinical interviews. We investigate eye tracking as a potential marker of attentional and mood biases associated with symptoms of depression and suicidal ideation from self-reported screening questionnaires. We analyze eye movements from 126 young adults during reading and responding to emotionally loaded sentences. A deep learning framework was designed to account for intra-trial and inter-trial variations in eye movements, achieving an AUC of 0.793 (95% CI: 0.766-0.819) for identifying depression/suicidality against healthy controls, and 0.826 (95% CI: 0.798-0.853) for suicidality specifically. The model also exhibited moderate accuracy in differentiating depressed from suicidal individuals (AUC: 0.609, 95% CI: 0.569-0.646). Discriminative patterns were more pronounced during response generation and for stimuli of negative sentiment. These findings suggest that eye tracking can provide objective markers of self-reported symptom severity by measuring the impact of emotional stimuli on oculomotor control.

  • Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

    2026-04-21

    article


  • Articulatory kinematics of penultimate and final lengthening in Setswana: Evidence from real-time MRI

    2026-05-14

    article · Open access

    The current real-time vocal tract MRI study examines the articulatory encoding of prosodic boundary, prominence and their interaction through kinematic analysis of penultimate and final lengthening near an intonational phrase (IP) boundary in Setswana. One hypothesis is that penultimate lengthening represents a specific case of final lengthening initiated on the IP-penultimate position. Alternatively, penultimate lengthening and final lengthening may result from the interaction between phrase-level prominence and boundary events. Our results reveal two phases of lengthening in the IP-penultimate and IP-final positions. Displacement and peak velocity are also greater IP-finally than IP-medially, but boundary-related increase in displacement and peak velocity only shows a single progressive trend approaching the final IP boundary, with no IP-penultimate alterations comparable to durational patterns. Additionally, there is some evidence for greater duration, displacement and peak velocity of initial consonant gestures on word-penultimate syllables than on word-final ones regardless of utterance positions, indicating a possible word-penultimate prominence effect. These findings suggest that penultimate and final lengthening in Setswana are better understood as the interaction between disparate prominence and boundary events. The results are interpreted according to a prosodic gestural approach that posits the coordination of a phrasal-prominence-encoding gesture and a boundary-encoding gesture.

  • Learning-free L2-Accented Speech Generation using Phonological Rules

    Open MIND · 2026-03-08

    preprint


  • Automation of real-time vocal tract image segmentation with SAM 2.0 and morphological operation implementation

    JASA Express Letters · 2026-03-01

    article · Open access

    Modeling articulatory representations is critical to the scientific study of speech production, including its relation to speech acoustics. However, discretizing articulatory dynamics in continuous speech has proven computationally taxing. For example, segmentation analyses of real-time vocal tract images deploying contour-tracking methods, while successful, require manual creation of templates and human supervised assessment [e.g., Bresch and Narayanan (2009). IEEE Trans. Med. Imaging. 28(3), 323-338]. In this paper, we utilize Segment Anything Model 2 (SAM 2.0) [Ravi et al. (2024). arXiv:2408.00714] to efficiently segment critical articulators in real-time magnetic resonance imaging speech production data without fine-tuning and with global nonlinear image filtering to examine such systems' ability to segment speech dynamics, which have both language- and subject-specific characteristics.

  • Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

    ArXiv.org · 2026-01-20

    article · Open access


  • Interpretable Modeling of Articulatory Temporal Dynamics from Real-Time MRI for Phoneme Recognition

    2026-04-21

    article · Open access

    Real-time Magnetic Resonance Imaging (rtMRI) visualizes vocal tract action, offering a comprehensive window into speech articulation. However, its signals are high dimensional and noisy, hindering interpretation. We investigate compact representations of spatiotemporal articulatory dynamics for phoneme recognition from midsagittal vocal tract rtMRI videos. We compare three feature types: (1) raw video, (2) optical flow, and (3) six linguistically-relevant regions of interest (ROIs) for articulator movements. We evaluate models trained independently on each representation, as well as multi-feature combinations. Results show that multi-feature models consistently outperform single-feature baselines, with the lowest phoneme error rate (PER) of 0.34 obtained by combining ROI and raw video. Temporal fidelity experiments demonstrate a reliance on fine-grained articulatory dynamics, while ROI ablation studies reveal strong contributions from tongue and lips. Our findings highlight how rtMRI-derived features provide accuracy and interpretability, and establish strategies for leveraging articulatory data in speech processing. The source code is publicly available.

Education

  • Ph.D., Linguistics

    University of Southern California

  • M.A., Linguistics

    University of Southern California

  • B.A., Linguistics

    University of Southern California
