Fernando Llanos

Assistant Professor

University of Texas at Austin · Linguistics

Active 2010–2026

h-index: 12

Citations: 546

Papers: 50 (25 in the last 5 years)

Funding: —


OpenAlex


About

Fernando Llanos is the lab director of the UT Austin Neuro Linguistics Lab, where he oversees research in neurolinguistics, focusing on the processing and acquisition of speech patterns across multiple languages, including indigenous languages in Latin America. His work involves conducting behavioral and neuroscientific studies, analyzing neural signals, and developing computational models related to speech and language processing. He has contributed to understanding neural responses to speech, prosodic patterns, and the neural basis of linguistic principles, with numerous publications in the field. His research aims to advance knowledge of how the brain processes language and speech, integrating neuroscientific methods with linguistic theory.

Research topics

Artificial Intelligence
Computer Science
Neuroscience
Medicine
Psychology
Machine Learning
Audiology
Speech recognition
Linguistics
Communication
Cognitive psychology
Mathematics
Physics
Acoustics

Selected publications

The effects of vocal emotions and emotional context on the neural tracking of speech envelopes and listeners’ vigilance states
Frontiers in Human Neuroscience · 2026-05-08
article · Open access · Senior author
Introduction: High-arousal emotional speech, such as angry and happy speech, is characterized by changes in signal amplitude that can substantially alter the temporal structure of the speech signal. In this EEG study, we investigated how these acoustic changes, and the structure of the preceding emotional context, influence neural tracking of temporal speech patterns, as well as alpha-band desynchronization associated with vigilance states in listeners. Methods: EEGs were recorded from 30 adult native speakers of American English while they listened to angry, happy, or neutral spoken sentences presented either in a randomized order or blocked by emotion. To ensure sustained attention, participants answered occasional questions about sentence content. Results: Angry speech elicited stronger alpha desynchronization than neutral and happy speech when stimuli were blocked by emotion but not when stimuli were fully randomized. In contrast, neural tracking of amplitude modulation patterns was more robust for neutral speech compared to happy or angry speech across both stimulus presentation contexts. When neural tracking was controlled for stimulus differences in amplitude variability, angry speech, which conveyed greater amplitude variability, was more faithfully tracked than both happy and neutral speech across stimulus presentation contexts. Discussion: Our findings suggest that tonic modulations of alpha power are more sensitive to prolonged emotional context than to transient changes in speaker emotion. Furthermore, we found that emotional speech robustly modulates listeners’ vigilance, particularly following prolonged exposure to a single emotion, while exerting a limited influence on the neural encoding of amplitude modulation, which is primarily dominated by bottom-up amplitude variability in the acoustic signal.
Cortical processing of discrete prosodic patterns in continuous speech
Nature Communications · 2025-03-03 · 3 citations
article · Open access
Prosody has a vital function in speech, structuring a speaker’s intended message for the listener. The superior temporal gyrus (STG) is considered a critical hub for prosody, but the role of earlier auditory regions like Heschl’s gyrus (HG), associated with pitch processing, remains unclear. Using intracerebral recordings in humans and non-human primate models, we investigated prosody processing in narrative speech, focusing on pitch accents—abstract phonological units that signal word prominence and communicative intent. In humans, HG encoded pitch accents as abstract representations beyond spectrotemporal features, distinct from segmental speech processing, and outperforms STG in disambiguating pitch accents. Multivariate models confirm HG’s unique representation of pitch accent categories. In the non-human primate, pitch accents were not abstractly encoded, despite robust spectrotemporal processing, highlighting the role of experience in shaping abstract representations. These findings emphasize a key role for the HG in early prosodic abstraction and advance our understanding of human speech processing. Using intracerebral recordings, the authors find abstract prosodic categories in continuous speech are encoded differently to segmental features by Heschl’s gyrus, suggesting specialized cortical processing early in the auditory processing hierarchy.
High-arousal emotional speech enhances speech intelligibility and emotion recognition in noise
The Journal of the Acoustical Society of America · 2025-06-01 · 2 citations
article · Senior author
Prosodic and voice quality modulations of the speech signal offer acoustic cues to the emotional state of the speaker. In quiet, listeners are highly adept at identifying not only a speaker's words but also the underlying emotional context. Given that distinct vocal emotions possess varying acoustic characteristics, background noise level may differentially impact speech recognition, emotion recognition, or their interaction. To investigate this question, we assessed the effects of three emotional speech styles (angry, happy, neutral) on speech intelligibility and emotion recognition across four different SNR levels. High-arousal emotional speech styles (happy and angry speech) enhanced both speech intelligibility and emotion recognition in noise. However, emotion recognition behavior was not a reliable predictor of speech recognition behavior. Instead, we found a strong correspondence between speech recognition scores and the relative power of the speech-in-noise signal in critical bands derived from the Speech Intelligibility Index. Unsupervised dimensional scaling analysis of emotion recognition patterns revealed that different noise baselines elicit different perceptual cue weighting strategies. Further dimensional scaling analysis revealed that emotion recognition patterns were best predicted by emotion-level differences in harmonic-to-noise ratio and variability around the fundamental frequency. Listeners may thus weight acoustic features differently for recognizing speech versus emotional patterns.
Investigating the Neural Basis of the Loud-first Principle of the Iambic–Trochaic Law
Journal of Cognitive Neuroscience · 2024
1st author · Corresponding
- Artificial Intelligence
- Computer Science
- Psychology
The perception of rhythmic patterns is crucial for the recognition of words in spoken languages, yet it remains unclear how these patterns are represented in the brain. Here, we tested the hypothesis that rhythmic patterns are encoded by neural activity phase-locked to the temporal modulation of these patterns in the speech signal. To test this hypothesis, we analyzed EEGs evoked with long sequences of alternating syllables acoustically manipulated to be perceived as a series of different rhythmic groupings in English. We found that the magnitude of the EEG at the syllable and grouping rates of each sequence was significantly higher than the noise baseline, indicating that the neural parsing of syllables and rhythmic groupings operates at different timescales. Distributional differences between the scalp topographies associated with each timescale suggests a further mechanistic dissociation between the neural segmentation of syllables and groupings. In addition, we observed that the neural tracking of louder syllables, which in trochaic languages like English are associated with the beginning of rhythmic groupings, was more robust than the neural tracking of softer syllables. The results of further bootstrapping and brain-behavior analyses indicate that the perception of rhythmic patterns is modulated by the magnitude of grouping alternations in the neural signal. These findings suggest that the temporal coding of rhythmic patterns in stress-based languages like English is supported by temporal regularities that are linguistically relevant in the speech signal.
Neurolinguistic Approaches to Bilingual Phonetics and Phonology
Cambridge University Press eBooks · 2024-11-14
book chapter · 1st author · Corresponding
This chapter provides a cross-sectional overview of current neuroimaging techniques and signals used to investigate the processing of linguistically relevant speech units in the bilingual brain. These techniques are reviewed in the light of important contributions to the understanding of perceptual and production processes in different bilingual populations. The chapter is structured as follows. First, we discuss several non-invasive technologies that provide unique insights in the study of bilingual phonetics and phonology. This introductory section is followed by a brief review of the key brain regions and pathways that support the perception and production of speech units. Next, we discuss the neuromodulatory effects of different bilingual experiences on these brain regions from shorter to longer neural latencies and timescales. As we will show, bilingualism can significantly alter the time course, strength, and nature of the neural responses to speech, when compared with monolinguals.
High-Arousal Emotional Prosodies Can Disrupt the Temporal Coding of Speech Patterns
SSRN Electronic Journal · 2024-01-01
preprint · Open access · Senior author
The relationship between sentence intelligibility, band importance, and signal covariance
JASA Express Letters · 2023-05-01 · 1 citation
article · Open access · 1st author
The present study investigates the relationship between sentence intelligibility, band importance, and patterns of spectro-temporal covariation between frequency bands. Sixteen listeners transcribed sentences acoustically degraded to 5, 8, or 15 frequency bands. Half of the sentences retained the frequency bands that captured more signal covariance. The other half retained the bands accounting for less signal covariance. Sentence intelligibility was significantly higher in the high-covariance condition. Critically, this finding was predicted by differences in band importance across reconstructed sentences. These findings provide a mechanistic relationship between the contributions of signal covariance and band importance to sentence intelligibility.
Distinct Dimensional Encoding of Speech in the Dorsal and Ventral Auditory Streams
Zenodo (CERN European Organization for Nuclear Research) · 2023-08-21
dataset · Open access · 1st author · Corresponding
Data and code for "Distinct Dimensional Encoding of Speech in the Dorsal and Ventral Auditory Streams"
Decoding speech envelopes from electroencephalographic recordings: A comparison of regularized linear regression and long short-term memory deep neural network
The Journal of the Acoustical Society of America · 2023-03-01
article · Senior author
The speech envelope provides enough acoustic information to accurately recognize consonants and vowels (Shannon et al., 1995). The neural representation of speech envelopes is often assessed by reconstructing the envelopes from neural oscillations in the electroencephalogram (EEG) using linear decoders. One such approach is the multivariate temporal response function (mTRF), which achieves envelope reconstruction through regularized linear regression. Here, we compared the envelope reconstructions achieved by the mTRF and a non-linear alternative derived from a long short-term memory (LSTM) deep network. EEGs were collected from 15 native English speakers listening to an English audiobook (Reetzke et al., 2021). We trained a different decoder for each consonant and vowel in each listener. Reconstruction accuracy was measured as the Pearson coefficient (r) between observed and reconstructed envelopes. Preliminary results for the reconstruction of all vowels revealed that speech envelopes were more accurately reconstructed by the LSTM decoder (r: M = 0.247, SEM = 0.0024) than the mTRF (r: M = 0.074, SEM = 0.0025). Reconstruction accuracy was equally high and less variable across subjects for the LSTM approach. Additionally, high vowels showed lower decoding performance potentially due to their lower amplitude. These findings demonstrate the potential of non-linear approaches to investigating the neural representation of speech envelope cues.
High spectral covariation between frequency channels contributes to clear speech intelligibility
The Journal of the Acoustical Society of America · 2023-03-01
article · 1st author · Corresponding
Speech signals are acoustically redundant, which could explain why sentence intelligibility is fairly robust even when sentences are acoustically degraded. We investigated the contributions to sentence intelligibility of clear speech redundancy encoded as patterns of spectrotemporal covariation between frequency channels. Participants (N = 16) transcribed 120 clear-speech English sentences acoustically degraded to 5, 8, or 15 frequency bands derived from an ERB-scaled filter bank. Before the acoustic degradation, each sentence was expressed as a linear combination of principal component eigenvectors representing different patterns of covariation between channels. Half of the sentences preserved the channels providing larger score magnitudes for the eigenvector accounting for more spectral covariance (high-covariance condition). These channels represented the spectral covariation patterns that were more dominant in each sentence. The other half of the sentences preserved the bands conveying larger score magnitudes for the eigenvector accounting for less spectral covariance (low-covariance condition). These bands represented the spectral covariation patterns that were less dominant. Participants yielded significantly better transcription accuracy in the high-covariance condition (mixed-effects, ps < 0.0021). Critically, accuracy in this condition was higher than 56% on average for as few as 5 bands. These findings indicate that clear speech intelligibility is supported by patterns of spectral covariation between frequency bands.

Frequent coauthors

Bharath Chandrasekaran
University of Pittsburgh
30 shared
Alexander L. Francis
Purdue University West Lafayette
12 shared
Olga Dmitrieva
Purdue University System
8 shared
T. Christina Zhao
University of Washington
8 shared
Abhra Sarkar
The University of Texas at Austin
6 shared
Amanda A. Shultz
Purdue University System
6 shared
G. Nike Gnanateja
University of Wisconsin–Madison
6 shared
Patricia K. Kuhl
University of Washington
6 shared

Labs

Neuroling Lab · PI
