
Rajka Smiljanic
Professor and Undergraduate Faculty Advisor · University of Texas at Austin · Linguistics
Active 2000–2024
Research topics
- Computer Science
- Psychology
- Speech recognition
- Linguistics
- Physics
- Communication
- Acoustics
Selected publications
Protective face masks affect clearly produced diphthongs by L1 and L2 English talkers
The Journal of the Acoustical Society of America · 2024-03-01
article · Senior author
The study investigated how the use of protective face masks and language experience shape the production of listener-oriented clear speech. One L1 and one L2 English talker read sentences in a clear and conversational speaking style with and without a surgical mask. Formant trajectories between the onset and offset of two diphthongs, /aɪ/ and /eɪ/, were analyzed using Euclidean distance in the F1–F2 vowel space. The results showed that the distance between the onset /a/ and the offset /ɪ/ was larger when speech was produced without the mask and when speaking clearly. These modifications were larger for the L1 talker compared to the L2 talker. The Euclidean distance for /eɪ/ was only affected by speaking style. The results suggest that talkers produced hyperarticulated diphthongs characterized by larger formant movements in clear speech. The presence of a mask limited jaw movement for the diphthong containing the low vowel. Additionally, the L1 talker made larger articulatory modifications in response to the presence of a mask and listener-oriented clear speech compared to the talker with less extensive experience with the target language. These production patterns may be related to lower word recognition in noise for masked, conversational, and L2 speech found in previous work.
Learning a language with vowelless words
Cognition · 2024-08-06 · 4 citations
article · Open access · Senior author
Vowelless words are exceptionally typologically rare, though they are found in some languages, such as Tashlhiyt (e.g., fkt 'give it'). The current study tests whether lexicons containing tri-segmental (CCC) vowelless words are more difficult to acquire than lexicons not containing vowelless words by adult English speakers from brief auditory exposure. The role of acoustic-phonetic form on learning these typologically rare word forms is also explored: In Experiment 1, participants were trained on words produced in either only Clear speech or Casual speech productions of words; Experiment 2 trained participants on lexical items produced in both speech styles. Listeners were able to learn both vowelless and voweled lexicons equally well when speaking style was consistent for participants, but learning was lower for vowelless lexicons when training consisted of variable acoustic-phonetic forms. In both experiments, responses to a post-training wordlikeness ratings task containing novel items revealed that exposure to a vowelless lexicon leads participants to accept new vowelless words as acceptable lexical forms. These results demonstrate that one of the typologically rarest types of lexical forms - words without vowels - can be rapidly acquired by naive adult listeners. Yet, acoustic-phonetic variation modulates learning.
Language Cognition and Neuroscience · 2024-05-04 · 2 citations
article · Senior author
This study examined whether intelligibility-enhancing hyperarticulated clear speaking styles improve word segmentation during real-time speech processing in quiet and in noise. English-speaking listeners heard clearly and conversationally spoken sentences in which the target (e.g. ham) was temporarily ambiguous with a competitor (e.g. hamster) across a word boundary (e.g. ham starting) while their eye fixations to target and competitor images were recorded. Relative to conversational speech, clear speech led listeners to fixate the target image over the competitor image to a greater degree, indicating facilitation of word segmentation. Such facilitation emerged in quiet and in noise even before disambiguating segmental information (e.g. /ɑ/ in starting) was available. A parallel clear speech benefit was not found when the disyllabic word (e.g. hamster) was the target. The findings suggest that improved word segmentation partly underlies the well-documented clear speech perceptual and cognitive benefits and may arise from the enhancements of multiple word boundary cues.
The Journal of the Acoustical Society of America · 2024-03-01
article · Senior author
Listener-directed hyperarticulated clear speech produced by native (L1) talkers improves word segmentation and reduces lexical competition. Less is known about whether non-native (L2) clear speech also confers such benefit. In a visual-world eye-tracking study, we investigated if L2 clear speech improves word segmentation and the time course of the benefit for native listeners. Forty L1 English participants heard sentences produced in conversational and clear styles by a highly intelligible L2 English / L1 Spanish speaker with a discernable non-native accent. Sentences contained a target word (e.g., doll) with which a corresponding competitor overlapped phonemically (e.g., dolphin), creating temporary ambiguity with the target and the following word’s onset (e.g., doll found). Each recording was presented in quiet alongside pictures of the target, competitor, and two distractors. Participants were instructed to select the picture mentioned in the sentence they heard. No significant clear speech segmentation advantage was found; the proportion of looks to targets over competitors indicates similar time course of disambiguation in both conversational and clear speech. The results suggest that L2-accented clear speech with its deviations from the target-language-specific modifications and greater phonetic variability increases signal uncertainty resulting in no benefit for word segmentation even though word recognition was improved.
Clear speech processing benefits beyond intelligibility
The Journal of the Acoustical Society of America · 2023-03-01 · 1 citation
article · 1st author · Corresponding
A robust clear speech intelligibility benefit for a variety of talkers, listeners, and communication challenges is well-documented. In this talk, I will review research that focuses on how conversational to clear speech modifications facilitate linguistic processes and cognitive functioning beyond word recognition in noise. In one line of work, using a visual-world paradigm, we showed that clear speech enhanced speech segmentation and reduced lexical competition. In another, we showed that clear speech benefit extended to the improved sentence recognition memory and recall of words and sentences. Finally, in a series of experiments using a dual-task paradigm, we showed that hearing clear speech increased reaction times on a concurrent visual task suggesting that the clear speech processing benefits may arise through the increased engagement of the attentional resources toward the more salient hyperarticulated speech. The results contribute evidence that clear speech facilitates signal-dependent sensory processing as well as deeper linguistic processing abstracted from the input speech. These clear speech findings have implications for our understanding of perceptual mechanisms that underlie improved speech perception, including the use of cognitive resources and listening effort.
Journal of Phonetics · 2023-01-03 · 5 citations
article · Senior author
The Journal of the Acoustical Society of America · 2023-03-01
article · Senior author
Listener-oriented hyperarticulated clear speech facilitates linguistic processing and cognitive functioning associated with speech perception under various listening conditions. Using the visual-world eye-tracking paradigm, we investigated whether clear speech also aids speech segmentation, or the discovery of word boundaries, and examined the dynamic time course of its effect. Native American English speakers (N = 77) heard sentences in which the target word (e.g., ham) was temporarily ambiguous with a longer unintended competitor (e.g., hamster) across a word boundary (e.g., She saw the ham starting…) while viewing images depicting the target, competitor, and unrelated distractors. Clear and conversational sentences were presented in quiet or in speech-shaped noise at +3 dB signal-to-noise ratio. Analysis of eye fixations to the images over time revealed that compared with conversational speech, clear speech facilitated the disambiguation of the target from the competitor even before the disambiguation point was reached. The facilitation was found in both listening conditions but was relatively delayed in noise. These findings suggest that speaking clearly improves word segmentation and reduces lexical competition especially in optimal listening conditions. The speech segmentation facilitation may partly underlie the clear speech benefits observed for other signal-dependent and relatively signal-independent linguistic and cognitive processes.
The relationship between sentence intelligibility, band importance, and signal covariance
JASA Express Letters · 2023-05-01 · 1 citation
article · Open access
The present study investigates the relationship between sentence intelligibility, band importance, and patterns of spectro-temporal covariation between frequency bands. Sixteen listeners transcribed sentences acoustically degraded to 5, 8, or 15 frequency bands. Half of the sentences retained the frequency bands that captured more signal covariance. The other half retained the bands accounting for less signal covariance. Sentence intelligibility was significantly higher in the high-covariance condition. Critically, this finding was predicted by differences in band importance across reconstructed sentences. These findings provide a mechanistic relationship between the contributions of signal covariance and band importance to sentence intelligibility.
High spectral covariation between frequency channels contributes to clear speech intelligibility
The Journal of the Acoustical Society of America · 2023-03-01
article
Speech signals are acoustically redundant, which could explain why sentence intelligibility is fairly robust even when sentences are acoustically degraded. We investigated the contributions to sentence intelligibility of clear speech redundancy encoded as patterns of spectrotemporal covariation between frequency channels. Participants (N = 16) transcribed 120 clear-speech English sentences acoustically degraded to 5, 8, or 15 frequency bands derived from an ERB-scaled filter bank. Before the acoustic degradation, each sentence was expressed as a linear combination of principal component eigenvectors representing different patterns of covariation between channels. Half of the sentences preserved the channels providing larger score magnitudes for the eigenvector accounting for more spectral covariance (high-covariance condition). These channels represented the spectral covariation patterns that were more dominant in each sentence. The other half of the sentences preserved the bands conveying larger score magnitudes for the eigenvector accounting for less spectral covariance (low-covariance condition). These bands represented the spectral covariation patterns that were less dominant. Participants yielded significantly better transcription accuracy in the high-covariance condition (mixed-effects, ps < 0.0021). Critically, accuracy in this condition was higher than 56% on average for as few as 5 bands. These findings indicate that clear speech intelligibility is supported by patterns of spectral covariation between frequency bands.
Coarticulation is reduced in clear speech produced with protective face masks
The Journal of the Acoustical Society of America · 2022-10-01
article · Senior author
Talkers dynamically modify their coarticulatory patterns when producing listener-oriented hyperarticulated clear speech. This study examined how the use of protective face masks interacts with the production of intelligibility-enhancing clear speech to impact coarticulation. A native and a non-native speaker of English read sentences in a clear and conversational speaking style with and without a surgical mask. Coarticulation between word-internal adjacent segments was analyzed with a whole-spectrum analysis including spectral distance and segment overlap duration. Both speakers coarticulated less in clear than in conversational speaking style as indicated by the larger spectral distance and shorter overlap duration between adjacent segments. Coarticulation was further reduced when clear speech was produced with a mask by the native speaker but not by the non-native speaker. The findings showed that producing hyperarticulated intelligibility-enhancing clear speech also involves reducing coarticulatory overlap across adjacent segments. Coarticulatory resistance was adaptively reinforced in the presence of the additional communicative barrier, a face mask, particularly for the speaker with extensive experience with the target language. Such cumulative reduction of coarticulation may in part underlie the larger perception-in-noise benefit for clear speech produced with a mask for the native compared to the non-native talker.
Frequent coauthors
- 25 shared
Bharath Chandrasekaran
University of Pittsburgh
- 23 shared
Ann R. Bradlow
Klinikum Saarbrücken
- 16 shared
Kirsten Meemann
The University of Texas at Austin
- 13 shared
Sandie Keerstock
University of Missouri
- 10 shared
Lauren Calandruccio
- 10 shared
Zhe-chen Guo
Northwestern University
- 9 shared
Cynthia P. Blanco
The University of Texas at Austin
- 9 shared
Kristin J. Van Engen
Washington University in St. Louis