Stefan Thomas Gries
University of California, Santa Barbara · French and Italian Studies
Active 1988–2025
About
Stefan Thomas Gries is a professor in the Department of Linguistics at the University of California, Santa Barbara. His areas of specialization include quantitative corpus linguistics, statistical and computational linguistics, cognitive linguistics, and first and second language acquisition. His research applies quantitative and computational methods to linguistic data and draws on cognitive linguistics to analyze patterns in language learning and use.
Research topics
- Linguistics
- Computer science
- Natural language processing
- Artificial intelligence
- Psychology
Selected publications
Elsevier eBooks · 2025-01-01
book-chapter · Senior author
Studies in Second Language Acquisition · 2025-12-04
article · Open access
Statistical regularities can be acquired from usage. To examine language speakers’ statistical metacognition about multiword expressions (MWEs), we collected ratings for frequency, dispersion, and directional association strength of English binomials from L1, advanced and intermediate L2 speakers. Mixed-effects modeling showed all speakers had limited speaker-to-corpus consistency but significant sensitivity to statistical regularities of language, supporting usage-based (Gries & Ellis, 2015) and statistical learning theories (Christiansen, 2019). Their statistical metacognition was also shaped by word-level cues, consistent with the dual-route model (Carrol & Conklin, 2014). Despite similarities, frequency metacognition showed the strongest speaker-to-corpus consistency, while dispersion metacognition was the hardest to develop. Advanced L2 speakers showed the greatest speaker-to-corpus consistency and sensitivity, while lower-proficiency speakers relied more on word-level cues in metacognitive judgments, supporting the shallow-structure hypothesis (Clahsen & Felser, 2006). Overall, L1 and L2 speakers develop diverse statistical metacognition, with L2 speakers not necessarily inferior, suggesting that statistical metacognition is not solely shaped by usage-based experience.
The Encyclopedia of Applied Linguistics · 2025-12-02
other · Senior author
In this overview, we survey recent and current developments in corpus‐based research on World Englishes. We exemplify current strands of research in both more theoretical and more applied parts of research on varieties of English and conclude with theoretical, methodological, and resource desiderata.
Research in Corpus Linguistics · 2025-01-01
article · Open access · Senior author
Although L1-English fluency has been extensively studied from many angles, few contrastive studies examine whether fluency develops similarly or differently across L1-varieties while taking sociolinguistic variation into consideration. This paper aims to close this research gap and examines the use of three core strategies of fluency (or fluencemes), i.e. discourse markers, filled pauses and unfilled pauses, across Australian, British, Canadian, and New Zealand English. These fluencemes were extracted and manually disambiguated from the private conversation sections of the respective components of the International Corpus of English (ICE-AUS, ICE-GB, ICE-CAN, and ICE-NZ). The data were normalised per speaker and linked with the sociobiographic metadata of the speakers. Analysis using random forests revealed a consistent fluenceme distribution across the four varieties, with unfilled pauses being the most common, followed by discourse markers, and then filled pauses. This pattern suggests a ‘common fluenceme core’ among L1-English varieties. The influence of sociolinguistic variables (gender, age, education, and occupation) was modest across varieties and exhibited diverse trends. Male speakers tend to use filled pauses more frequently but fewer unfilled pauses compared to female speakers. Increasing age did not significantly affect the frequency of these strategies; however, older speakers tend to use discourse markers less frequently. Both education and occupation showed a slight positive correlation with overall fluency.
Corpus Linguistics and Psycholinguistics
Elsevier eBooks · 2025-01-01
book-chapter · 1st author · Corresponding
International Journal of Corpus Linguistics · 2025-06-10 · 2 citations
article · 1st author · Corresponding
Incorporating Corpora in Second‐Language Acquisition Research
The Encyclopedia of Applied Linguistics · 2025-12-02
other · 1st author · Corresponding
This article discusses the use of corpus‐linguistic methods in second‐language acquisition research. It focuses on measurement applications, specific linguistic case studies, and an evaluation coupled with an outlook over desiderata for the future.
Corpus Linguistics and the Cognitive/Constructional Endeavor
Cambridge University Press eBooks · 2025-01-30 · 1 citation
book-chapter · 1st author · Corresponding
Similative-pretence constructions in language contact situations
Cognitive Linguistic Studies · 2025-11-10 · 1 citation
article · Senior author
The present study introduces a method that can be used to explore in a quantitatively rigorous yet less demanding way (both in terms of data and statistical requirements) how constructional templates and their lexical preferences (lexico-syntactic transference) diffuse in language contact situations. The study investigates the influence of Mexican Spanish similative-pretence constructions on Huasteca Nahuatl similative-pretence constructions as a proof-of-concept kind of application for our method. Speakers of Huasteca Nahuatl have borrowed the markers komo ‘like’ and komo si ‘as if’ from Mexican Spanish to express similative (e.g., she swims like a fish) and pretence meanings (e.g., she swims as if she were a fish), respectively. Using a conditional inference forest, the paper demonstrates that speakers of Huasteca Nahuatl have not only borrowed these markers from Mexican Spanish, but also lexical preferences (e.g., verb lemmas) of the constructions in which these markers occur. These findings show that the rigid partition of structural levels that has been adopted by traditional models of language contact proves inadequate for describing complex language situations. The method introduced here provides an integrative, non-modular way to explore language contact from a Usage-Based Construction Grammar perspective.
Cultural Keywords in Varieties Research
Journal of Research Design and Statistics in Linguistics and Communication Science · 2025-07-02
article · 1st author · Corresponding
One of the four most central corpus-linguistic methods is keywords/keyness analysis, which is generally the identification and interpretation of word types of a target corpus (T) that, when compared to their occurrence in a reference corpus (R), are key/characteristic for T. In this article, I will (a) apply methods proposed by Gries (2021) to the study of three outer-circle varieties of English to identify cultural keywords in a bottom-up fashion and (b) use the results from that first application to advance two suggestions how to extend keyness analyses to better understand the keywords from the first step: key collocates, which involves applying keyness to contexts of keywords; and deep key collocates, which involves distributional semantics methods like word2vec, GloVe, BERT, etc. to keywords. I will use Mukherjee and Bernaisch's (2015) keyness analysis as a launchpad to identify keywords from comparisons of Indian, Pakistani, and Sri Lankan Englishes (IndE, PakE, and SriE, respectively) and zoom in on the variety-specific differences of the keyword of terror. The results not only indicate what terms are key for which of the three varieties; they also allow for a new level of granularity in how keywords use differs across varieties and possibly cultures. For example, in the PakE data, newspaper coverage of terror is mostly discussed with regard to its financial aspects and implications and matters of communication, whereas in IndE and SriE, terror is much more approached from a military and a religious perspective, respectively.
Frequent coauthors
- 109 shared
Stefanie Wulff
UiT The Arctic University of Norway
- 75 shared
Anatol Stefanowitsch
- 71 shared
Martin Hilpert
University of Neuchâtel
- 67 shared
Santa Barbara
University of California, Santa Barbara
- 66 shared
Susanne Flach
Catholic University of Eichstätt-Ingolstadt
- 66 shared
Anna Birmingham
University of Florida
- 66 shared
Magali Mccauley
Baidu (China)
- 62 shared
Neuchâtel Keller
University of Florida
Education
- 2000
Ph.D., Department of British and American Studies
Universität Hamburg
- 1998
M.A., Department of British and American Studies
Universität Hamburg