About
Kendra V. Dickinson is a sociolinguist and an Assistant Professor in the Department of Spanish and Portuguese at Rutgers University. She received a BA in Spanish from the University of Illinois at Urbana-Champaign in 2011, an MA in Applied Linguistics from Boston University in 2016, an MA in Hispanic Linguistics from the Ohio State University in 2018, and a PhD in Hispanic Linguistics from the Ohio State University in 2022. Her research applies diverse methodologies to the study of how relationships between linguistic systems, cognitive mechanisms, and social contexts can elucidate mechanisms of language variation and change, with a focus on morphosyntax. Beyond her research, she offers courses in sociolinguistics and multilingualism at the undergraduate and graduate levels, and directs the Sociolinguistic Analysis Group (SlAnG).
Research topics
- Computer Science
- Linguistics
- Philosophy
- Mathematics
- Psychology
- Artificial Intelligence
- Combinatorics
- Developmental psychology
- Physics
- Econometrics
- Statistics
Selected publications
Plataforma da Diversidade Linguística Brasileira: Dados linguísticos para uma IA brasileira
2025-05-21
preprint · Open access
Generative artificial intelligence is based on large-scale language models (LLMs), which are trained with data most often collected without consent or in breach of copyright. LLMs are trained with billions of words and millions of parameters, but we don't know exactly which texts are selected in the training or which parameters are controlled. While unsupervised learning requires a large volume of data, demanding more and more computational costs and generating energy impacts, supervised learning with structured and tagged data can optimize this process. Moreover, supervised learning with structured and tagged data resulting from language documentation projects can contribute directly to the National Artificial Intelligence Plan: “Develop advanced language models in Portuguese, with national data that encompasses our cultural, social and linguistic diversity, to strengthen sovereignty in AI.” In Brazil, in addition to Portuguese and its varieties, there are more than 250 other languages (indigenous, immigration, sign language), which are neglected in digital inclusion due to a lack of structured data. The consortium of laboratories and research groups in this INCT aims to prepare linguistic data for the training of LLMs, considering Brazil's linguistic diversity. This involves developing a joint protocol for collecting linguistic data in the field, to be replicated longitudinally across the groups and laboratories; establishing procedures for transcribing, aligning and labeling linguistic data to create a data set that represents Brazilian linguistic diversity; and conducting studies on linguistic processing of diversity to fine-tune LLMs, helping to reduce asymmetries and prejudice resulting from training LLMs with translations from English.
2025-08-13
peer-review
Gender-inclusive morphology in Spanish: learnability, processing costs, and use
Studies in Hispanic and Lusophone Linguistics · 2025-09-01
article · Senior author
In Spanish, the -e suffix as a gender-inclusive morpheme (e.g., elles son bonites) has been proposed as an alternative to traditional -o to refer to mixed-gender groups. Limited studies on this innovation suggest that the computation of inclusive morphology might not incur high processing costs. However, it is unclear whether native speakers who have acquired the Spanish binary grammatical system before puberty effectively process noun-adjective gender agreement with said inclusive language morpheme. Thirty-seven native speakers of Spanish completed a self-paced reading task and a sociolinguistic questionnaire. The former task contained sentences with (1) a subject composed of two stereotypically masculine names (e.g., Pedro) and two stereotypically feminine names (e.g., María), (2) the verb están, and (3) an adjective with the traditional -o suffix or the inclusive -e suffix, and the latter included questions about language use and beliefs. Results show that while -e was associated with longer processing time overall, participants who reported using inclusive language most frequently showed no difference in processing -o and -e. Additionally, questionnaire data show that participant beliefs about learnability and usability of inclusive language mirror experimental findings. These results suggest that it is possible to acquire the -e morpheme in the L1 after puberty, with beliefs influencing the processing of these forms. Furthermore, the results highlight the roles that use of and exposure to inclusive gender forms play in speakers' ability to acquire and process them.
Predicative possession choice in Argentinian Spanish
Isogloss Open Journal of Romance Linguistics · 2025-07-16
article · Open access
This study investigates the expression of predicative possession in Argentinian Spanish, focusing on the alternation between two constructions: tener (‘have’) + NP and estar con (‘be with’) + NP. Building on previous research, we explore the factors that determine speakers’ choices between these constructions, particularly the influence of temporal context and the presence of adverbs. Using a forced-choice experimental design, participants were presented with vignettes varying in temporal duration (durative vs. non-durative) and adverbial modification (extending, limiting, or none). Results show a strong overall preference for the tener + NP construction, particularly in durative contexts. However, the estar con + NP construction is more likely to be selected in non-durative contexts, especially when a limiting adverb is present. These findings suggest that the distinction between the two constructions is not purely semantic but also pragmatically modulated by temporal and contextual factors. We argue that this pattern reflects a subset-superset relationship between the two constructions, where tener + NP can cover a broader temporal scope than estar con + NP. This overlap mirrors cross-linguistic findings on possessive constructions and aspectual distinctions, with implications for understanding grammaticalization processes in Romance languages.
Review: Sharing and Preserving Sociolinguistic Corpora on the U.S.-Mexico Border
2025-06-28
peer-review · Open access
Since William Labov outlined the methodology for the sociolinguistic interview in 1972, sociolinguistic corpora have been used widely in the field of sociolinguistics to study diverse speech communities and linguistic features. However, most of these invaluable sociolinguistic collections have been available only to the individual researcher or research group, and these data sets usually disappear from use with that individual scholar. More recently, there has been a push towards data sharing in sociolinguistics, reflective of data sharing and the open science movement in other fields. Still, accessible online sociolinguistic corpora are few and far between, in part due to the intense time commitment required to create, sustain, share, and preserve such collections. This paper reviews two accessible online sociolinguistic collections at the U.S.-Mexico border: the Corpus de Español en el Sur de Arizona [Corpus of Spanish in Southern Arizona] or CESA (Carvalho, 2012) and the Corpus Bilingüe del Valle [Bilingual Corpus of the Valley] or CoBiVa (Christoffersen & Bessett, 2019; Christoffersen & Ciller, 2024) in South Texas. We explore these two corpora as case studies for data sharing and preservation through collaboration by detailing the data collection and data management protocols and preservation plans. In doing so, we demonstrate how data sharing in sociolinguistics impacts accessibility, reproducibility, and the democratization of knowledge.
Plataforma da Diversidade Linguística Brasileira: dados linguísticos para uma IA brasileira
Cadernos de Linguística · 2025-12-19
article · Open access
Generative artificial intelligence is based on large-scale language models (LLMs), which are trained with data most often collected without consent or in breach of copyright. LLMs are trained with billions of words and millions of parameters, but we don't know exactly which texts are selected in the training or which parameters are controlled. While unsupervised learning requires a large volume of data, demanding more and more computational costs and generating energy impacts, supervised learning with structured and tagged data can optimize this process. Moreover, supervised learning with structured and tagged data resulting from language documentation projects can contribute directly to the National Artificial Intelligence Plan: “Develop advanced language models in Portuguese, with national data that encompasses our cultural, social and linguistic diversity, to strengthen sovereignty in AI.” In Brazil, in addition to Portuguese and its varieties, there are more than 250 other languages (indigenous, immigration, sign language), which are neglected in digital inclusion due to a lack of structured data. The consortium of laboratories and research groups in this INCT aims to prepare linguistic data for the training of LLMs, considering Brazil's linguistic diversity. This involves developing a joint protocol for collecting linguistic data in the field, to be replicated longitudinally across the groups and laboratories; establishing procedures for transcribing, aligning and labeling linguistic data to create a data set that represents Brazilian linguistic diversity; and conducting studies on linguistic processing of diversity to fine-tune LLMs, helping to reduce asymmetries and prejudice resulting from training LLMs with translations from English.
Languages · 2024 · 1 citation
1st author · Corresponding
- Computer Science
- Artificial Intelligence
- Econometrics
This project explores the synchronic variation of participle forms in Brazilian Portuguese (BP). Despite general systematicity, the language maintains many historically irregular participles, which often compete with regularized variants. The language has also developed innovative participles, which tend to exist in variation with regular forms. Adopting a usage-based framework, the study examines how analogical processes affect persistent irregular participles and short-form participles in BP, emphasizing the role of grammatical context and frequency. Data are drawn from the Portuguese Web 2011 corpus, including 12 verbs with long-form Latinate irregulars (n = 4800) and 8 verbs with short-form participles (n = 3200). The results show that long-form Latinate irregulars are more common as adjectives and with the verb estar, while regularized forms are prevalent with ser and in perfect constructions. Conversely, short-form participles occur least frequently in perfect constructions, showing a tendency towards the maintenance of regularity in this context. Additionally, verbs that occur more often in perfect constructions are most resistant to innovation. These findings indicate that perfect constructions play a dual role in promoting and preserving regularity in BP and shed light on how grammar–internal relationships and contexts of occurrence play a role in language variation and change.
Journal of Speech Language and Hearing Research · 2023 · 2 citations
1st author · Corresponding
- Computer Science
- Psychology
- Developmental psychology
PURPOSE: Our study analyzes probabilistic constraints on subject expression previously found in adult Spanish in the speech of typically developing (TD) Spanish-speaking children and children with developmental language disorder (DLD). Previous work shows that children with DLD produce fewer overt subjects than typically developing children, and that the latter acquire constraints on subject expression as they age into adolescence. Our study complements these findings and provides further substance to the grammatical profile of children whose morphosyntactic development diverges from that of typically developing children.
METHOD: Data are drawn from unstructured spontaneous production data from a sample of 19 monolingual Mexican, Spanish-speaking children, collected in 2006-2007. This sample includes 19 children diagnosed with DLD and 19 age-matched, typically developing children. We collected all instances of finite verbs that either did or could have occurred with a subject personal pronoun uttered by the child participants and coded them for several factors including tense-mood-aspect, switch reference, and person and number.
RESULTS: We find that children with DLD produce fewer overt subject pronouns in switch reference contexts than typically developing controls, with a significant interaction of group and switch reference. Furthermore, a discriminant function analysis shows that overt pronoun use in switch reference contexts can form part of a useful diagnostic discriminant function, with high levels of sensitivity and specificity.
CONCLUSIONS: Overall, we find important differences between TD Spanish-speaking children and those diagnosed with DLD regarding rates of overt subjects and sensitivity to the probabilistic constraint of switch reference. This finding contributes to our understanding of the morphosyntactic profiles of children with DLD, as well as the utility of factors such as switch reference in the identification of language disorders.
What Does It Meme? English–Spanish Codeswitching and Enregisterment in Virtual Social Space
Languages · 2023-10-10 · 1 citation
article · Open access · 1st author · Corresponding
This project investigates English–Spanish codeswitching in internet memes posted to the Facebook page, We are mitú (mitú), and analyzes how lexical insertions and quotatives contribute to the enregisterment of linguistic patterns and the construction of collective identity among U.S. Latinx millennials in virtual social spaces. Data include instances of lexical insertion (n = 280) and quotative mixed codes (n = 114) drawn from a collected corpus of 765 image–text memes. The most frequent lexical insertions included food items (e.g., elote and pozole), kinship terms (e.g., abuelita and tía), and culturally specific artifacts or practices (e.g., quinceañera and lotería), which reflect biculturalism and rely on a shared set of references for the construction of a group identity. Additionally, the quotatives in the data construct Spanish-speaking characterological figures that enregister a particular brand of U.S. Latinx millennial identity that includes being bilingual, having Spanish-speaking parents, and having strong ties to Latinx culture. Overall, this work highlights not only internet memes as a vehicle for enregisterment, but also, and more importantly, how the language resources employed within them work to enregister linguistic and cultural norms of U.S. Latinx millennials, and thereby, play a role in identity construction in virtual social spaces.
Speakers' subjective evaluations of direct object pronouns in Brazilian Portuguese
Toronto Working Papers in Linguistics · 2022-05-25 · 2 citations
article · Open access
Direct object pronouns show considerable variation in Brazilian Portuguese, where normative clitic pronouns compete with their tonic counterparts. However, no prior studies have empirically investigated speaker evaluations of these competing pronoun variants. We created a perception experiment on direct object pronouns in spoken Brazilian Portuguese in order to investigate the role of attitudes and social evaluations in language variation. Results from 160 native speakers show broad evaluative differences between clitic and tonic pronouns, while at the same time showing individual differences by pronoun and effects of the context of utterance. We conclude that the role that social evaluation plays in usage preferences in BP should be re-assessed based on studies linking subjective attitudes with grammatical choices.
Frequent coauthors
- 4 shared
Blanca Flores-Ávalos
Instituto Nacional de Rehabilitación
- 4 shared
Ana Arrieta-Zamudio
Universidad Nacional Autónoma de México
- 4 shared
Pedro Antonio Ortiz-Ramírez
Google (United States)
- 4 shared
John Grinstead
Google (United States)
- 3 shared
Scott A. Schwenter
The Ohio State University
- 2 shared
Luana Lamberti
Iowa State University
- 1 shared
Stephanie Antetomaso
The Ohio State University
- 1 shared
Mark Hoff
The Ohio State University
Education
- 2022
PhD, Department of Spanish and Portuguese
The Ohio State University
- 2018
MA, Department of Spanish and Portuguese
The Ohio State University
- 2016
MA, Linguistics
Boston University
- 2011
BA, Department of Spanish, Portuguese, and Italian
University of Illinois at Urbana-Champaign