About
I'm an Associate Professor at the Rutgers Department of Linguistics. My research uses computational and mathematical approaches to study fundamental questions in phonological theory, focusing on formal language theory, learning/grammatical inference, model theory/logic, representation, tone/pitch accent, and long-distance processes. I have a particular interest in Bantu, Japonic, and Austronesian languages.
Research signals
Five dimensions sourced from public faculty / publication signals.
Research topics
- Computer Science
- Programming language
- Artificial Intelligence
- Linguistics
- Natural Language Processing
- Philosophy
- Algorithm
- Theoretical computer science
- Discrete mathematics
- Mathematics
Selected publications
InductionBench: LLMs Fail in the Simplest Complexity Class
ArXiv.org · 2025-02-20
preprint · Open access
Large language models (LLMs) have shown remarkable improvements in reasoning and many existing benchmarks have been addressed by models such as o1 and o3 either fully or partially. However, a majority of these benchmarks emphasize deductive reasoning, including mathematical and coding tasks in which rules such as mathematical axioms or programming syntax are clearly defined, based on which LLMs can plan and apply these rules to arrive at a solution. In contrast, inductive reasoning, where one infers the underlying rules from observed data, remains less explored. Such inductive processes lie at the heart of scientific discovery, as they enable researchers to extract general principles from empirical observations. To assess whether LLMs possess this capacity, we introduce InductionBench, a new benchmark designed to evaluate the inductive reasoning ability of LLMs. Our experimental findings reveal that even the most advanced models available struggle to master the simplest complexity classes within the subregular hierarchy of functions, highlighting a notable deficiency in current LLMs' inductive reasoning capabilities. Code and data are available at https://github.com/Wenyueh/inductive_reasoning_benchmark.
InductionBench: LLMs Fail in the Simplest Complexity Class
2025-01-01
article · Open access
Large language models (LLMs) have shown remarkable improvements in reasoning and many existing benchmarks have been addressed by models such as o1 and o3 either fully or partially. However, a majority of these benchmarks emphasize deductive reasoning, including mathematical and coding tasks in which rules such as mathematical axioms or programming syntax are clearly defined, based on which LLMs can plan and apply these rules to arrive at a solution. In contrast, inductive reasoning, where one infers the underlying rules from observed data, remains less explored. Such inductive processes lie at the heart of scientific discovery, as they enable researchers to extract general principles from empirical observations. To assess whether LLMs possess this capacity, we introduce InductionBench, a new benchmark designed to evaluate the inductive reasoning ability of LLMs. Our experimental findings reveal that even the most advanced models available struggle to master the simplest complexity classes within the subregular hierarchy of functions, highlighting a notable deficiency in current LLMs' inductive reasoning capabilities. Code and data are available at https://github.com/wenyueh/inductive_reasoning_benchmark.
Locality and non-linear representations in tonal phonology
Library, Museums and Press - UDSpace (University of Delaware) · 2024-06-04 · 92 citations
dissertation · Open access · 1st author · Corresponding
This dissertation provides support for the hypothesis that surface well-formedness in phonological tone patterns is governed by language-specific, local constraints over autosegmental representations. The particular notion of locality invoked in this dissertation is that of banned substructure constraints, which are drawn from the theory of computation, formal language theory, and formal learning theory (McNaughton and Papert, 1971; García et al., 1990; Rogers et al., 2013). Essentially, any pattern describable with such constraints is local because the well-formedness of a structure with respect to the pattern is based entirely on its composite substructures of a fixed size. The primary novel contribution of the current work is to extend this notion of computational locality from strings to autosegmental structures by way of mathematical graph theory, and to develop a theory of tonal well-formedness based in banned substructure constraints over autosegmental representations. Through analyses of attested edge-based, quality-specific, and positional tone association patterns, as well as long-distance patterns, it is shown that such a theory can describe a range of major tonal generalizations, including ones beyond the power of both string-based local theories and standard explanations of tone in Optimality Theory. Furthermore, a local theory of constraints excludes unattested patterns requiring global evaluation that are predicted by other theories. Finally, it is discussed how banned substructure constraints can be connected to a restrictive theory of phonological input/output generalizations, and that there is a method for learning them. A secondary contribution of this dissertation is to show that autosegmental representations are string-like in that they can be derived through the concatenation of graph primitives.
Essentially, important properties of autosegmental representations can be seen as emerging from the concatenation of a finite alphabet of primitives, just as strings are built out of a finite alphabet of symbols. This novel approach to defining autosegmental representations not only makes the correct empirical prediction that languages cannot have unbounded ‘contouring’, it also allows for direct comparison of autosegmental grammars to string grammars. It is also shown how this notion of concatenation can be recruited for understanding input/output generalizations, and how it can be used to learn autosegmental grammars from string inputs.
Rational functions via recursive schemes
arXiv (Cornell University) · 2023-02-06
preprint · Open access · Senior author
We give a new characterization of the class of rational string functions from formal language theory using order-preserving interpretations with respect to a very weak monadic programming language. This refines the known characterization of rational functions by order-preserving MSO interpretations.
Computing Process-Specific Constraints
Linguistic Inquiry · 2023-04-27 · 1 citation
article · 1st author · Corresponding
This squib demonstrates how process-specific constraints, in which related but distinct processes in a language can be subject to differing conditions, can be captured with Boolean monadic recursive schemes (BMRSs), a computational formalism for phonological analysis based in mathematical logic. We use the case study of pharyngeal harmony in Palestinian Arabic, which motivated a discussion between Davis (1995) and McCarthy (1997) about the relative advantages and drawbacks of rule-based and Optimality Theory frameworks. We show how BMRS grammars naturally capture process-specific constraints in a way comparable to OT, while still being more computationally restrictive.
Stanley v Finnegan: Child Abuse and Bad Medicine.
PubMed · 2022-12-01
article · 1st author · Corresponding
In April 2020 American President Donald Trump publicly stated that consuming disinfectant could cure COVID-19. This apparently shocking statement was not so shocking to many: some people believe that consuming Miracle Mineral Solution (MMS), a name for chlorine dioxide, an industrial bleach, can cure many illnesses. This article is a case note about Stanley v Finnegan, 447 F Supp 3d 771, 777 (WD Ark, 2020), in which parents sued their local county and sheriff in Arkansas for taking their children away after they encouraged their children to consume MMS. This case is particularly important in the current COVID-19 world.
Phonological theory and computational modelling
2022-03-24 · 1 citation
book-chapter · Senior author
The computational modelling of phonology is almost as old as generative phonology itself. Johnson (1972) and Kaplan & Kay (1981, 1994) showed that SPE rules can be modelled with finite-state machines, after which finite-state modelling became the bedrock of computational phonology, eventually informing computational autosegmental phonology (e.g., Kornai 1991, 1995) and Optimality Theory (e.g., Ellison 1994). More recently, computational modelling has informed phonological theory, following two strands of research: stochastic learning from corpus data and gradient acceptability judgements (e.g., Hayes & Wilson 2008; Albright 2009), and the study of the computational nature of phonological patterns (e.g., Heinz 2007). One can expect that the role computational modelling plays in the explanation of phonological cognition and learning will only increase.
Input and output locality and representation
Glossa: a journal of general linguistics · 2021-04-08 · 1 citation
article · Open access · Senior author
Using a rigorous, computational notion of locality, this paper evaluates one of the central motivations for autosegmental representations (ARs): that they reduce long-distance processes to local ones. We analyze a variety of tone processes using two computational notions of locality: input strict locality as defined by Chandlee (2014) and Chandlee & Jardine (2019a) and a corresponding notion of output strict locality we call recursive strict locality. The results of our survey add to our understanding of the typology of tone patterns and the role of ARs in two key ways. First, they indicate that both input and output locality play a role in a comprehensive theory of tone. Second, they reveal the various mechanisms by which ARs render long-distance patterns local by disentangling the various properties these representations combine. The larger contribution then is a more detailed and nuanced exploration of the interaction of representation, locality, and computational complexity in the domain of tonal phonology.
Computational Universals in Linguistic Theory: Using Recursive Programs for Phonological Analysis
Language · 2021 · 7 citations
Senior author · Corresponding
- Computer Science
- Natural Language Processing
This article presents BOOLEAN MONADIC RECURSIVE SCHEMES (BMRSs), adapted from the mathematical study of computation, as a phonological theory that both explains the observed computational properties of phonological patterns and directly captures phonological substance and linguistically significant generalizations. BMRSs consist of structures defined as logical predicates and situated in an ‘if ... then ... else’ syntax in such a way that they variably license or block the features that surface in particular contexts. Three case studies are presented to demonstrate how these grammars (i) express conflicting pressures in a language, (ii) naturally derive elsewhere condition effects, and (iii) capture typologies of repairs for marked structures.
The Computational Similarity of Binding and Long-Distance Consonant Dissimilation
ScholarlyCommons (University of Pennsylvania) · 2021-01-01
article · Open access · Senior author
This work shows that binding patterns are computationally similar to long-distance consonant dissimilation. From a computational point of view, phonological patterns have long been hypothesized to be regular. More recent work has suggested this holds for syntax as well, given the correct representation. By examining binding conditions from a morpho-syntactic transformational point of view, we show that binding conditions can be logically characterized in a way parallel to long-distance consonant dissimilation. The similarity shows that binding patterns as transformations fall into a subsequential class, a subregular class of transformations which is considered to capture a great deal of segmental phonological processes. This result adds further support to the subregular hypothesis for syntax.
Frequent coauthors
- Jane Chandlee · Haverford College · 15 shared
- Jeffrey Heinz · Stony Brook University · 12 shared
- R. Eyraud · Laboratoire Hubert Curien · 5 shared
- Jonathan Rawski · San Jose State University · 4 shared
- Prasanna Kannappan · 3 shared
- Herbert G. Tanner · University of Delaware · 3 shared
- Nate Koser · Rutgers, The State University of New Jersey · 3 shared
- Christopher Oakden · University of California, Berkeley · 2 shared
Education
- 2016 · PhD, Linguistics and Cognitive Science · University of Delaware
See your match with Adam Jardine