Robert Berwick

Massachusetts Institute of Technology · Electrical Engineering & Computer Science

Active 1975–2025

h-index: 39
Citations: 8.3k
Papers: 184 (13 in the last 5 years)

About

Robert Berwick is a Professor of Computer Science and Engineering and Computational Linguistics at MIT. His research focuses on the intersection of artificial intelligence, natural language processing, and systems that interact with the external environment through perception, communication, and action. He is involved in developing techniques for the analysis and synthesis of systems that learn, make decisions, and adapt to changing environments. As a faculty member, Berwick contributes to advancing the understanding of AI and decision-making, leveraging computational, theoretical, and experimental tools to address complex challenges in these fields. His work emphasizes the development of systems that can process language and other modalities, contributing to the broader goals of artificial intelligence and machine learning.

Research topics

  • Information Retrieval
  • Computer Science
  • Artificial Intelligence
  • Linguistics
  • Psychology
  • Philosophy
  • Cognitive science

Selected publications

  • Mathematical Structure of Syntactic Merge

    The MIT Press eBooks · 2025-08-05 · 8 citations

    book · Open access · Senior author

    A mathematical formalization of Chomsky’s theory of Merge in generative linguistics. The Minimalist Program advanced by Noam Chomsky thirty years ago, focusing on the biological nature of human language, has played a central role in our modern understanding of syntax. One key to this program is the notion that the hierarchical structure of human language syntax consists of a single operation Merge. For the first time, Mathematical Structure of Syntactic Merge presents a complete and precise mathematical formalization of Chomsky’s most recent theory of Merge. It both furnishes a new way to explore Merge’s important linguistic implications clearly while also laying to rest any fears that the Minimalist framework based on Merge might itself prove to be formally incoherent. In this book, Matilde Marcolli, Noam Chomsky, and Robert C. Berwick prove that Merge can be described as a very particular kind of highly structured algebra. Additionally, the book shows how Merge can be placed within a consistent framework that includes both a syntactic-semantic interface that realizes Chomsky’s notion of a conceptual-intentional interface, and an externalization system that realizes language-specific constraints. The syntax-semantics interface encompasses many current semantical theories and offers deep insights into the ways that modern “large language models” work, proving that these do not undermine in any way the scientific theories of language based on generative grammar.

  • Encoding syntactic objects and Merge operations in function spaces

    ArXiv.org · 2025-07-17 · 1 citation

    preprint · Open access · Senior author

    We provide a mathematical argument showing that, given a representation of lexical items as functions (wavelets, for instance) in some function space, it is possible to construct a faithful representation of arbitrary syntactic objects in the same function space. This space can be endowed with a commutative non-associative semiring structure built using the second Renyi entropy. The resulting representation of syntactic objects is compatible with the magma structure. The resulting set of functions is an algebra over an operad, where the operations in the operad model circuits that transform the input wave forms into a combined output that encodes the syntactic structure. The action of Merge on workspaces is faithfully implemented as action on these circuits, through a coproduct and a Hopf algebra Markov chain. The results obtained here provide a constructive argument showing the theoretical possibility of a neurocomputational realization of the core computational structure of syntax. We also present a particular case of this general construction where this type of realization of Merge is implemented as a cross frequency phase synchronization on sinusoidal waves. This also shows that Merge can be expressed in terms of the successor function of a semiring, thus clarifying the well known observation of its similarities with the successor function of arithmetic.

  • Redefining Measures of Career Success

    Journal of Student Affairs Inquiry Improvement and Impact · 2025-07-27

    article · Open access · 1st author · Corresponding

    Traditional measures of career success—primarily salary and job titles—offer a limited and often misleading view of post-graduation outcomes. These narrow metrics fail to capture the complexity of career trajectories and provide little actionable insight for institutions seeking to improve student preparedness. This paper advocates for a holistic approach to measuring career success by incorporating objective indicators, such as cost of living and industry trends, and subjective measures, such as alumni perceptions of job satisfaction and career fulfillment. Examples and strategies for measuring career success beyond salary and first-destination outcomes are provided. Lessons learned from collecting these measures are shared, including leadership commitment, community building, stakeholder engagement, and the use of technology and analytics. Additionally, it is important to integrate data collection into curricula, foster industry collaboration, and establish feedback loops to align academic programs with workforce needs. By redefining career success beyond traditional metrics, this study offers a framework for institutions to assess and enhance graduate outcomes more effectively in an evolving job market.

  • Parallel Algorithms for Exact Enumeration of Deep Neural Network Activation Regions

    arXiv (Cornell University) · 2024-02-29

    preprint · Open access

    A feedforward neural network using rectified linear units constructs a mapping from inputs to outputs by partitioning its input space into a set of convex regions where points within a region share a single affine transformation. In order to understand how neural networks work, when and why they fail, and how they compare to biological intelligence, we need to understand the organization and formation of these regions. Step one is to design and implement algorithms for exact region enumeration in networks beyond toy examples. In this work, we present parallel algorithms for exact enumeration in deep (and shallow) neural networks. Our work has three main contributions: (1) we present a novel algorithm framework and parallel algorithms for region enumeration; (2) we implement one of our algorithms on a variety of network architectures and experimentally show how the number of regions dictates runtime; and (3) we show, using our algorithm's output, how the dimension of a region's affine transformation impacts further partitioning of the region by deeper layers. To our knowledge, we run our implemented algorithm on networks larger than all of the networks used in the existing region enumeration literature. Further, we experimentally demonstrate the importance of parallelism for region enumeration of any reasonably sized network.

  • Merge and the Strong Minimalist Thesis

    2023 · 70 citations

    • Computer Science
    • Artificial Intelligence

    The goal of this contribution to the Elements series is to closely examine Merge, its form, its function, and its central role in current linguistic theory. It explores what it does (and does not do), why it has the form it has, and its development over time. The basic idea behind Merge is quite simple. However, Merge interacts, in intricate ways, with other components including the language's interfaces, laws of nature, and certain language-specific conditions. Because of this, and because of its fundamental place in the human faculty of language, this Element's focus on Merge provides insights into the goals and development of generative grammar more generally, and its prospects for the future.

  • Mathematical Structure of Syntactic Merge

    arXiv (Cornell University) · 2023-05-29 · 5 citations

    preprint · Open access · Senior author

    The syntactic Merge operation of the Minimalist Program in linguistics can be described mathematically in terms of Hopf algebras, with a formalism similar to the one arising in the physics of renormalization. This mathematical formulation of Merge has good descriptive power, as phenomena empirically observed in linguistics can be justified from simple mathematical arguments. It also provides a possible mathematical model for externalization and for the role of syntactic parameters.

  • Old and New Minimalism: a Hopf algebra comparison

    arXiv (Cornell University) · 2023-06-17 · 3 citations

    preprint · Open access

    In this paper we compare some old formulations of Minimalism, in particular Stabler's computational minimalism, and Chomsky's new formulation of Merge and Minimalism, from the point of view of their mathematical description in terms of Hopf algebras. We show that the newer formulation has a clear advantage purely in terms of the underlying mathematical structure. More precisely, in the case of Stabler's computational minimalism, External Merge can be described in terms of a partially defined operated algebra with binary operation, while Internal Merge determines a system of right-ideal coideals of the Loday-Ronco Hopf algebra and corresponding right-module coalgebra quotients. This mathematical structure shows that Internal and External Merge have significantly different roles in the old formulations of Minimalism, and they are more difficult to reconcile as facets of a single algebraic operation, as desirable linguistically. On the other hand, we show that the newer formulation of Minimalism naturally carries a Hopf algebra structure where Internal and External Merge directly arise from the same operation. We also compare, at the level of algebraic properties, the externalization model of the new Minimalism with proposals for assignments of planar embeddings based on heads of trees.

  • Syntax-semantics interface: an algebraic model

    arXiv (Cornell University) · 2023-11-10 · 2 citations

    preprint · Open access

    We extend our formulation of Merge and Minimalism in terms of Hopf algebras to an algebraic model of a syntactic-semantic interface. We show that methods adopted in the formulation of renormalization (extraction of meaningful physical values) in theoretical physics are relevant to describe the extraction of meaning from syntactic expressions. We show how this formulation relates to computational models of semantics and we answer some recent controversies about implications for generative linguistics of the current functioning of large language models.

  • The odyssey to next-generation computers: cognitive computers (κC) inspired by the brain and powered by intelligent mathematics

    Frontiers in Computer Science · 2023-05-19 · 4 citations

    article · Open access

    Cognitive computers (κC) are intelligent processors advanced from data and information processing to autonomous knowledge learning and intelligence generation. This work presents a retrospective and prospective review of the odyssey toward κC empowered by transdisciplinary basic research and engineering advances. A wide range of fundamental theories and innovative technologies for κC is explored, and a set of underpinning intelligent mathematics (IM) is created. The architectures of κC for cognitive computing and Autonomous Intelligence Generation (AIG) are designed as a brain-inspired cognitive engine. Applications of κC in autonomous AI (AAI) are demonstrated by pilot projects. This work reveals that AIG will no longer be a privilege restricted only to humans via the odyssey to κC toward training-free and self-inferencing computers.

  • The Failure of Deep Neural Networks to Capture Human Language’s Cognitive Core

    2021-10-29 · 3 citations

    article · 1st author · Corresponding

    Current deep neural networks have made remarkable advances in their ability to analyze and use natural language, with great apparent engineering success. But how well do these systems mirror the cognitive constraints associated with human language? In this talk we show that there are three essential core computations that characterize human language as an engine of human thought. One is "digital infinity" – the fact that we can produce an open-ended countably infinite number of sentences. The second is that sentences are hierarchically structured, rather than being arranged in a linear array. The third property is that human language computations always admit the possibility of "displacement" – a word or phrase can be pronounced at a place distinct from its usual location of semantic interpretation. All three properties can be shown to follow from a single, simple, recursive combinatorial operation. We provide empirical evidence for all three properties, both from concrete developmental examples as well as psycholinguistic and brain imaging experiments.

    What about current "deep neural network" systems? Although they perform very well after large-scale training, their success appears to be grounded on accurate table-lookup – memorization – without truly mirroring the three key computational principles of human language cognition. By "stress testing" currently available deep neural network processors, we show that they are, perhaps surprisingly, very fragile when presented even with simple examples that deviate modestly from the examples on which they were trained. In particular, they fail to properly represent hierarchical structure and they cannot reliably reconstruct examples of sentences with "displacement" if the examples go just a bit beyond the complexity of their training set data. For example, while a deep neural network system might work on "Which cookie did Bob want," it fails on, "Which cookie did Bob want to eat." Such failures indicate that the neural net systems have not generalized in the same sense that children do, since children can easily handle such examples after receiving much more limited training data.
