Owen Rambow

IACS Endowed Chair

Stony Brook University · Linguistics

Active 1988–2025

h-index: 47

Citations: 9.1k

Papers: 281 (33 in last 5y)

Funding—


About

Owen Rambow is the IACS Endowed Chair in the Department of Linguistics at Stony Brook University. His research focuses on natural language processing and computational linguistics, with particular interest in the fine-grained structure of language, such as morphology and syntax, and in how language is used in context. He received a Ph.D. in Computer and Information Sciences from the University of Pennsylvania. He has worked at AT&T Labs – Research, spent 15 years at Columbia University as a research scientist, and worked for three years at Elemental Cognition LLC, a startup dedicated to developing software for deep language understanding. At Columbia, he was part of the Center for Computational Learning Systems and co-founded CADIM, a research group specializing in Arabic natural language processing, which licenses advanced NLP tools. His group has also released several resources, including a richly annotated version of the Enron email corpus. He has published extensively in top conferences and journals. He has served as Chair of the North American Chapter of the Association for Computational Linguistics, as program co-chair of the NAACL HLT 2016 conference, and as program committee chair or senior program committee member for numerous conferences and workshops.

Research topics

Sociology
Artificial Intelligence
Computer Science
Natural Language Processing
Algorithm
Human–computer interaction
Psychology
Epistemology
Communication

Selected publications

Residualized Similarity for Faithfully Explainable Authorship Verification
2025-01-01
article · Open access
Responsible use of authorship verification (AV) systems requires not only high accuracy but also interpretable solutions. Specifically, for systems to be deployed in contexts where decisions have real-world consequences, their predictions must be explainable through interpretable features that can be traced to the original text. Neural methods achieve high accuracies, but their representations lack direct interpretability. Furthermore, LLM predictions cannot be explained faithfully: if there is an explanation given for a prediction, it doesn't represent the reasoning process behind the model's prediction. To address this gap, we introduce residualized similarity (RS), a novel method that supplements systems using interpretable features with a neural network to improve their performance while maintaining interpretability. Authorship verification is fundamentally a similarity task, where the goal is to measure how likely two documents are to be written by the same author. The key idea is to use a neural network to predict a residual similarity, i.e. the error in the similarity predicted by the interpretable system. Our evaluation across four datasets shows that not only can we match the performance of state-of-the-art authorship verification models, but we can show how and to what degree the final prediction is faithful and interpretable.
LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing
2025-01-01 · 1 citation
article · Open access · Senior author
The paper explores the performance of LLMs in the context of multi-dimensional analytic writing assessments, i.e. their ability to provide both scores and comments based on multiple assessment criteria. Using a corpus of literature reviews written by L2 graduate students and assessed by human experts against 9 analytic criteria, we prompt several popular LLMs to perform the same task under various conditions. To evaluate the quality of feedback comments, we apply a novel feedback comment quality evaluation framework. This framework is interpretable, cost-efficient, scalable, and reproducible, compared to existing methods that rely on manual judgments. We find that LLMs can generate reasonably good and generally reliable multi-dimensional analytic assessments. We release our corpus and code for reproducibility.
Synthetic Audio Helps for Cognitive State Tasks
ArXiv.org · 2025-02-10
preprint · Open access · Senior author
The NLP community has broadly focused on text-only approaches to cognitive state tasks, but audio can provide vital missing cues through prosody. We posit that text-to-speech models learn to track aspects of cognitive state in order to produce naturalistic audio, and that the signal audio models implicitly identify is orthogonal to the information that language models exploit. We present Synthetic Audio Data fine-tuning (SAD), a framework where we show that 7 tasks related to cognitive state modeling benefit from multimodal training on both text and zero-shot synthetic audio data from an off-the-shelf TTS system. We show an improvement over the text-only modality when adding synthetic audio data to text-only corpora. Furthermore, on tasks and corpora that do contain gold audio, we show our SAD framework achieves competitive performance with text and synthetic audio compared to text and gold audio.
LVLMs are Bad at Overhearing Human Referential Communication
2025-01-01
article · Open access
During conversation, speakers collaborate on spontaneous referring expressions, which they can then re-use in subsequent conversation with the same partner. Understanding such referring expressions is an important ability for an embodied agent so that it can carry out tasks in the real world. This requires integrating and understanding language, vision, and conversational interaction. We study the capabilities of seven state-of-the-art Large Vision Language Models (LVLMs) as overhearers to a corpus of spontaneous conversations between pairs of human discourse participants engaged in a collaborative object-matching task. We find that such a task remains challenging for current LVLMs, which fail to show a consistent performance improvement as they overhear more conversations from the same discourse participants repeating the same task for multiple rounds. We release our corpus and code for reproducibility and to facilitate future research.
Active Few-Shot Learning for Text Classification
ArXiv.org · 2025-02-26 · 1 citation
preprint · Open access
The rise of Large Language Models (LLMs) has boosted the use of Few-Shot Learning (FSL) methods in natural language processing, achieving acceptable performance even when working with limited training data. The goal of FSL is to effectively utilize a small number of annotated samples in the learning process. However, the performance of FSL suffers when unsuitable support samples are chosen. This problem arises due to the heavy reliance on a limited number of support samples, which hampers consistent performance improvement even when more support samples are added. To address this challenge, we propose an active learning-based instance selection mechanism that identifies effective support instances from the unlabeled pool and can work with different LLMs. Our experiments on five tasks show that our method frequently improves the performance of FSL. We make our implementation available on GitHub.
Synthetic Audio Helps for Cognitive State Tasks
2025-01-01
article · Open access · Senior author
Automatically recognizing a human's complete cognitive state from text is a difficult task; from text, a model has to recognize a combination of concepts including belief, emotion, common ground, sentiment, and intention. Humans do not only track and update cognitive state from the meaning of words and sentences, but also from paralinguistic cues such as prosody. The NLP community has broadly focused on text-only approaches to cognitive state tasks, but audio can provide vital missing information. We posit that text-to-speech (TTS) models learn to track aspects of cognitive state in order to produce naturalistic audio, and that the signal audio models implicitly identify is orthogonal to the information that language models exploit. We present Synthetic Audio Data fine-tuning (SAD), a framework where we show that seven tasks related to cognitive state modeling benefit from multimodal training on both text and zero-shot synthetic audio data from an off-the-shelf TTS system. We show an improvement over the text-only modality when adding synthetic audio data to text-only corpora. Furthermore, on tasks and corpora that do contain gold audio, we show our SAD framework achieves competitive performance using text and synthetic audio compared to text and gold audio.
Exploring Limitations of LLM Capabilities with Multi-Problem Evaluation
2025-01-01 · 3 citations
article · Open access · Senior author
We propose using prompts made up of multiple problems to evaluate LLM capabilities, an approach we call multi-problem evaluation. We examine 7 LLMs on 4 related task types constructed from 6 existing classification benchmarks. We find that while LLMs can generally perform multiple homogeneous classifications at once (Batch Classification) as well as when they do so separately, they perform significantly worse on two selection tasks that are conceptually equivalent to Batch Classification and involve selecting indices of text falling into each class label, either independently or altogether. We show that such a significant performance drop is due to LLMs' inability to adequately combine index selection with text classification. Such a drop is surprisingly observed across all LLMs attested, under zero-shot, few-shot, and CoT settings, and even with a novel synthetic dataset, potentially reflecting an inherent capability limitation with modern LLMs.
Active Few-Shot Learning for Text Classification
2025-01-01 · 4 citations
article · Open access
Saeed Ahmadnia, Arash Yousefi Jordehi, Mahsa Hosseini Khasheh Heyran, Seyed Abolghasem Mirroshandel, Owen Rambow, Cornelia Caragea. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2025.
LVLMs are Bad at Overhearing Human Referential Communication
ArXiv.org · 2025-09-15
preprint · Open access
During spontaneous conversations, speakers collaborate on novel referring expressions, which they can then re-use in subsequent conversations. Understanding such referring expressions is an important ability for an embodied agent, so that it can carry out tasks in the real world. This requires integrating and understanding language, vision, and conversational interaction. We study the capabilities of seven state-of-the-art Large Vision Language Models (LVLMs) as overhearers to a corpus of spontaneous conversations between pairs of human discourse participants engaged in a collaborative object-matching task. We find that such a task remains challenging for current LVLMs and they all fail to show a consistent performance improvement as they overhear more conversations from the same discourse participants repeating the same task for multiple rounds. We release our corpus and code for reproducibility and to facilitate future research.
LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing
ArXiv.org · 2025-02-17
preprint · Open access · Senior author
The paper explores the performance of LLMs in the context of multi-dimensional analytic writing assessments, i.e. their ability to provide both scores and comments based on multiple assessment criteria. Using a corpus of literature reviews written by L2 graduate students and assessed by human experts against 9 analytic criteria, we prompt several popular LLMs to perform the same task under various conditions. To evaluate the quality of feedback comments, we apply a novel feedback comment quality evaluation framework. This framework is interpretable, cost-efficient, scalable, and reproducible, compared to existing methods that rely on manual judgments. We find that LLMs can generate reasonably good and generally reliable multi-dimensional analytic assessments. We release our corpus and code for reproducibility.

Frequent coauthors

Nizar Habash
52 shared
Mona Diab
Carnegie Mellon University
26 shared
Vinodkumar Prabhakaran
23 shared
Alexis Nasr
17 shared
Jungo Kasai
Toyota Technological Institute at Chicago
15 shared
Robert Frank
15 shared
Ramy Eskander
15 shared
Srinivas Bangalore
14 shared

Education

Ph.D., Computer and Information Sciences
University of Pennsylvania
M.S., Computer Science
University of California, San Diego
1997
B.S., Computer Science
University of California, San Diego
1995
