Owen Rambow · IACS Endowed Chair

Stony Brook University · Linguistics

Active 1988–2025

h-index: 47
Citations: 9.1k
Papers: 281 (33 in the last 5 years)

About

Owen Rambow is the IACS Endowed Chair in the Department of Linguistics at Stony Brook University. His research focuses on natural language processing and computational linguistics, with particular interest in the fine-grained structure of language, such as morphology and syntax, and in how language is used in context. He received a Ph.D. in Computer and Information Sciences from the University of Pennsylvania. He worked at AT&T Labs - Research, spent 15 years at Columbia University as a research scientist, and worked for three years at Elemental Cognition LLC, a startup dedicated to developing software for deep language understanding.

At Columbia, he was part of the Center for Computational Learning Systems and co-founded CADIM, a research group specializing in Arabic natural language processing, which licenses advanced NLP tools. His group has also released several resources, including a richly annotated version of the Enron email corpus. He has published extensively in top conferences and journals. He has served as Chair of the North American chapter of the Association for Computational Linguistics and as program co-chair of the NAACL HLT 2016 conference, and has been program committee chair or senior program committee member for numerous conferences and workshops.

Research topics

  • Sociology
  • Artificial Intelligence
  • Computer Science
  • Natural Language Processing
  • Algorithm
  • Human–computer interaction
  • Psychology
  • Epistemology
  • Communication

Selected publications

  • Residualized Similarity for Faithfully Explainable Authorship Verification

    2025-01-01

    article · Open access

    Responsible use of authorship verification (AV) systems requires not only high accuracy but also interpretable solutions. Specifically, for systems to be deployed in contexts where decisions have real-world consequences, their predictions must be explainable through interpretable features that can be traced to the original text. Neural methods achieve high accuracies, but their representations lack direct interpretability. Furthermore, LLM predictions cannot be explained faithfully: if there is an explanation given for a prediction, it doesn't represent the reasoning process behind the model's prediction. To address this gap, we introduce residualized similarity (RS), a novel method that supplements systems using interpretable features with a neural network to improve their performance while maintaining interpretability. Authorship verification is fundamentally a similarity task, where the goal is to measure how likely two documents are to be written by the same author. The key idea is to use a neural network to predict a residual similarity, i.e. the error in the similarity predicted by the interpretable system. Our evaluation across four datasets shows that not only can we match the performance of state-of-the-art authorship verification models, but we can show how and to what degree the final prediction is faithful and interpretable.

  • LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing

    2025-01-01 · 1 citation

    article · Open access · Senior author

    The paper explores the performance of LLMs in the context of multi-dimensional analytic writing assessments, i.e. their ability to provide both scores and comments based on multiple assessment criteria. Using a corpus of literature reviews written by L2 graduate students and assessed by human experts against 9 analytic criteria, we prompt several popular LLMs to perform the same task under various conditions. To evaluate the quality of feedback comments, we apply a novel feedback comment quality evaluation framework. This framework is interpretable, cost-efficient, scalable, and reproducible, compared to existing methods that rely on manual judgments. We find that LLMs can generate reasonably good and generally reliable multi-dimensional analytic assessments. We release our corpus and code for reproducibility.

  • Synthetic Audio Helps for Cognitive State Tasks

    ArXiv.org · 2025-02-10

    preprint · Open access · Senior author

    The NLP community has broadly focused on text-only approaches of cognitive state tasks, but audio can provide vital missing cues through prosody. We posit that text-to-speech models learn to track aspects of cognitive state in order to produce naturalistic audio, and that the signal audio models implicitly identify is orthogonal to the information that language models exploit. We present Synthetic Audio Data fine-tuning (SAD), a framework where we show that 7 tasks related to cognitive state modeling benefit from multimodal training on both text and zero-shot synthetic audio data from an off-the-shelf TTS system. We show an improvement over the text-only modality when adding synthetic audio data to text-only corpora. Furthermore, on tasks and corpora that do contain gold audio, we show our SAD framework achieves competitive performance with text and synthetic audio compared to text and gold audio.

  • LVLMs are Bad at Overhearing Human Referential Communication

    2025-01-01

    article · Open access

    During conversation, speakers collaborate on spontaneous referring expressions, which they can then re-use in subsequent conversation with the same partner. Understanding such referring expressions is an important ability for an embodied agent so that it can carry out tasks in the real world. This requires integrating and understanding language, vision, and conversational interaction. We study the capabilities of seven state-of-the-art Large Vision Language Models (LVLMs) as overhearers to a corpus of spontaneous conversations between pairs of human discourse participants engaged in a collaborative object-matching task. We find that such a task remains challenging for current LVLMs, which fail to show a consistent performance improvement as they overhear more conversations from the same discourse participants repeating the same task for multiple rounds. We release our corpus and code for reproducibility and to facilitate future research.

  • Active Few-Shot Learning for Text Classification

    ArXiv.org · 2025-02-26 · 1 citation

    preprint · Open access

    The rise of Large Language Models (LLMs) has boosted the use of Few-Shot Learning (FSL) methods in natural language processing, achieving acceptable performance even when working with limited training data. The goal of FSL is to effectively utilize a small number of annotated samples in the learning process. However, the performance of FSL suffers when unsuitable support samples are chosen. This problem arises due to the heavy reliance on a limited number of support samples, which hampers consistent performance improvement even when more support samples are added. To address this challenge, we propose an active learning-based instance selection mechanism that identifies effective support instances from the unlabeled pool and can work with different LLMs. Our experiments on five tasks show that our method frequently improves the performance of FSL. We make our implementation available on GitHub.

  • Synthetic Audio Helps for Cognitive State Tasks

    2025-01-01

    article · Open access · Senior author

    Automatically recognizing a human's complete cognitive state from text is a difficult task; from text, a model has to recognize a combination of concepts including belief, emotion, common ground, sentiment, and intention. Humans do not only track and update cognitive state from the meaning of words and sentences, but also from paralinguistic cues such as prosody. The NLP community has broadly focused on text-only approaches to cognitive state tasks, but audio can provide vital missing information. We posit that text-to-speech (TTS) models learn to track aspects of cognitive state in order to produce naturalistic audio, and that the signal audio models implicitly identify is orthogonal to the information that language models exploit. We present Synthetic Audio Data fine-tuning (SAD), a framework where we show that seven tasks related to cognitive state modeling benefit from multimodal training on both text and zero-shot synthetic audio data from an off-the-shelf TTS system. We show an improvement over the text-only modality when adding synthetic audio data to text-only corpora. Furthermore, on tasks and corpora that do contain gold audio, we show our SAD framework achieves competitive performance using text and synthetic audio compared to text and gold audio.

  • Exploring Limitations of LLM Capabilities with Multi-Problem Evaluation

    2025-01-01 · 3 citations

    article · Open access · Senior author

    We propose using prompts made up of multiple problems to evaluate LLM capabilities, an approach we call multi-problem evaluation. We examine 7 LLMs on 4 related task types constructed from 6 existing classification benchmarks. We find that while LLMs can generally perform multiple homogeneous classifications at once (Batch Classification) as well as when they do so separately, they perform significantly worse on two selection tasks that are conceptually equivalent to Batch Classification and involve selecting indices of text falling into each class label, either independently or altogether. We show that such a significant performance drop is due to LLMs' inability to adequately combine index selection with text classification. Such a drop is surprisingly observed across all LLMs attested, under zero-shot, few-shot, and CoT settings, and even with a novel synthetic dataset, potentially reflecting an inherent capability limitation with modern LLMs.

  • Active Few-Shot Learning for Text Classification

    2025-01-01 · 4 citations

    article · Open access

    Saeed Ahmadnia, Arash Yousefi Jordehi, Mahsa Hosseini Khasheh Heyran, Seyed Abolghasem Mirroshandel, Owen Rambow, Cornelia Caragea. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2025.

  • LVLMs are Bad at Overhearing Human Referential Communication

    ArXiv.org · 2025-09-15

    preprint · Open access

    During spontaneous conversations, speakers collaborate on novel referring expressions, which they can then re-use in subsequent conversations. Understanding such referring expressions is an important ability for an embodied agent, so that it can carry out tasks in the real world. This requires integrating and understanding language, vision, and conversational interaction. We study the capabilities of seven state-of-the-art Large Vision Language Models (LVLMs) as overhearers to a corpus of spontaneous conversations between pairs of human discourse participants engaged in a collaborative object-matching task. We find that such a task remains challenging for current LVLMs and they all fail to show a consistent performance improvement as they overhear more conversations from the same discourse participants repeating the same task for multiple rounds. We release our corpus and code for reproducibility and to facilitate future research.

  • LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing

    ArXiv.org · 2025-02-17

    preprint · Open access · Senior author

    The paper explores the performance of LLMs in the context of multi-dimensional analytic writing assessments, i.e. their ability to provide both scores and comments based on multiple assessment criteria. Using a corpus of literature reviews written by L2 graduate students and assessed by human experts against 9 analytic criteria, we prompt several popular LLMs to perform the same task under various conditions. To evaluate the quality of feedback comments, we apply a novel feedback comment quality evaluation framework. This framework is interpretable, cost-efficient, scalable, and reproducible, compared to existing methods that rely on manual judgments. We find that LLMs can generate reasonably good and generally reliable multi-dimensional analytic assessments. We release our corpus and code for reproducibility.

Frequent coauthors

  • Nizar Habash

    52 shared
  • Mona Diab

    Carnegie Mellon University

    26 shared
  • Vinodkumar Prabhakaran

    23 shared
  • Alexis Nasr

    17 shared
  • Jungo Kasai

    Toyota Technological Institute at Chicago

    15 shared
  • Robert Frank

    15 shared
  • Ramy Eskander

    15 shared
  • Srinivas Bangalore

    14 shared

Education

  • Ph.D., Computer and Information Sciences

    University of Pennsylvania
  • M.S., Computer Science

    University of California, San Diego

    1997
  • B.S., Computer Science

    University of California, San Diego

    1995