Douglas Downey

Professor of Computer Science · Verified

Northwestern University · Chemical Engineering

Active 1997–2026

h-index: 31
Citations: 8.2k
Papers: 178 (109 in the last 5 years)
Funding: $1.2M (1 active)

About

Douglas Downey is a Professor of Computer Science at Northwestern University, affiliated with the Master of Science in Artificial Intelligence and Master of Science in Robotics programs. His research focuses on natural language processing, machine learning, and artificial intelligence, with particular interest in the automatic construction of useful knowledge bases from Web text. He aims to develop techniques and prototypes that extend the state of the art in Web search and to establish a formal basis for learning from unstructured text without relying on hand-labeled data. Downey also works on ways to utilize human input more effectively in machine learning, exploring methods such as active learning and semi-supervised learning to improve the efficiency and effectiveness of machine learning systems.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Natural Language Processing
  • Machine Learning
  • Information Retrieval
  • Psychology
  • Mathematics
  • Engineering
  • Human–computer interaction

Selected publications

  • Omakase: proactive assistance with actionable suggestions for evolving scientific research projects

    arXiv (Cornell University) · 2026-04-10

    article · Open access

    As AI agents become increasingly capable of complex knowledge tasks, the lack of context limits their capability to proactively reason about a user's latent needs throughout a long evolving project. In scientific research, many researchers still manually query a deep research system and compress their rich project contexts into short, targeted queries. Further, a deep research system produces exhaustive reports, making it difficult to identify concrete actions. To explore the opportunities of research assistants that are proactive throughout a research project, we conducted several studies (N=42) with a technology probe and an iterative prototype. The latest iteration of our system, Omakase, is a research assistant that monitors a user's project documents to infer timely queries to a deep research system. Omakase then distills long reports into suggestions contextualized to their evolving projects. Our evaluations showed that participants found the generated queries to be useful and timely, and rated Omakase's suggestions as significantly more actionable than the original reports.

  • SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks

    arXiv (Cornell University) · 2025-07-01

    preprint · Open access

    We present SciArena, an open and collaborative platform for evaluating foundation models on scientific literature-grounded tasks. Unlike traditional benchmarks for scientific literature understanding and synthesis, SciArena engages the research community directly, following the Chatbot Arena evaluation approach of community voting on model comparisons. By leveraging collective intelligence, SciArena offers a community-driven evaluation of model performance on open-ended scientific tasks that demand literature-grounded, long-form responses. The platform currently supports 47 foundation models and has collected over 20,000 votes from human researchers across diverse scientific domains. Our analysis of the data collected so far confirms its high quality. We discuss the results and insights based on the model ranking leaderboard. To further promote research in building model-based automated evaluation systems for literature tasks, we release SciArena-Eval, a meta-evaluation benchmark based on collected preference data. It measures the accuracy of models in judging answer quality by comparing their pairwise assessments with human votes. Our experiments highlight the benchmark's challenges and emphasize the need for more reliable automated evaluation methods.

  • SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

    2025-01-01 · 2 citations

    article · Open access

    David Wadden, Kejian Shi, Jacob Morrison, Alan Li, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.

  • Ai2 Scholar QA: Organized Literature Synthesis with Attribution

    ArXiv.org · 2025-04-15

    preprint · Open access

    Retrieval-augmented generation is increasingly effective in answering scientific questions from literature, but many state-of-the-art systems are expensive and closed-source. We introduce Ai2 Scholar QA, a free online scientific question answering application. To facilitate research, we make our entire pipeline public: as a customizable open-source Python package and interactive web app, along with paper indexes accessible through public APIs and downloadable datasets. We describe our system in detail and present experiments analyzing its key design decisions. In an evaluation on a recent scientific QA benchmark, we find that Ai2 Scholar QA outperforms competing systems.

  • Intent-Aware Schema Generation And Refinement For Literature Review Tables

    ArXiv.org · 2025-07-18

    preprint · Open access

    The increasing volume of academic literature makes it essential for researchers to organize, compare, and contrast collections of documents. Large language models (LLMs) can support this process by generating schemas defining shared aspects along which to compare papers. However, progress on schema generation has been slow due to: (i) ambiguity in reference-based evaluations, and (ii) lack of editing/refinement methods. Our work is the first to address both issues. First, we present an approach for augmenting unannotated table corpora with synthesized intents, and apply it to create a dataset for studying schema generation conditioned on a given information need, thus reducing ambiguity. With this dataset, we show how incorporating table intents significantly improves baseline performance in reconstructing reference schemas. We start by comprehensively benchmarking several single-shot schema generation methods, including prompted LLM workflows and fine-tuned models, showing that smaller, open-weight models can be fine-tuned to be competitive with state-of-the-art prompted LLMs. Next, we propose several LLM-based schema refinement techniques and show that these can further improve schemas generated by these methods.

  • Ai2 Scholar QA: Organized Literature Synthesis with Attribution

    2025-01-01 · 6 citations

    article · Open access

    Amanpreet Singh, Joseph Chee Chang, Dany Haddad, Aakanksha Naik, Jena D. Hwang, Rodney Kinney, Daniel S Weld, Doug Downey, Sergey Feldman. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations). 2025.

  • Front Matter

    2025-01-01

    paratext · Open access

    Peter Jansen, Bhavana Dalvi Mishra, Harsh Trivedi, Bodhisattwa Prasad Majumder, Tom Hope, Tushar Khot, Doug Downey, Eric Horvitz. Proceedings of the 1st Workshop on AI and Scientific Discovery: Directions and Opportunities. 2025.

  • Preference Learning from Physics-Based Feedback: Tuning Language Models to Design BCC/B2 Superalloys

    ArXiv.org · 2025-11-15

    preprint · Open access

    We apply preference learning to the task of language model-guided design of novel structural alloys. In contrast to prior work that focuses on generating stable inorganic crystals, our approach targets the synthesizeability of a specific structural class: BCC/B2 superalloys, an underexplored family of materials with potential applications in extreme environments. Using three open-weight models (LLaMA-3.1, Gemma-2, and OLMo-2), we demonstrate that language models can be optimized for multiple design objectives using a single, unified reward signal through Direct Preference Optimization (DPO). Unlike prior approaches that rely on heuristic or human-in-the-loop feedback (costly), our reward signal is derived from thermodynamic phase calculations, offering a scientifically grounded criterion for model tuning. To our knowledge, this is the first demonstration of preference-tuning a language model using physics-grounded feedback for structural alloy design. The resulting framework is general and extensible, providing a path forward for intelligent design-space exploration across a range of physical science domains.

  • Intent-aware Schema Generation and Refinement for Literature Review Tables

    2025-01-01 · 1 citation

    article · Open access

    The increasing volume of academic literature makes it essential for researchers to organize, compare, and contrast collections of documents. Large language models (LLMs) can support this process by generating schemas defining shared aspects along which to compare papers. However, progress on schema generation has been slow due to: (i) ambiguity in reference-based evaluations, and (ii) lack of editing/refinement methods. Our work is the first to address both issues. First, we present an approach for augmenting unannotated table corpora with synthesized intents, and apply it to create a dataset for studying schema generation conditioned on a given information need, thus reducing ambiguity. With this dataset, we show how incorporating table intents significantly improves baseline performance in reconstructing reference schemas. We start by comprehensively benchmarking several single-shot schema generation methods, including prompted LLM workflows and fine-tuned models, showing that smaller, open-weight models can be fine-tuned to be competitive with state-of-the-art prompted LLMs. Next, we propose several LLM-based schema refinement techniques and show that these can further improve schemas generated by these methods.

Frequent coauthors

  • Daniel S. Weld

    Allen Institute

    89 shared
  • Kyle Lo

    64 shared
  • Zejiang Shen

    Massachusetts Institute of Technology

    50 shared
  • Bailey Kuehl

    41 shared
  • Erin Bransom

    34 shared
  • Luca Soldaini

    33 shared
  • Amanpreet Singh

    32 shared
  • Chandra Bhagavatula

    Allen Institute

    30 shared