Arjun Guha

Verified

Northeastern University · Software Engineering

Active 2005–2026

h-index: 33
Citations: 3.9k
Papers: 128 (52 in the last 5 years)
Funding: $1.9M
About

Arjun Guha is an associate professor in the Khoury College of Computer Sciences at Northeastern University, based in Boston. His research focuses on programming languages, with particular interest in security and reliability problems in web programming, systems, and robotics. Guha uses tools and techniques from programming languages to address these issues, and one of his recent projects aims to make serverless computing more cost-effective, reliable, and applicable. He is a member of the Programming Research Laboratory. Prior to joining Northeastern, Guha was an associate professor at the University of Massachusetts Amherst and a postdoctoral research associate at Cornell University. His work has received several awards, including an OOPSLA Most Influential Paper Award, a PLDI Distinguished Paper Award, and a PACT Best Paper Award. In his free time, Guha enjoys running, cooking, and reading.

Research topics

  • Artificial Intelligence
  • Computer Science
  • World Wide Web
  • Machine Learning
  • Operating system
  • Programming language
  • Software engineering
  • Theoretical computer science

Selected publications

  • Learning Reasoning World Models for Parallel Code

    arXiv (Cornell University) · 2026-04-22

    preprint · Open access

    Large language models have shown remarkable ability in serial code generation, but they still struggle with parallel code for which training data is comparatively scarce. A common remedy is to use coding agents that interact with external tools, but tool calls can be costly and sometimes impractical, e.g., for partially written code. We propose Parallel-Code World Models (PCWMs), reasoning LLMs that aim to predict tool outcomes directly from parallel source code. To train PCWMs, we design a novel exploration and data generation pipeline that samples diverse parallel-coding problems and candidate implementations across multiple domains, then executes them via tools to record data races and performance profiles. From these, we synthesize hindsight reasoning traces that causally connect source code to observed tool outcomes. Fine-tuning on the resulting data yields noticeable gains, with a 7B-parameter world model improving from 64.3% to 72.8% accuracy in race-outcome prediction, while an 8B-parameter model improves in a performance profiling task from 49.3% to 58.6% accuracy. Furthermore, when open-weight models were tasked with fixing data races, world-model feedback improved their race-fixing rates relative to self-feedback by 2.7%-9.1% using our 7B-parameter world model and by 6.1%-11.1% using our 14B-parameter world model. Our results suggest that reasoning models have the potential to serve as practical substitutes for external tool calls in parallel-coding agents.

  • Steering Code LLMs with Activation Directions for Language and Library Control

    arXiv (Cornell University) · 2026-03-24

    preprint · Open access

    Code LLMs often default to particular programming languages and libraries under neutral prompts. We investigate whether these preferences are encoded as approximately linear directions in activation space that can be manipulated at inference time. Using a difference-in-means method, we estimate layer-wise steering vectors for five language/library pairs and add them to model hidden states during generation. Across three open-weight code LLMs, these interventions substantially increase generation toward the target ecosystem under neutral prompts and often remain effective even when prompts explicitly request the opposite choice. Steering strength varies by model and target, with common ecosystems easier to induce than rarer alternatives, and overly strong interventions can reduce output quality. Overall, our results suggest that code-style preferences in LLMs are partly represented by compact, steerable structure in activation space.

  • Understanding How CodeLLMs (Mis)Predict Types with Activation Steering

    2025-01-01

    article · Open access · Senior author

  • ReasoningWeekly: A General Knowledge and Verbal Reasoning Challenge for Large Language Models

    2025-01-01

    article · Open access · Senior author

    Zixuan Wu, Francesca Lucchetti, Aleksander Boruch-Gruszecki, Jingmiao Zhao, Carolyn Jane Anderson, Joydeep Biswas, Federico Cassano, Arjun Guha. Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics. 2025.

  • ReasoningWeekly: A General Knowledge and Verbal Reasoning Challenge for Large Language Models

    ArXiv.org · 2025-02-03 · 1 citation

    preprint · Open access · Senior author

    Existing benchmarks for frontier models often test specialized, "PhD-level" knowledge that is difficult for non-experts to grasp. In contrast, we present a benchmark with 613 problems based on the NPR Sunday Puzzle Challenge that requires only general knowledge. Our benchmark is challenging for both humans and models; however correct solutions are easy to verify, and models' mistakes are easy to spot. As LLMs are more widely deployed in society, we believe it is useful to develop benchmarks for frontier models that humans can understand without the need for deep domain expertise. Our work reveals capability gaps that are not evident in existing benchmarks: OpenAI o1 significantly outperforms other reasoning models on our benchmark, despite being on par with other models when tested on benchmarks that test specialized knowledge. Furthermore, our analysis of reasoning outputs uncovers new kinds of failures. DeepSeek R1, for instance, often concedes with "I give up" before providing an answer that it knows is wrong. R1 can also be remarkably "uncertain" in its output and in rare cases, it does not "finish thinking," which suggests the need for techniques to "wrap up" before the context window limit is reached. We also quantify the effectiveness of reasoning longer to identify the point beyond which more reasoning is unlikely to improve accuracy on our benchmark.

  • Bridging the Gap Between Binary and Source Based Package Management in Spack

    2025-11-12 · 1 citation

    article · Open access

    Binary package managers install software quickly but they limit configurability due to rigid ABI requirements that ensure compatibility between binaries. Source package managers provide flexibility in building software, but compilation can be slow. For example, installing an HPC code with a new MPI implementation may result in a full rebuild. Spack, a widely deployed, HPC-focused package manager, can use source and pre-compiled binaries, but lacks a binary compatibility model, so it cannot mix binaries not built together. We present splicing, an extension to Spack that models binary compatibility between packages and allows seamless mixing of source and binary distributions. Splicing augments Spack’s packaging language and dependency resolution engine to reuse compatible binaries but maintains the flexibility of source builds. It incurs minimal installation-time overhead and allows rapid installation from binaries, even for ABI-sensitive dependencies like MPI that would otherwise require many rebuilds.

  • Substance Beats Style: Why Beginning Students Fail to Code with LLMs

    2025-01-01 · 2 citations

    article · Open access

    Francesca Lucchetti, Zixuan Wu, Arjun Guha, Molly Q Feldman, Carolyn Jane Anderson. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2025.

  • Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions

    Lecture Notes in Computer Science · 2025-11-23 · 1 citation

    book chapter · Open access
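
The steering abstract above mentions a difference-in-means method for estimating steering vectors. As a rough illustration only, the general idea can be sketched as follows. This is not the authors' code: the function names, the toy 3-dimensional "activations", and the `strength` parameter are all assumptions made for this example, and real hidden states would come from a model and have far more dimensions.

```python
from statistics import fmean


def difference_in_means(target_acts, baseline_acts):
    """Return mean(target) - mean(baseline), computed per dimension."""
    dims = range(len(target_acts[0]))
    return [
        fmean(a[d] for a in target_acts) - fmean(a[d] for a in baseline_acts)
        for d in dims
    ]


def steer(hidden_state, direction, strength=1.0):
    """Add the scaled steering direction to one hidden-state vector."""
    return [h + strength * v for h, v in zip(hidden_state, direction)]


# Hypothetical toy activations standing in for one layer's hidden states.
target_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]    # prompts in the target ecosystem
baseline_acts = [[0.0, 2.0, 0.0], [0.0, 2.0, 2.0]]  # prompts in the default ecosystem

direction = difference_in_means(target_acts, baseline_acts)
steered = steer([0.5, 0.5, 0.5], direction, strength=2.0)

print(direction)  # [2.0, 0.0, -1.0]
print(steered)    # [4.5, 0.5, -1.5]
```

The `strength` knob mirrors the abstract's observation that overly strong interventions can reduce output quality: a larger value pushes the hidden state further along the direction.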

Labs

  • Khoury College of Computer Sciences · PI

Education

  • Ph.D., Computer Science

    University of California, Los Angeles

    2007
  • M.S., Computer Science

    University of California, Los Angeles

    2003
  • B.S., Computer Science

    University of California, Los Angeles

    2001

Awards & honors

  • OOPSLA Most Influential Paper Award
  • PLDI Distinguished Paper Award
  • PACT Best Paper Award
  • Distinguished Paper Award (2019)