Murali Annavaram

Lloyd F. Hunt Chair of Electrical Power Engineering and Professor of Electrical and Computer Engineering and Computer Science

University of Southern California · Thomas Lord Department of Computer Science

Active 2001–2026

h-index: 45
Citations: 7.5k
Papers: 270 (83 in the last 5 years)
Funding: $2.3M (1 active)
About

Murali Annavaram has been a faculty member in the Ming Hsieh Department of Electrical Engineering at the University of Southern California since 2007, where he currently holds the Robert G. and Mary G. Lane Early Career Chair. His research focuses on the energy efficiency and reliability of computing platforms, with particular attention to mobile platforms, sensor management for health monitoring, and computer architecture that addresses reliability challenges in future CMOS technologies. His awards include the NSF CAREER Award (2010) and the IBM Faculty Partnership Award (2009). He was previously a senior research scientist at Intel Microprocessor Research Labs from 2001 to 2007, where he contributed to energy-efficient server design and 3D stacking architectures, and a visiting researcher at Nokia Research Center in 2007, where he worked on traffic sensing technologies. His work on Energy Per Instruction Throttling influenced Intel's Core i7 processor, and his research on Virtual-Trip-Lines laid the foundation for Nokia Traffic Works, a real-time traffic sensing product. He holds a Ph.D. in Computer Engineering from the University of Michigan, Ann Arbor, and is a Senior Member of IEEE and ACM.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Machine Learning
  • Algorithm
  • Computer engineering
  • Geography
  • Operating system

Selected publications

  • Differentially Private Retrieval-Augmented Generation

    ArXiv.org · 2026-02-16

    article · Open access · Senior author

    Retrieval-augmented generation (RAG) is a widely used framework for reducing hallucinations in large language models (LLMs) on domain-specific tasks by retrieving relevant documents from a database to support accurate responses. However, when the database contains sensitive corpora, such as medical records or legal documents, RAG poses serious privacy risks by potentially exposing private information through its outputs. Prior work has demonstrated that one can practically craft adversarial prompts that force an LLM to regurgitate the augmented contexts. A promising direction is to integrate differential privacy (DP), a privacy notion that offers strong formal guarantees, into RAG systems. However, naively applying DP mechanisms to existing systems often leads to significant utility degradation. Particularly for RAG systems, DP can reduce the usefulness of the augmented contexts, leading to an increased risk of hallucination from the LLMs. Motivated by these challenges, we present DP-KSA, a novel privacy-preserving RAG algorithm that integrates DP using the propose-test-release paradigm. DP-KSA follows from a key observation that most question-answering (QA) queries can be sufficiently answered with a few keywords. Hence, DP-KSA first obtains an ensemble of relevant contexts, each of which will be used to generate a response from an LLM. We utilize these responses to obtain the most frequent keywords in a differentially private manner. Lastly, the keywords are augmented into the prompt for the final output. This approach effectively compresses the semantic space while preserving both utility and privacy. We show that DP-KSA provides formal DP guarantees on the generated output with respect to the RAG database. We evaluate DP-KSA on two QA benchmarks using three instruction-tuned LLMs, and our empirical results demonstrate that DP-KSA achieves a strong privacy-utility tradeoff.

  • PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior

    arXiv (Cornell University) · 2026-05-12

    preprint · Open access · Senior author

    Large language models (LLMs) are increasingly used to simulate human behavior, but their ability to simulate individual privacy decisions is not well understood. In this paper, we address the problem of evaluating whether a core set of user persona attributes can drive LLMs to simulate individual-level privacy behavior. We introduce PrivacySIM, an evaluation suite that benchmarks LLM simulation of user privacy behavior against the ground-truth responses of 1,000 users. These users are drawn from five published user studies on privacy spanning LLM healthcare consultations, conversational agents, and chatbots. Drawing on these user studies, we hypothesize three persona facets as plausible predictors of privacy decision-making: demographics, previous experiences, and stated privacy attitudes. We condition nine frontier LLMs on subsets of these three facets and measure how often each model's response to a data-sharing scenario matches the user's actual response. Our findings show that (1) privacy persona conditioning consistently improves simulation quality over no-persona conditioning, but even the strongest model (40.4% accuracy) remains far from faithfully simulating individual privacy decisions; (2) a user's stated privacy attitudes alone may not be the best predictor because they often diverge from the user's actual privacy behavior; and (3) users with high AI/chatbot experience but low stated privacy attitudes are the most challenging to simulate. PrivacySIM is a first step toward understanding and improving the capabilities of LLMs to simulate user privacy decisions. We release PrivacySIM to enable further evaluation of LLM privacy simulation.

  • LRD-MPC: Efficient MPC Inference through Low-rank Decomposition

    Open MIND · 2026-02-16

    preprint · Senior author

    Secure Multi-party Computation (MPC) enables untrusted parties to jointly compute a function without revealing their inputs. Its application to machine learning (ML) has gained significant attention, particularly for secure inference services deployed across multiple cloud virtual machines (VMs), where each VM acts as an MPC party. Model providers secret-share model weights, and users secret-share inputs, ensuring that each server operates only on random shares. While MPC provides strong cryptographic guarantees, it incurs substantial computational and communication overhead. Deep neural networks rely heavily on convolutional and fully connected layers, which require costly matrix multiplications in MPC. To reduce this cost, we propose leveraging low-rank decomposition (LRD) for linear layers, replacing one large matrix multiplication with two smaller ones. However, decomposition adds overheads of its own. First, each matrix multiplication in MPC incurs a round of communication, meaning decomposing one matrix multiplication into two leads to an additional communication round. Second, the added matrix multiplication requires an additional truncation step to maintain numerical precision. Since truncation itself requires communication and computation, these overheads can offset the gains from decomposition. To address this, we introduce two complementary optimizations: truncation skipping and efficient linear layer concatenation. Truncation skipping removes the extra truncation induced by LRD, while linear layer concatenation pipelines operations to hide the additional communication round. Together, these techniques mitigate the main overheads of LRD in MPC and improve overall efficiency. Our approach is broadly applicable across MPC protocols. Experiments show up to 25% speedup in n-PC and 33% in 3-PC protocols over full-rank baselines, along with up to 52% GPU energy savings and 88% reduction in offline-phase latency.

  • SAGERec: Sampling and Gating for Enhanced Long-Tail Item Recommendations

    2026-02-16

    article · Open access · Senior author

    Recommendation systems are an integral part of daily life, influencing how people interact with and access information. The content recommended to users shapes their perceptions, making it crucial to eliminate biases that could negatively impact those perceptions. One such bias is popularity bias, which causes the long-tail item recommendation problem: systems tend to favor popular items while overlooking less popular yet relevant ones.

  • Infrastructure for Valuable, Tradable, and Verifiable Agent Memory

    arXiv (Cornell University) · 2026-03-25

    preprint · Open access · Senior author

    Every API token you spend is your accumulated wealth; once you can prove its value and the effort behind it, you can resell it. As autonomous agents repeatedly call models and tools, they accumulate memories that are your intellectual property. But today these memories remain private and non-transferable, as there is no way to validate their value. We argue that agent memory can serve as an economic commodity in the agent economy, if buyers can verify that it is authentic, effort-backed, and produced in a compatible execution context. To realize this idea, we propose clawgang, which binds memory to verifiable computational provenance, and meowtrade, a market layer for listing, transferring, and governing certified memory artifacts. Together, they transform one-shot API token spending into reusable and tradable assets, enabling timely memory transfer, reducing repeated exploration, and opening a memory trade market.

  • Meta-Learn to Unlearn: Enhanced Exact Machine Unlearning in Recommendation Systems with Meta-Learning

    Proceedings on Privacy Enhancing Technologies · 2025-07-13 · 1 citation

    article · Open access · Senior author

    Recommendation systems are used widely to recommend items such as movies, products, or news to users. The performance of a recommendation model depends on the quality of the embeddings that are associated with users and items, which are generally learned by tracking user behavior, such as their click history. Recent legislative requirements allow users to withdraw their consent to learning from some of their behaviors, even if they provided such consent initially. Once a user withdraws their consent, the models are supposed to unlearn the user behavior. This requirement has led to the emergence of machine unlearning, a research area that proposes a class of privacy policy-compliant techniques aimed at maintaining good model utility after deleting user information. Machine unlearning techniques are generally divided into two categories: exact unlearning, which may be accomplished by retraining the model from scratch after removing a data point from the training data; and approximate unlearning, which approximates the model parameters that would result from removing a specific user's data, avoiding a complete retraining of the model in order to minimize computational costs. In this work, we propose an enhanced exact machine unlearning (EEMU) strategy that leverages meta-learning to reduce the loss of recommendation performance while ensuring efficient and exact unlearnability. We demonstrate our results using four public datasets and show a significant improvement in recommendation performance over state-of-the-art baselines while preserving the privacy guarantees of exact unlearning.

Frequent coauthors

  • Salman Avestimehr

    60 shared
  • Michel Dubois

    53 shared
  • Yongqin Wang

    Wuhan University of Technology

    42 shared
  • Gunjae Koo

    Korea University

    34 shared
  • Zhifeng Lin

    Fuzhou University

    33 shared
  • Hanieh Hashemi

    31 shared
  • Hyeran Jeon

    University of California, Merced

    30 shared
  • Krishna Giri Narra

    30 shared

Education

  • Ph.D., Computer Science

    University of California, Los Angeles

    1996
  • M.S., Computer Science

    University of California, Los Angeles

    1993
  • B.S., Electrical Engineering

    Indian Institute of Technology, Madras

    1991

Awards & honors

  • NSF CAREER Award (2010)
  • Body Computing Award (2009)
  • IEEE International Conference on Distributed Computing in Se…
  • ACM Senior Membership (2009)
  • IEEE Senior Membership (2009)