Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Majid Sarrafzadeh

Professor · Verified

University of California, Los Angeles · Computer Science

Active 1984–2026

h-index: 66
Citations: 15.6k
Papers: 698 (63 in the last 5 years)
Funding: $12.8M
About

Majid Sarrafzadeh is a Distinguished Professor in the Departments of Computer Science and Electrical and Computer Engineering at the UCLA Samueli School of Engineering, where he holds the Levi James Knight, Jr. Chair for Innovation Engineering. His research interests include data science, health analytics, embedded systems, and algorithm design. Dr. Sarrafzadeh earned his PhD from the University of Illinois, Urbana-Champaign in 1987. He has been a member of the National Academy of Inventors since 2021 and is an IEEE Fellow. His contributions to medical technology innovation have been recognized with the 2018 Best Innovation in Medical Technology Award and a place on TIME magazine's list of best inventions in 2020. Dr. Sarrafzadeh is also a co-founder of Bruin Biometrics and MediSens Wireless, companies working on medical devices and health monitoring technologies.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Computer Security
  • Medicine
  • Political Science
  • Human–computer interaction
  • Nursing
  • Data science
  • Internet privacy
  • Embedded system

Selected publications

  • Report for NSF Workshop on Algorithm-Hardware Co-design for Medical Applications

    ArXiv.org · 2026-03-11

    article · Open access

    This report summarizes the discussions and recommendations from the NSF Workshop on Algorithm-Hardware Co-design for Medical Applications, held on September 26-27, 2024, in Pittsburgh, PA. The workshop assembled an interdisciplinary cohort of researchers, clinicians, and industry leaders to examine foundational challenges and develop a strategic roadmap for algorithm-hardware co-design in medical computing. The workshop focuses on four thematic areas: (1) teleoperations, telehealth, and surgical operations; (2) wearable and implantable medicine, including implantable living pharmacies; (3) home ICU, hospital systems, and elderly care; and (4) medical sensing, imaging, and reconstruction. This report calls for a fundamental shift in how next-generation medical technologies are conceived, designed, validated, and translated into practice. The report recommends that NSF sustain investment in shared standardized data infrastructures and compute infrastructures, develop clinic workflow-aware systems and human-AI collaboration frameworks, promote scalable validation ecosystems grounded in objective, continuous measures, and physics-informed, and enable safe, accountable, and resilient platforms, including virtual-physical healthcare ecosystems, to de-risk translational pathways. The workshop information can be found on the website: https://sites.google.com/view/nsfworkshop.

  • Leveraging ChatGPT and Other NLP Methods for Identifying Risk and Protective Behaviors in MSM: Social Media and Dating apps Text Analysis

    arXiv (Cornell University) · 2026-01-20

    preprint · Open access

    Men who have sex with men (MSM) are at elevated risk for sexually transmitted infections and harmful drinking compared to heterosexual men. Text data collected from social media and dating applications may provide new opportunities for personalized public health interventions by enabling automatic identification of risk and protective behaviors. In this study, we evaluated whether text from social media and dating apps can be used to predict sexual risk behaviors, alcohol use, and pre-exposure prophylaxis (PrEP) uptake among MSM. With participant consent, we collected textual data and trained machine learning models using features derived from ChatGPT embeddings, BERT embeddings, LIWC, and a dictionary-based risk term approach. The models achieved strong performance in predicting monthly binge drinking and having more than five sexual partners, with F1 scores of 0.78, and moderate performance in predicting PrEP use and heavy drinking, with F1 scores of 0.64 and 0.63. These findings demonstrate that social media and dating app text data can provide valuable insights into risk and protective behaviors and highlight the potential of large language model-based methods to support scalable and personalized public health interventions for MSM.

  • Utilizing Machine Learning for Predicting PrEP Use Status Among Sexual and Gender Minority Young Adults

    Prevention Science · 2026-01-10

    article · Open access

    Pre-exposure prophylaxis (PrEP) is a highly effective biomedical prevention tool for HIV yet remains underutilized among key populations, particularly among young sexual and gender minorities (SGM). Recognizing the popularity of specific dating and social media apps among SGM young adults, we leveraged user data from these platforms to build a machine learning (ML) model that could inform targeted, data-driven interventions aimed at improving PrEP uptake and adherence. We adapted eWellness, an Android mobile app, to passively collect data from research participants capturing mobile app usage, keystroke patterns and logs, and GPS location data between 2021 and 2024. These data were used to train a ML model to predict self-reported PrEP use. Model accuracy was evaluated through F1 scores across different data types and feature combinations. Study protocols were developed in collaboration with community partners and adhered to strict ethical and privacy standards. A total of 82 SGM young adults participated, with 46 (56%) reporting PrEP use at baseline. Our machine learning model demonstrated good predictive accuracy for predicting PrEP use and non-use, achieving an F1 score of 0.84 (PrEP use) and 0.82 (non-use) outcomes when incorporating data from all mobile apps, including messaging, dating, and social media mobile apps. By contrast, predictions based solely on social media mobile app usage, language associated with sexual behavior and substance use risk, or location monitoring demonstrated worse accuracy (F1 scores of 0.79/0.75, 0.70/0.57, and 0.70/0.52, respectively). Additional feature extraction methods, as well as various combinations of these features, were also tested. However, none achieved predictive accuracy as well as the model incorporating all mobile app usage data combined. This study demonstrates the potential of machine learning to accurately predict PrEP use status among SGM young adults. 
    The findings offer a foundation for developing more personalized PrEP promotion strategies, particularly among SGM young adults who use social media and dating apps. Future research should assess the model's adaptability across diverse SGM subgroups to further inform intervention development. Registry: ClinicalTrials.gov, ID: NCT04710901, November 9, 2020.

  • Identifying Substance Use and High-Risk Sexual Behavior Among Sexual and Gender Minority Youth by Using Mobile Phone Data: Development and Validation Study

    Online Journal of Public Health Informatics · 2025-06-20 · 2 citations

    article · Open access · Senior author

    Background: Sexual and gender minority (SGM) individuals are at heightened risk for substance use and sexually transmitted infections than their non-SGM peers. Collecting mobile phone usage data passively may open new opportunities for personalizing interventions, as behavioral risks could be identified without user input. Objective: This study aimed to determine (1) whether passively sensed mobile phone data can be used to identify substance use and sexual risk behaviors for sexually transmitted infection (STI) and HIV transmission among young SGM who have sex with men, (2) which outcomes can be predicted with a high level of accuracy, and (3) which passive data sources are most predictive of these outcomes. Methods: We developed a mobile phone app to collect participants' messaging, location, and app use data and trained a machine learning model to predict risk behaviors for STI and HIV transmission. We used Scikit-learn to train logistic regression and gradient boosting classification models with simple linear model specification to predict participants' substance use and sexual behaviors (ie, condomless anal sex, number of sexual partners, and methamphetamine use), which were validated using self-report questionnaires. F1-scores were used to quantify prediction accuracy of the model using different data sources (and combinations of these sources) for prediction. Differences between text, location, app use, and Linguistic Inquiry and Word Count (LIWC) domains by outcome were investigated using independent t tests where associations were considered significant at P<.05. Results: Among participants (n=82) who identified as SGM, were sexually active, and reported recent substance use, our model was highly predictive of methamphetamine use and having ≥6 sexual partners (F1-scores as high as 0.83 and 0.69, respectively). The model was less predictive of condomless anal sex (highest F1-score 0.38). 
    Overall, text-based features were found to be most predictive, but app use and location data improved predictive accuracy, particularly for detecting ≥6 sexual partners. Methamphetamine use was significantly associated with dating app use (P=.01) and use of sex-related words (P=.002). Having ≥6 sex partners was associated with dating app use (P=.02), use of sex-related words (P=.001), and traveling a further distance from home (P=.03), on average, compared to participants with fewer sex partners. Methamphetamine users were more likely to use social (P=.002) and affect words (P=.003) and less likely to use drive-related words (P=.02). People having 6 or more partners were more likely to use social, affect words, and cognitive process-related words (P=.003 and .004 respectively). Conclusions: Our results show that passively collected mobile phone data may be useful in detecting sexual risk behaviors. Expanding data collection may improve the results further, as certain behaviors, such as injection drug use, were quite rare in the study sample. These models may be used to personalize STI and HIV prevention as well as substance use harm reduction interventions.

  • Exploring the Impact of Dataset Statistical Effect Size on Model Performance and Data Sample Size Sufficiency

    ArXiv.org · 2025-01-05 · 2 citations

    preprint · Open access · Senior author

    Having a sufficient quantity of quality data is a critical enabler of training effective machine learning models. Being able to effectively determine the adequacy of a dataset prior to training and evaluating a model's performance would be an essential tool for anyone engaged in experimental design or data collection. However, despite the need for it, the ability to prospectively assess data sufficiency remains an elusive capability. We report here on two experiments undertaken in an attempt to better ascertain whether or not basic descriptive statistical measures can be indicative of how effective a dataset will be at training a resulting model. Leveraging the effect size of our features, this work first explores whether or not a correlation exists between effect size, and resulting model performance (theorizing that the magnitude of the distinction between classes could correlate to a classifier's resulting success). We then explore whether or not the magnitude of the effect size will impact the rate of convergence of the learning curve, (theorizing again that a greater effect size may indicate that the model will converge more rapidly, and with a smaller sample size needed). Our results appear to indicate that this is not an effective heuristic for determining adequate sample size or projecting model performance, and therefore that additional work is still needed to better prospectively assess adequacy of data.

  • PRISM: A Transformer-based Language Model of Structured Clinical Event Data

    ArXiv.org · 2025-06-04

    preprint · Open access · Senior author

    We introduce PRISM (Predictive Reasoning in Sequential Medicine), a transformer-based architecture designed to model the sequential progression of clinical decision-making processes. Unlike traditional approaches that rely on isolated diagnostic classification, PRISM frames clinical trajectories as tokenized sequences of events - including diagnostic tests, laboratory results, and diagnoses - and learns to predict the most probable next steps in the patient diagnostic journey. Leveraging a large custom clinical vocabulary and an autoregressive training objective, PRISM demonstrates the ability to capture complex dependencies across longitudinal patient timelines. Experimental results show substantial improvements over random baselines in next-token prediction tasks, with generated sequences reflecting realistic diagnostic pathways, laboratory result progressions, and clinician ordering behaviors. These findings highlight the feasibility of applying generative language modeling techniques to structured medical event data, enabling applications in clinical decision support, simulation, and education. PRISM establishes a foundation for future advancements in sequence-based healthcare modeling, bridging the gap between machine learning architectures and real-world diagnostic reasoning.

  • Leveraging Large Language Models and Topic Modeling for Toxicity Classification

    2025-02-17 · 2 citations

    article · Senior author

    Content moderation and toxicity classification represent critical tasks with significant social implications. However, studies have shown that major classification models exhibit tendencies to magnify or reduce biases and potentially overlook or disadvantage certain marginalized groups within their classification processes. Researchers suggest that the positionality of annotators influences the gold standard labels in which the models learned from propagate annotators’ bias. To further investigate the impact of annotator positionality, we delve into fine-tuning BERTweet and HateBERT on the dataset while using topic-modeling strategies for content moderation. The results indicate that fine-tuning the models on specific topics results in a notable improvement in the F1 score of the models when compared to the predictions generated by other prominent classification models such as GPT-4, PerspectiveAPI, and RewireAPI. These findings further reveal that the state-of-the-art large language models exhibit significant limitations in accurately detecting and interpreting text toxicity contrasted with earlier methodologies. Code is available at https://github.com/aheldis/Toxicity-Classification.git.

  • PRISM-Consult: A Panel-of-Experts Architecture for Clinician-Aligned Diagnosis

    ArXiv.org · 2025-10-01

    preprint · Open access · Senior author

    We present PRISM-Consult, a clinician-aligned panel-of-experts architecture that extends the compact PRISM sequence model into a routed family of domain specialists. Episodes are tokenized as structured clinical events; a light-weight router reads the first few tokens and dispatches to specialist models (Cardiac-Vascular, Pulmonary, Gastro-Oesophageal, Musculoskeletal, Psychogenic). Each specialist inherits PRISM's small transformer backbone and token template, enabling parameter efficiency and interpretability. This initial study evaluates a scoped panel of five specialist families defined by high-impact ED diagnostic groups. On real-world Emergency Department cohorts, specialists exhibit smooth convergence with low development perplexities across domains, while the router achieves high routing quality and large compute savings versus consult-all under a safety-first policy. We detail the data methodology (initial vs. conclusive ICD-9 families), routing thresholds and calibration, and report per-domain results to avoid dominance by common events. The framework provides a practical path to safe, auditable, and low-latency consult at scale, and we outline validation steps (external/temporal replication, asymmetric life-threat thresholds, and multi-label arbitration) to meet prospective clinical deployment standards.

Frequent coauthors

  • Hassan Ghasemzadeh

    71 shared
  • Lorraine S. Evangelista

    University of Nevada, Las Vegas

    60 shared
  • Carol M. Mangione

    University of California, Los Angeles

    53 shared
  • Jung‐Ah Lee

    University of California, Irvine

    51 shared
  • Alison Moore

    University of California, San Diego

    50 shared
  • Marjan Motie

    University of California, Irvine

    49 shared
  • Nabil Alshurafa

    Northwestern University

    46 shared
  • Foad Dabiri

    43 shared

Education

  • Ph.D., Electrical Engineering

    University of California, Los Angeles

    1995
  • M.S., Electrical Engineering

    University of California, Los Angeles

    1992
  • B.S., Electrical Engineering

    University of Tehran

    1988

Awards & honors

  • Member, National Academy of Inventors, 2021
  • IEEE Fellow, 2020
  • TIME magazine 2020 best inventions
  • 2018 Best Innovation in Medical Technology Award
  • Public Prize for Innovation, 2018