Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Marzyeh Ghassemi

Marzyeh Ghassemi

Verified

Massachusetts Institute of Technology · Electrical Engineering & Computer Science

Active 2011–2024

h-index60
Citations13.9k
Papers368278 last 5y
Funding
See your match with Marzyeh Ghassemi — sign in to PhdFit.Sign in

Research topics

  • Computer Science
  • Political Science
  • Artificial Intelligence
  • Computer Security
  • Psychology
  • Medicine
  • Machine Learning
  • Pathology
  • Data science
  • Engineering
  • Public relations
  • Medical education
  • Knowledge management
  • Law
  • Applied psychology
  • Business
  • Medical physics
  • Psychiatry
  • Risk analysis (engineering)
  • Social psychology
  • Mathematics

Selected publications

  • Peer review of GPT-4 technical report and systems card

    PLOS Digital Health · 2024 · 91 citations

    • Political Science
    • Computer Science
    • Computer Security

    The study provides a comprehensive review of OpenAI's Generative Pre-trained Transformer 4 (GPT-4) technical report, with an emphasis on applications in high-risk settings like healthcare. A diverse team, including experts in artificial intelligence (AI), natural language processing, public health, law, policy, social science, healthcare research, and bioethics, analyzed the report against established peer review guidelines. The GPT-4 report shows a significant commitment to transparent AI research, particularly in creating a systems card for risk assessment and mitigation. However, it reveals limitations such as restricted access to training data, inadequate confidence and uncertainty estimations, and concerns over privacy and intellectual property rights. Key strengths identified include the considerable time and economic investment in transparent AI research and the creation of a comprehensive systems card. On the other hand, the lack of clarity in training processes and data raises concerns about encoded biases and interests in GPT-4. The report also lacks confidence and uncertainty estimations, crucial in high-risk areas like healthcare, and fails to address potential privacy and intellectual property issues. Furthermore, this study emphasizes the need for diverse, global involvement in developing and evaluating large language models (LLMs) to ensure broad societal benefits and mitigate risks. The paper presents recommendations such as improving data transparency, developing accountability frameworks, establishing confidence standards for LLM outputs in high-risk settings, and enhancing industry research review processes. It concludes that while GPT-4's report is a step towards open discussions on LLMs, more extensive interdisciplinary reviews are essential for addressing bias, harm, and risk concerns, especially in high-risk domains. The review aims to expand the understanding of LLMs in general and highlights the need for new reflection forms on how LLMs are reviewed, the data required for effective evaluation, and addressing critical issues like bias and risk.

  • Considering Biased Data as Informative Artifacts in AI-Assisted Health Care

    New England Journal of Medicine · 2023 · 106 citations

    Senior authorCorresponding
    • Computer Science
    • Artificial Intelligence
    • Psychology

    Medical AI tools that are trained on skewed data may exhibit bias. The authors discuss how thinking of clinical data as artifacts can identify values, practices, and patterns of inequity in health care.

  • Mitigating the impact of biased artificial intelligence in emergency decision-making

    Communications Medicine · 2022 · 75 citations

    Senior authorCorresponding
    • Artificial Intelligence
    • Computer Science
    • Psychology

    BACKGROUND: Prior research has shown that artificial intelligence (AI) systems often encode biases against minority subgroups. However, little work has focused on ways to mitigate the harm discriminatory algorithms can cause in high-stakes settings such as medicine. METHODS: In this study, we experimentally evaluated the impact biased AI recommendations have on emergency decisions, where participants respond to mental health crises by calling for either medical or police assistance. We recruited 438 clinicians and 516 non-experts to participate in our web-based experiment. We evaluated participant decision-making with and without advice from biased and unbiased AI systems. We also varied the style of the AI advice, framing it either as prescriptive recommendations or descriptive flags. RESULTS: Participant decisions are unbiased without AI advice. However, both clinicians and non-experts are influenced by prescriptive recommendations from a biased algorithm, choosing police help more often in emergencies involving African-American or Muslim men. Crucially, using descriptive flags rather than prescriptive recommendations allows respondents to retain their original, unbiased decision-making. CONCLUSIONS: Our work demonstrates the practical danger of using biased models in health contexts, and suggests that appropriately framing decision support can mitigate the effects of AI bias. These findings must be carefully considered in the many real-world clinical scenarios where inaccurate or biased models may be used to inform important decisions.

  • AI recognition of patient race in medical imaging: a modelling study

    The Lancet Digital Health · 2022 · 497 citations

    • Artificial Intelligence
    • Machine Learning
    • Medicine

    BACKGROUND: Previous studies in medical imaging have shown disparate abilities of artificial intelligence (AI) to detect a person's race, yet there is no known correlation for race on medical imaging that would be obvious to human experts when interpreting the images. We aimed to conduct a comprehensive evaluation of the ability of AI to recognise a patient's racial identity from medical images. METHODS: Using private (Emory CXR, Emory Chest CT, Emory Cervical Spine, and Emory Mammogram) and public (MIMIC-CXR, CheXpert, National Lung Cancer Screening Trial, RSNA Pulmonary Embolism CT, and Digital Hand Atlas) datasets, we evaluated, first, performance quantification of deep learning models in detecting race from medical images, including the ability of these models to generalise to external environments and across multiple imaging modalities. Second, we assessed possible confounding of anatomic and phenotypic population features by assessing the ability of these hypothesised confounders to detect race in isolation using regression models, and by re-evaluating the deep learning models by testing them on datasets stratified by these hypothesised confounding variables. Last, by exploring the effect of image corruptions on model performance, we investigated the underlying mechanism by which AI models can recognise race. FINDINGS: In our study, we show that standard AI deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities, which was sustained under external validation conditions (x-ray imaging [area under the receiver operating characteristics curve (AUC) range 0·91-0·99], CT chest imaging [0·87-0·96], and mammography [0·81]). We also showed that this detection is not due to proxies or imaging-related surrogate covariates for race (eg, performance of possible confounders: body-mass index [AUC 0·55], disease distribution [0·61], and breast density [0·61]). Finally, we provide evidence to show that the ability of AI deep learning models persisted over all anatomical regions and frequency spectrums of the images, suggesting the efforts to control this behaviour when it is undesirable will be challenging and demand further study. INTERPRETATION: The results from our study emphasise that the ability of AI deep learning models to predict self-reported race is itself not the issue of importance. However, our finding that AI can accurately predict self-reported race, even from corrupted, cropped, and noised medical images, often when clinical experts cannot, creates an enormous risk for all model deployments in medical imaging. FUNDING: National Institute of Biomedical Imaging and Bioengineering, MIDRC grant of National Institutes of Health, US National Science Foundation, National Library of Medicine of the National Institutes of Health, and Taiwan Ministry of Science and Technology.

  • Do as AI say: susceptibility in deployment of clinical decision-aids

    npj Digital Medicine · 2021 · 420 citations

    Senior authorCorresponding
    • Computer Science
    • Psychology
    • Medicine

    Artificial intelligence (AI) models for decision support have been developed for clinical settings such as radiology, but little work evaluates the potential impact of such systems. In this study, physicians received chest X-rays and diagnostic advice, some of which was inaccurate, and were asked to evaluate advice quality and make diagnoses. All advice was generated by human experts, but some was labeled as coming from an AI system. As a group, radiologists rated advice as lower quality when it appeared to come from an AI system; physicians with less task-expertise did not. Diagnostic accuracy was significantly worse when participants received inaccurate advice, regardless of the purported source. This work raises important considerations for how advice, AI and non-AI, should be deployed in clinical environments.

  • The role of machine learning in clinical research: transforming the future of evidence generation

    Trials · 2021 · 277 citations

    Senior authorCorresponding
    • Political Science
    • Computer Science
    • Medicine

    BACKGROUND: Interest in the application of machine learning (ML) to the design, conduct, and analysis of clinical trials has grown, but the evidence base for such applications has not been surveyed. This manuscript reviews the proceedings of a multi-stakeholder conference to discuss the current and future state of ML for clinical research. Key areas of clinical trial methodology in which ML holds particular promise and priority areas for further investigation are presented alongside a narrative review of evidence supporting the use of ML across the clinical trial spectrum. RESULTS: Conference attendees included stakeholders, such as biomedical and ML researchers, representatives from the US Food and Drug Administration (FDA), artificial intelligence technology and data analytics companies, non-profit organizations, patient advocacy groups, and pharmaceutical companies. ML contributions to clinical research were highlighted in the pre-trial phase, cohort selection and participant management, and data collection and analysis. A particular focus was paid to the operational and philosophical barriers to ML in clinical research. Peer-reviewed evidence was noted to be lacking in several areas. CONCLUSIONS: ML holds great promise for improving the efficiency and quality of clinical research, but substantial barriers remain, the surmounting of which will require addressing significant gaps in evidence.

  • The false hope of current approaches to explainable artificial intelligence in health care

    The Lancet Digital Health · 2021 · 1280 citations

    1st authorCorresponding
    • Computer Science
    • Artificial Intelligence
    • Computer Science

    The black-box nature of current artificial intelligence (AI) has caused some to question whether AI must be explainable to be used in high-stakes scenarios such as medicine. It has been argued that explainable AI will engender trust with the health-care workforce, provide transparency into the AI decision making process, and potentially mitigate various kinds of bias. In this Viewpoint, we argue that this argument represents a false hope for explainable AI and that current explainability methods are unlikely to achieve these goals for patient-level decision support. We provide an overview of current explainability techniques and highlight how various failure cases can cause problems for decision making for individual patients. In the absence of suitable explainability methods, we advocate for rigorous internal and external validation of AI models as a more direct means of achieving the goals often associated with explainability, and we caution against having explainability be a requirement for clinically deployed models.

  • Reproducibility in machine learning for health research: Still a ways to go

    Science Translational Medicine · 2021 · 253 citations

    Senior authorCorresponding
    • Machine Learning
    • Computer Science
    • Artificial Intelligence

    Machine learning for health must be reproducible to ensure reliable clinical use. We evaluated 511 scientific papers across several machine learning subfields and found that machine learning for health compared poorly to other areas regarding reproducibility metrics, such as dataset and code accessibility. We propose recommendations to address this problem.

Frequent coauthors

  • Laura C. Rosella

    Public Health Ontario

    211 shared
  • Amol A. Verma

    194 shared
  • Michael Fralick

    191 shared
  • Timothy C. Y. Chan

    University of Toronto

    185 shared
  • Fahad Razak

    St. Michael's Hospital

    185 shared
  • Angela M. Cheung

    University Health Network

    185 shared
  • Adina Weinerman

    Health Sciences Centre

    185 shared
  • Shail Rawal

    University Health Network

    183 shared

Education

  • PhD, EECS

    Massachusetts Institute of Technology

    2017
  • MsC by Research, Biomedical Engineering

    University of Oxford

    2011
  • BSEE, Electrical Engineering

    New Mexico State University

    2005
  • BSCS, Computer Science

    New Mexico State University

    2005
  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Marzyeh Ghassemi

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup