Samuel Berchuck

Assistant Professor of Biostatistics & Bioinformatics · Verified

Duke University · Biostatistics and Bioinformatics

Active 2012–2025

h-index: 18
Citations: 1.0k
Papers: 78 (55 in the last 5 years)

About

Samuel I. Berchuck is an Assistant Professor in the Department of Biostatistics & Bioinformatics, within the Division of Translational Biomedical Informatics, at Duke University. He is also affiliated with Duke AI Health and is a member of the Duke Cancer Institute. His research focuses on biostatistics and bioinformatics, particularly in the context of translational biomedical informatics. Before his current position, he was an NIH K99/R00 Postdoctoral Fellow, funded by the National Eye Institute. During that fellowship he developed a program to assess and treat distress in glaucoma patients using an automated artificial intelligence algorithm derived from electronic health records (EHR). Dr. Berchuck earned his Ph.D. in Biostatistics from the University of North Carolina at Chapel Hill. Following his doctoral studies, he joined Duke University as a Postdoctoral Associate in the Department of Statistical Science, where he was a Forge Scholar and a member of the Visual Performance Lab at the Duke Eye Center.

Research topics

  • Computer Science
  • Medicine
  • Artificial Intelligence
  • Ophthalmology
  • Internal medicine
  • Optometry
  • Biomedical engineering
  • Gerontology
  • Environmental health
  • Demography

Selected publications

  • Artificial Intelligence in Triaging Patient Questions: An Evaluation of a Large Language Model for Distal Radius Fractures

    Journal of the American Academy of Orthopaedic Surgeons · 2025-08-27

    article

    INTRODUCTION: Large language models (LLMs) are promising tools for clinical decision support but require thorough validation to ensure safety and reliability. This study assessed a knowledge and intelligence messaging interface (KIMI; RevelAi Health), an LLM enhanced with retrieval-augmented generation configured with American Academy of Orthopaedic Surgeons guidelines for distal radius fracture management and a persistent system-prompt layer. The goal was to evaluate KIMI's efficacy in acuity triaging and generating appropriate patient-facing responses for distal radius fracture management. METHODS: We analyzed KIMI-generated responses to 100 simulated patient queries. Four clinical experts independently assessed responses for guideline concordance, safety, clarity, and acuity. Probabilities for adequate scoring in all domains were modeled. Bayesian mixed-effects logistic regression and ordered logistic regression models were used for binary and ordinal scoring outcomes, respectively, to account for repeated measures and within-reviewer correlations. RESULTS: Reviewer evaluations of KIMI responses demonstrated high performance across safety and quality domains. Posterior average probability of responses being rated as safe was 94.2% (95% credible interval [CI]: 91.2 to 96.9), as concordant was 88.7% (95% CI: 85.0 to 92.0), and as clear was 93.7% (95% CI: 90.5 to 96.5). Posterior average probability of exact agreement between reviewer-assigned and LLM-assigned acuity levels was 62.9% (95% CI: 58.0 to 67.7). Surgical queries were associated with slightly higher safety ratings (95.4% versus 91.3%) and acuity agreement (63.9% versus 60.6%) than nonsurgical queries. Query category markedly influenced acuity agreement. LLM-assigned acuity was markedly associated with reviewer-assigned acuity across all models even when adjusting for both query type and category (odds ratio = 2.66; 95% CI: 1.81 to 3.83). DISCUSSION: KIMI generated responses that were generally safe, clinically concordant, and clearly communicated. These findings support the feasibility of deploying enhanced LLMs for asynchronous patient engagement in low-to-moderate risk care coordination settings.

  • Patient Attitudes toward Distress Screening and Referral in Glaucoma Care

    Ophthalmology Glaucoma · 2025-09-23

    article · Open access · Senior author · Corresponding

    A mixed-methods study of 300 glaucoma patients found strong support for screening and referral for psychological distress. Greater interest among those with elevated distress or higher intraocular pressure highlights an opportunity to integrate psychosocial support into glaucoma care.

  • Validation of Existing Comorbidity Models and Development of a New Transplant-Specific Index

    American Journal of Transplantation · 2025-08-01

    article
  • Discovering Spatial Patterns of Readmission Risk Using a Bayesian Competing Risks Model with Spatially Varying Coefficients

    ArXiv.org · 2025-11-25

    preprint · Open access · Senior author

    Time-to-event models are commonly used to study associations between risk factors and disease outcomes in the setting of electronic health records (EHR). In recent years, focus has intensified on social determinants of health, highlighting the need for methods that account for patients' locations. We propose a Bayesian approach for introducing point-referenced spatial effects into a competing risks proportional hazards model. Our method leverages Gaussian process (GP) priors for spatially varying intercept and slope. To improve computational efficiency under a large number of spatial locations, we implemented a Hilbert space low-rank approximation of the GP. We modeled the baseline hazard curves as piecewise constant, and introduced a novel multiplicative gamma process prior to induce shrinkage and smoothing. A loss-based clustering method was then used on the spatial random effects to identify high-risk regions. We demonstrate the utility of this method through simulation and a real-world analysis of EHR data from Duke Hospital to study readmission risk of elderly patients with upper extremity fractures. Our results showed that the proposed method improved inference efficiency and provided valuable insights for downstream policy decisions.

  • Development of Common Data Elements for Organ Transplantation

    JAMA Network Open · 2025-04-28 · 2 citations

    article · Open access

    This cohort study examines the validity of an electronic health record data model for organ transplantation.

  • Scalable Bayesian Inference for Generalized Linear Mixed Models via Stochastic Gradient MCMC

    arXiv (Cornell University) · 2024-03-05

    preprint · Open access · 1st author · Corresponding

    The generalized linear mixed model (GLMM) is widely used for analyzing correlated data, particularly in large-scale biomedical and social science applications. Scalable Bayesian inference for GLMMs is challenging because the marginal likelihood is intractable and conventional Markov chain Monte Carlo (MCMC) methods become computationally prohibitive as the number of subjects grows. We develop a stochastic gradient MCMC (SGMCMC) algorithm tailored to GLMMs that enables accurate posterior inference in the large-sample regime. Our approach uses Fisher's identity to construct an unbiased Monte Carlo estimator of the gradient of the marginal log-likelihood, making SGMCMC feasible when direct gradient computation is impossible. We analyze the additional variability introduced by both minibatching and gradient approximation, and derive a post-hoc covariance correction that yields properly calibrated posterior uncertainty. Through simulations, we show that the proposed method provides accurate posterior means and variances, outperforming existing approaches, including control variate methods, in large-$n$ settings. We further demonstrate the method's practical utility in an analysis of electronic health records data, where accounting for variance inflation materially changes scientific conclusions.

  • Defining the learning curve for robotic pancreaticoduodenectomy for a single surgeon following experience with laparoscopic pancreaticoduodenectomy

    Journal of Robotic Surgery · 2024-03-16 · 8 citations

    article
  • Use of Predictive Models to Determine Transplant Eligibility

    Current Transplantation Reports · 2024-10-08 · 1 citation

    article · Open access · 1st author · Corresponding
  • Predictive Value of Early Autism Detection Models Based on Electronic Health Record Data Collected Before Age 1 Year

    JAMA Network Open · 2023-02-02 · 23 citations

    article · Open access

    Importance: Autism detection early in childhood is critical to ensure that autistic children and their families have access to early behavioral support. Early correlates of autism documented in electronic health records (EHRs) during routine care could allow passive, predictive model-based monitoring to improve the accuracy of early detection. Objective: To quantify the predictive value of early autism detection models based on EHR data collected before age 1 year. Design, Setting, and Participants: This retrospective diagnostic study used EHR data from children seen within the Duke University Health System before age 30 days between January 2006 and December 2020. These data were used to train and evaluate L2-regularized Cox proportional hazards models predicting later autism diagnosis based on data collected from birth up to the time of prediction (ages 30-360 days). Statistical analyses were performed between August 1, 2020, and April 1, 2022. Main Outcomes and Measures: Prediction performance was quantified in terms of sensitivity, specificity, and positive predictive value (PPV) at clinically relevant model operating thresholds. Results: Data from 45 080 children, including 924 (1.5%) meeting autism criteria, were included in this study. Model-based autism detection at age 30 days achieved 45.5% sensitivity and 23.0% PPV at 90.0% specificity. Detection by age 360 days achieved 59.8% sensitivity and 17.6% PPV at 81.5% specificity and 38.8% sensitivity and 31.0% PPV at 94.3% specificity. Conclusions and Relevance: In this diagnostic study of an autism screening test, EHR-based autism detection achieved clinically meaningful accuracy by age 30 days, improving by age 1 year. This automated approach could be integrated with caregiver surveys to improve the accuracy of early autism screening.

  • Intraocular Pressure and Rates of Macular Thinning in Glaucoma

    Ophthalmology Glaucoma · 2023-04-08 · 4 citations

    article · Open access

Frequent coauthors

  • Felipe A. Medeiros

    University of Miami

    43 shared
  • Alessandro A. Jammal

    University of Miami

    38 shared
  • Eduardo B. Mariottoni

    Universidade Federal de São Paulo

    23 shared
  • Sayan Mukherjee

    11 shared
  • Tais Estrela

    Boston Children's Hospital

    10 shared
  • Atalie C. Thompson

    9 shared
  • Swarup S. Swaminathan

    University of Miami

    9 shared
  • Joshua L. Warren

    Yale University

    7 shared