Ruoqing Zhu

Associate Professor · Verified

University of Illinois Urbana-Champaign · Statistics

Active 2008–2026

h-index: 23
Citations: 1.4k
Papers: 111 (58 in the last 5 years)

About

Ruoqing Zhu is an Associate Professor in the Department of Statistics at the University of Illinois Urbana-Champaign, where he serves as the PhD Program Director. He completed his Ph.D. in Biostatistics at the University of North Carolina at Chapel Hill in 2013 and has since held positions including Postdoctoral Associate at Yale University and faculty roles at UIUC. His research develops statistical methodology, theory, and computational algorithms for decision-making problems, with a focus on personalized medicine and reinforcement learning. He aims to address unrealistic model assumptions, unstable performance, interpretability challenges, and the complexities arising from small sample sizes, high dimensionality, and complex data structures. His recent work emphasizes uncertainty quantification, distributional shift, causal inference, and trustworthiness, with the goal of reliable solutions for real-world applications. Zhu also has a strong interest in classical statistical learning and machine learning methods, including random forests, sufficient dimension reduction, and survival analysis, with applications in bioinformatics, infectious diseases, and nutrition studies.

Research topics

  • Biology
  • Computer Science
  • Medicine
  • Internal medicine
  • Pathology
  • Machine Learning
  • Biochemistry
  • Cancer research
  • Gastroenterology
  • Immunology
  • Cell biology
  • Econometrics
  • Mathematics
  • Psychology
  • Food science
  • Statistics
  • Bioinformatics
  • Ecology
  • Engineering
  • Oncology
  • Andrology

Selected publications

  • Rectified Fisher-Bingham Model for Compositional Data with Zeros

    arXiv (Cornell University) · 2026-04-27

    preprint · Open access · Senior author

    This paper introduces a rectified and renormalized Fisher-Bingham model for compositional data with zeros, motivated in part by the presence of zeros in microbiota studies. The approach represents compositions through a square-root transformation that maps data to the positive orthant of the unit sphere, and models them via a latent Fisher-Bingham followed by a deterministic transformation that induces exact zeros. This construction yields a coherent likelihood without requiring zero imputation or separate modeling of zero and nonzero components. Parameter estimation is performed using a Monte Carlo expectation-maximization algorithm that accommodates the latent structure. We further develop a score test for detecting structured differences in composition across groups, providing a parametric alternative to commonly used distance-based methods. Simulation studies demonstrate that the proposed method closely approximates the induced distribution and achieves higher power for detecting structured compositional changes, particularly when observations include many zero-valued components. An application to a dietary intervention study illustrates that the method identifies meaningful microbiota shifts not detected by standard approaches.

  • The Sepsis ImmunoScore Predicts Sepsis, Mortality, and Deterioration Better than Clinical Scores and Widely Available Biomarkers

    medRxiv · 2025-10-05

    preprint · Open access

    BACKGROUND Early identification of patients at risk for sepsis, mortality, and clinical deterioration is essential for improving outcomes, but existing diagnostic and predictive tools have limited accuracy. The objective was to evaluate the performance of an FDA-authorized AI tool, the Sepsis ImmunoScore, compared to widely available biomarkers and clinical tools for diagnosis of sepsis and prediction of in-hospital mortality and intensive care unit (ICU) admission. METHODS This multicenter observational study included 6,027 adult patients suspected of infection across 7 U.S. hospital sites. The Sepsis ImmunoScore’s predictive performance was compared to the sequential organ failure assessment (SOFA) score, procalcitonin (PCT), C-reactive protein (CRP), Systemic Inflammatory Response Syndrome (SIRS) score, National Early Warning Score (NEWS), and quick SOFA (qSOFA). Primary outcomes included sepsis as defined by Sepsis-3 criteria, in-hospital mortality, and ICU admission. Predictive accuracy was assessed using area under the receiver operating characteristic curve (AUC), and 95% confidence intervals were generated and hypothesis testing conducted using the bootstrap method. RESULTS The Sepsis ImmunoScore demonstrated statistically significant superior performance across all outcomes. For sepsis prediction, the Sepsis ImmunoScore achieved an AUC of 0.82, compared to SOFA (0.72), procalcitonin (PCT) (0.70), C-reactive protein (CRP) (0.61), SIRS (0.59), NEWS (0.69), and qSOFA (0.67). For in-hospital mortality prediction, the Sepsis ImmunoScore achieved an AUC of 0.80, outperforming SOFA (0.72), PCT (0.67), CRP (0.58), SIRS (0.60), NEWS (0.72), and qSOFA (0.69). For ICU admission, the Sepsis ImmunoScore reached an AUC of 0.74, superior to SOFA (0.63), PCT (0.64), CRP (0.54), SIRS (0.60), NEWS (0.70), and qSOFA (0.65). All differences between the Sepsis ImmunoScore and comparators were statistically significant. CONCLUSIONS The Sepsis ImmunoScore significantly improved predictive accuracy for sepsis, in-hospital mortality, and ICU admission compared to six conventional clinical scores and biomarkers. This AI-based tool may enhance risk stratification and clinical decision-making, potentially leading to more timely sepsis interventions and improved outcomes. KEY POINTS Question How does the FDA-authorized Sepsis ImmunoScore compare to conventional sepsis tools at diagnosing and predicting sepsis, clinical deterioration, and in-hospital mortality? Findings In a multicenter observational cohort of 6,027 patients with suspected infection, the Sepsis ImmunoScore demonstrated statistically superior performance compared to PCT, CRP, SOFA, qSOFA, SIRS, and NEWS in predicting all outcomes: sepsis diagnosis, ICU admission, and in-hospital mortality. Meaning Because the Sepsis ImmunoScore outperforms existing sepsis diagnostics, it could potentially enhance risk stratification and clinical decision-making for patients with suspected infection, enabling more appropriate and timely interventions.

  • Probabilistic exponential family inverse regression and its applications

    Biometrics · 2025-04-02

    article

    Rapid advances in high-throughput sequencing technologies have led to the fast accumulation of high-dimensional data, which is harnessed for understanding the implications of various factors on human disease and health. While dimension reduction plays an essential role in high-dimensional regression and classification, existing methods often require the predictors to be continuous, making them unsuitable for discrete data, such as presence-absence records of species in community ecology and sequencing reads in single-cell studies. To identify and estimate sufficient reductions in regressions with discrete predictors, we introduce probabilistic exponential family inverse regression (PrEFIR), assuming that, given the response and a set of latent factors, the predictors follow one-parameter exponential families. We show that the low-dimensional reductions result not only from the response variable but also from the latent factors. We further extend the latent factor modeling framework to the double exponential family by including an additional parameter to account for the dispersion. This versatile framework encompasses regressions with all categorical or a mixture of categorical and continuous predictors. We propose the method of maximum hierarchical likelihood for estimation, and develop a highly parallelizable algorithm for its computation. The effectiveness of PrEFIR is demonstrated through simulation studies and real data examples.

  • Integrating Prior Knowledge From Genome-Scale Metabolic Model With Metabolomics for Diet Assessment

    IEEE Transactions on Computational Biology and Bioinformatics · 2025-04-15

    article · Open access

    Dietary biomarker metabolite detection is frequently studied but lacks insight into underlying biomechanism and suffers empirically from small cohorts of feeding trials. Our earlier work engineered 3 novel features to integrate prior knowledge from a genome-scale metabolic model with metabolomes to suggest diet-relevant underlying metabolic reactions and subsystems and improve predictive modeling. This study extends our earlier work by inspecting the impact of using reaction and subsystem features together, the effect of prior knowledge volume on diet assessment, and the robustness of proposed features for multi-diet assessment. We also propose a new feature in this work. We notice several experimental settings perform better with reaction and subsystem features together. We see that diet assessment can improve with higher volumes of prior, but the volume often becomes irrelevant as long as some amount of prior is used. We show our features generalize well for multi-diet assessment.

  • Reinforcement Learning with Continuous Actions Under Unmeasured Confounding

    Journal of the American Statistical Association · 2025-12-02

    article · Senior author · Corresponding author

    This article addresses the challenge of offline policy learning in continuous action spaces when unmeasured confounders are present. While most existing research focuses on policy evaluation within partially observable Markov decision processes (POMDPs) and assumes discrete action spaces, we advance this field by establishing a novel identification result to enable the nonparametric estimation of policy value for a given target policy under an infinite-horizon framework. Leveraging this identification, we develop a minimax estimator and introduce a policy-gradient-based algorithm to identify the in-class optimal policy that maximizes the estimated policy value. Furthermore, we provide theoretical results regarding the consistency, finite-sample error bound, and regret bound of the resulting optimal policy. Extensive simulations and a real-world application using the German Family Panel data demonstrate the effectiveness of our proposed methodology. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.

  • Unraveling the Molecular Complexity of Myxomatous Mitral Valve Degeneration: Integrating Transcriptomic and miRNA Profiling Using Random Forests with Boruta Feature Selection

    Journal of the Heart Valve Society · 2025-04-01

    article

    Background Myxomatous mitral valve disease (MMVD) is a degenerative disorder marked by excess tissue fibrosis and matrix remodeling, leading to leaflet prolapse. While recent studies linked transforming growth factor beta (TGF-β) activation to MMVD development and progression, upstream regulators of this and other modulatory/causal pathways remain largely unexplored. Objectives In this study, we utilized high-throughput sequencing to conduct unbiased analyses of mRNA and microRNA (miRNA) levels in myxomatous and healthy mitral valve tissues, aiming to uncover novel molecular mechanisms involved in MMVD. Methods We defined differentially expressed mRNAs and miRNAs transcripts displaying a fold-change > 1.5 and P < .05. Pathway analysis was performed using Ingenuity Pathway Analysis and DAVID, with key findings validated via qRT-PCR. A total of 2378 transcripts were differentially expressed between myxomatous and normal valves. Established pathways, including TGF-β signaling, were confirmed as major contributors to MMVD. Additionally, Random Forests with Boruta Feature Selection identified transcripts with a 95% likelihood of importance, and subsequent pathway analysis on this subset of genes revealed unique signaling pathways. Results Most notably, circadian rhythm disruption emerged as a novel, highly ranked pathway in MMVD. Key miRNAs, such as miR-1, miR-133a, and miR-217 were highlighted as highly relevant, with miRNA–mRNA interactions displaying distinct molecular signatures predictive of MMVD. Conclusions Collectively, this study represents the first comprehensive analysis of both miRNA and mRNA expression in MMVD, revealing both established and novel disease-associated pathways. The discovery of circadian rhythm disturbances and new regulatory miRNAs suggests promising directions for further research and potential therapeutic targets for nonsurgical treatment strategies in patients with MMVD.

  • Predicting Cognitive Outcome Through Nutrition and Health Markers Using Supervised Machine Learning

    Journal of Nutrition · 2025-05-12 · 2 citations

    article · Open access

    BACKGROUND: Machine learning (ML) use in health research is growing, yet its application to predict cognitive outcomes using diverse health indicators is underinvestigated. OBJECTIVES: We used ML models to predict cognitive performance based on a set of health and behavioral factors, aiming to identify key contributors to cognitive function for insights into potential personalized interventions. METHODS: Data from 374 adults aged 19-82 y (227 females) were used to develop ML models predicting cognitive performance (reaction time in milliseconds) on a modified Eriksen flanker task. Features included demographics, anthropometric measures, dietary indices (Healthy Eating Index, Dietary Approaches to Stop Hypertension, Mediterranean, and Mediterranean-Dietary Approaches to Stop Hypertension Intervention for Neurodegenerative Delay), self-reported physical activity, and systolic and diastolic blood pressures. The data set was split (80:20) for training and testing. Predictive models (decision trees, random forest, AdaBoost, XGBoost, gradient boosting, linear, ridge, and lasso regression) were used with hyperparameter tuning and crossvalidation. Feature importance was calculated using permutation importance, whereas performance using mean absolute error (MAE) and mean squared error. RESULTS: [...] Age was the most significant feature (score: 0.208), followed by diastolic blood pressure (0.169), BMI (0.079), systolic blood pressure (0.069), and Healthy Eating Index (0.048). Ethnicity (0.005) and sex (0.003) had minimal predictive effect. CONCLUSIONS: Age, blood pressure, and BMI show strong associations with cognitive performance, whereas diet quality has a subtler effect. These findings highlight the potential of ML models for developing personalized interventions and preventive strategies for cognitive decline.

  • Nasal and systemic immune responses correlate with viral shedding after influenza challenge in people with complex preexisting immunity

    Science Translational Medicine · 2025-08-06 · 6 citations

    article · Open access

    Each year in the United States, ~50% of adults ≥18 years old are vaccinated against influenza viruses, with protective efficacy averaging 40.5% over the past 20 years. To model annual seasonal influenza, a cohort of 74 adults, who were unscreened for preexisting A/H1N1 immunity and half of whom were recently immunized with licensed QIV (mean of 64 days), were challenged with A/H1N1 influenza virus. Transcriptomic, proteomic, and VDJ repertoire analyses were performed on nasal and peripheral blood samples from participants to identify nasal mucosal and systemic immune responses that correlated with viral shedding and immune correlates of protection. Viral-shedding participants showed increased T cell, but not B cell, VDJ diversity with expansion of low-frequency B cell clones postchallenge, including broadly neutralizing motifs. Nonshedding participants demonstrated decreased clonality and increased richness of B and T cell VDJ clones, increased preinoculation nasal mucosal immune gene and serum protein expression, and increased ex vivo peripheral blood mononuclear cell responses. Nasal mucosal responses in participants shedding virus for 2 or more days showed higher early viral loads and exhibited stronger induction of antiviral responses compared with those in participants who shed virus for 1 day. Last, participants with a single day of viral shedding were three times more likely to be female. These data shed light on the complex immune responses in the nasal mucosa and the periphery after influenza vaccination and infection, which will be critical for next-generation vaccine development.

  • Multiplexed cytokine profiling identifies diagnostic signatures for latent tuberculosis and reactivation risk stratification

    PLoS ONE · 2025-04-09 · 4 citations

    article · Open access

    Active tuberculosis (TB) is caused by Mycobacterium tuberculosis (Mtb) bacteria and is characterized by multiple phases of infection, leading to difficulty in diagnosing and treating infected individuals. Patients with latent tuberculosis infection (LTBI) can reactivate to the active phase of infection following perturbation of the dynamic bacterial and immunological equilibrium, which can potentially lead to further Mtb transmission. However, current diagnostics often lack specificity for LTBI and do not inform on TB reactivation risk. We hypothesized that immune profiling readily available QuantiFERON-TB Gold Plus (QFT) plasma supernatant samples could improve LTBI diagnostics and infer risk of TB reactivation. We applied a whispering gallery mode, silicon photonic microring resonator biosensor platform to simultaneously quantify thirteen host proteins in QFT-stimulated plasma samples. Using machine learning algorithms, the biomarker concentrations were used to classify patients into relevant clinical bins for LTBI diagnosis or TB reactivation risk based on clinical evaluation at the time of sample collection. We report accuracies of over 90% for stratifying LTBI + from LTBI- patients and accuracies reaching over 80% for classifying LTBI + patients as being at high or low risk of reactivation. Our results suggest a strong reliance on a subset of biomarkers from the multiplexed assay, specifically IP-10 for LTBI classification and IL-10 and IL-2 for TB reactivation risk assessment. Taken together, this work introduces a 45-minute, multiplexed biomarker assay into the current TB diagnostic workflow and provides a single method capable of classifying patients by LTBI status and TB reactivation risk, which has the potential to improve diagnostic evaluations, personalize treatment and management plans, and optimize targeted preventive strategies in Mtb infections.

Frequent coauthors

  • Marvin S. Swartz

    American Society of Law, Medicine and Ethics

    51 shared
  • Alan R. Ellis

    North Carolina State University

    51 shared
  • Kristen Hassmiller Lich

    51 shared
  • Elizabeth M. La

    51 shared
  • J Morrissey

    51 shared
  • Rebecca Wells

    St George's, University of London

    49 shared
  • Wenzhuo Zhou

    20 shared
  • David J. Baer

    19 shared

Education

  • Ph.D., Biostatistics

    University of North Carolina at Chapel Hill

    2013