Ramkiran Gouripeddi

Research Assistant Professor · Verified

University of Utah · Biomedical Informatics

Active 2009–2026

h-index: 16

Citations: 1.5k

Papers: 121 (66 in the last 5 years)

Funding: $7.1M

About

Dr. Ramkiran Gouripeddi is an Assistant Professor in the Department of Biomedical Informatics at the University of Utah and a Senior Biomedical Informatics Scientist at the Center for Clinical and Translational Science. He earned his MS from Arizona State University in 2009 and his medical degree from MGR Medical University in India. His research interests encompass clinical and clinical research informatics, with a focus on understanding the requirements of the clinical research community and developing tools to enable, accelerate, and scale clinical research. His work particularly involves informatics methods for comparative effectiveness research, health-services research, machine learning, data mining for knowledge discovery and personalized medicine, biomedical data modeling, and biomedical terminologies and ontologies. Dr. Gouripeddi has played a significant role in extending and deploying the FURTHeR platform for comparative effectiveness research and has led multiple projects involving data integration, metadata discovery, and recruitment streamlining for biomedical studies. He is the R&D lead for the OpenFurther project, an open-source informatics solution for biomedical data integration and federation, and has contributed to projects such as bioCADDIE, PHIS+, and the PaTH PCORnet Clinical Data Research Network. With extensive experience in clinical machine learning and a background as a practicing medical doctor, he is well-qualified to support large-scale data projects and public health research, working collaboratively with key personnel to advance data integration for public health surveillance and research.

Research topics

Medicine
Computer Science
Internal medicine
Virology
Biology
Sociology
Political Science
Demography
Intensive care medicine
Pathology

Selected publications

Reliable Uncertainty Under Class Imbalance and Distribution Shift: Class-Conditional Conformal Prediction of Multiple Sclerosis
medRxiv · 2026-05-15
article
Objectives: To evaluate whether class-conditional conformal prediction (CP) can provide reliable uncertainty quantification (UQ) under severe class imbalance and distribution shift, using multiple sclerosis (MS) diagnosis from magnetic resonance imaging (MRI) as a clinical exemplar. Methods: We evaluated marginal and class-conditional CP using 720 T2-weighted MRI scans (142 MS, 578 controls). A convolutional neural network trained on 3 T data was evaluated under distribution shift (1.5 T acquisitions and synthetic image degradations). Through 100 Monte Carlo experiments, we assessed coverage guarantees, class-specific performance, and relationships between calibration set size, coverage variance, and uncertainty. Results: Marginal CP severely under-covered the minority MS class (16.9% mean coverage at 1.5 T vs. 95.2% for controls) despite valid population-level guarantees. Class-conditional CP dramatically improved MS coverage to 77.5% at 1.5 T and 85.8% at 3 T, significantly reducing severe undercoverage (<80%) frequency while maintaining >89% control coverage. Minority class coverage variance increased due to limited calibration samples, matching theoretical Beta-binomial predictions. CP maintained validity under distribution shift; prediction set sizes scaled monotonically with shift severity, yielding clinically interpretable UQ. Conclusions: Class-conditional CP successfully mitigates systematic undercoverage of minority disease classes while maintaining validity under distribution shift. The approach offers a practical, model-agnostic solution for uncertainty quantification applicable across clinical AI systems, though increased coverage variance for less represented conditions reflects fundamental statistical constraints. By characterizing these variance trade-offs, this framework enables more reliable deployment of diagnostic AI in heterogeneous clinical environments across diverse medical domains where minority disease class detection is critical.
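The class-conditional idea in this abstract can be illustrated with a minimal split-conformal sketch. It is not the authors' implementation: the nonconformity score (one minus the predicted probability of the true class), the function names `class_thresholds` and `prediction_set`, and the toy probabilities are all assumptions made for illustration. The point it shows is that each class gets its own threshold from its own calibration examples, so a small minority class is not swamped by the majority class.

```python
import math


def class_thresholds(cal_probs, cal_labels, alpha):
    """Per-class conformal thresholds from a held-out calibration set.

    cal_probs  : list of per-class probability lists from any classifier
    cal_labels : list of true integer labels
    alpha      : target miscoverage (0.1 aims for 90% coverage in each class)
    """
    scores_by_class = {}
    for probs, label in zip(cal_probs, cal_labels):
        # Assumed nonconformity score: 1 - probability given to the true class.
        scores_by_class.setdefault(label, []).append(1.0 - probs[label])

    thresholds = {}
    for label, scores in scores_by_class.items():
        scores.sort()
        n = len(scores)
        # Finite-sample rank used by split conformal prediction.
        rank = math.ceil((n + 1) * (1 - alpha))
        # With too few calibration points the threshold is infinite,
        # meaning the class is always kept in the prediction set.
        thresholds[label] = scores[rank - 1] if rank <= n else float("inf")
    return thresholds


def prediction_set(probs, thresholds):
    """Return every class whose score falls within that class's threshold."""
    return [label for label, q in sorted(thresholds.items())
            if 1.0 - probs[label] <= q]


if __name__ == "__main__":
    # Toy calibration data (hypothetical): three controls (0), one case (1).
    cal_probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
    cal_labels = [0, 0, 0, 1]
    thresholds = class_thresholds(cal_probs, cal_labels, alpha=0.25)
    print(thresholds)
    print(prediction_set([0.5, 0.5], thresholds))
```

In the toy data the single minority-class calibration example is too few to set a finite threshold, so class 1 is always included. This is a small-scale picture of the variance trade-off the abstract attributes to limited calibration samples.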
Publisher DOI
Development of Prediction Models for Mortality and Debility Among Critically Ill Adults Receiving High-flow Nasal Cannula Therapy
SSRN Electronic Journal · 2026-01-01
preprint · Open access
Publisher DOI
Unlocking the Power of Data Harmonization in Environmental Health Sciences: A Comprehensive Exploration of Significance, Use Cases, and Recommendations for Standardization Efforts
Environmental Health Perspectives · 2025-06-06 · 4 citations
review · Open access
BACKGROUND: The field of environmental health sciences increasingly demands comprehensive and diverse datasets, particularly in response to emerging research areas such as climate change, mixtures, and exposomics. The data needed to address the complexity of environmental health research questions often extend beyond the boundaries of a single study or data resource. Traditional data management approaches struggle to harmonize the ever-expanding and heterogeneous data sources needed for research in the environmental health sciences. Harmonization may help address this issue as it involves aligning and standardizing various elements of data to allow comprehensive analysis, data pooling and interpretation across studies. OBJECTIVES: The primary objective is to inform researchers about the transformative potential of embracing harmonization methodologies and to motivate contributions to ongoing efforts, thereby fostering advancements. METHODS: Using the Environmental Health Language Collaborative's Data Harmonization Use Case, we provide a practical illustration of existing data harmonization approaches, identify gaps, and emphasize future research and application directions. We selected two publicly available environmental epidemiology studies on the topic of childhood asthma and three studies on the topic of biomarkers of metals exposure during pregnancy and birth outcomes and applied several existing harmonization approaches to assess interoperability. DISCUSSION: Our process revealed the potential limitations of many existing harmonization approaches, with notable failures to identify common variables across independent datasets and lack of agreement between human and computer-based approaches. This use case identified various challenges with existing approaches, including reliance on often incomplete data documentation and large amounts of manual effort. 
To address these challenges, we recommend the continued advancement and dissemination of community data standards, the development of software and tools to facilitate harmonization through automation, and strategic efforts to promote engagement in data harmonization within the environmental health sciences community. Collaborative science is needed to advance our understanding of environmental contributors to health, and realizing the harmonization potential of our scientific data is a step toward improved collaboration. https://doi.org/10.1289/EHP15410.
Publisher DOI
363 The art and science of data navigation for translational research
Journal of Clinical and Translational Science · 2025-03-25
article · Open access · 1st author · Corresponding
Objectives/Goals: Translational researchers spend significant amounts of time finding available datasets and other research data resources for their purposes. The objectives of this program are to develop and evaluate a multipronged approach to supporting researchers with existing data resources. Methods/Study Population: We established a dedicated service with expertise in data resources to increase awareness, understanding, and utilization of existing data resources. This program assists investigators and trainees in discovering appropriate data resources and formulating scientific problems in computable formats, advises on state-of-the-art data analytics and data management, builds collaborations, mentors data users, and develops a service pipeline for streamlined data resource project management. This is accomplished through these essential functions: (1) discover, catalog, document, and manage metadata resources, (2) train and present data resources to the research community, (3) provide individual consultations, and (4) explore and assess novel data resources. Results/Anticipated Results: In a phased approach, the data navigation program is performing outreach to the research community and integrating with existing data efforts on campus, presenting and demonstrating existing data resources, establishing a consultation service, and building core competencies for long-term usage and navigation of resources across campus. Monthly evaluation of the program has shown increases in several commitment and engagement metrics, including the number of requests for access to data resources, consultations, publications and presentations, co-authorships, and proposals. Unawareness and inappropriate use of data resources lead to delays in performing research and potentially unnecessary duplication of effort. Discussion/Significance of Impact: Our data navigation program has increased use of data resources in research.
Next steps are to continue evaluation and further streamline informatics approaches to data discovery, abstraction, formulation, and analysis. Harmonized data resource programs are an important translational science approach to foster the next generation of research.
Publisher OA PDF DOI
The Consistency of Hypercapnic Respiratory Failure Case Definitions in Electronic Health Record Data
CHEST Journal · 2025-08-29
article · Open access
Publisher OA PDF DOI
Structural Homology and Electrostatic Potential Comparisons of Epitope Pair Candidates for Molecular Mimicry Triggering of Type 1 Diabetes Mellitus
bioRxiv (Cold Spring Harbor Laboratory) · 2025-09-04
preprint · Open access
Background: Molecular mimicry, where foreign and self-peptides contain similar epitopes, can induce autoimmune responses. Identifying potential molecular mimics and studying their properties is key to understanding the onset of autoimmune diseases such as type 1 diabetes mellitus (T1DM). Previous work identified pairs of infectious epitopes (E_INF) and T1DM epitopes (E_T1D) that demonstrated sequence homology; however, structural homology was not considered. Correlating sequence homology with structural properties is important for translational investigation of potential molecular mimics. This work compares sequence homology with structural homology by calculating the structures and electrostatic potential surfaces of the epitope pairs identified in previous work from our laboratory. Results: For each pair of E_INF and E_T1D, the root mean square deviations (RMSD) were calculated between their predicted structures and their electrostatic potentials. Structures were predicted using the AlphaFold software program. Of the 53 epitope pairs considered here, only 10 did not exhibit any matching (i.e. less than 3 residues overlap). When considering all residues the RMSD ranges from 0.33 Å to 11.66 Å with an average of 2.68 Å. Twenty-two pairs (42%) have RMSD of less than 1.5 Å and 30 (58%) less than 3 Å. Conclusions: Most of the E_INF/E_T1D pairs selected by sequence homology show similar structural and electrostatic distributions, indicating that the E_INF may also bind to the same protein targets, i.e. the major histocompatibility complex molecules, for T1DM, leading to molecular mimicry onset of the disease. These findings suggest that searching for epitope pairs using sequence homology, a much less computationally demanding approach, leads to strong candidates for molecular mimicry that should be considered for further study. 
But structure homology, electrostatic potential calculations and full docking calculations may be necessary to advance the in-silico molecular mimicry predictions, which may be useful to select the most promising candidates for experimental studies.
Publisher OA PDF DOI
20 Decoding auto-immunity: Uncovering pre-onset infectious disease patterns of idiopathic inflammatory myopathies
Journal of Clinical and Translational Science · 2025-03-25
article · Open access
Objectives/Goals: Idiopathic inflammatory myopathies (IIMs) are autoimmune diseases influenced by genetic and environmental factors. This study aims to explore infection patterns preceding IIM onset by applying temporal data mining and machine learning to deidentified patient records and corroborate results from molecular analysis. Methods/Study Population: The dataset used in this work was extracted from TriNetX with a focus on patients who have IIM. Risks for developing the outcomes were assessed using case–control cohorts. For each participant, information was extracted about diagnosis code, date of infection, and study visit in which the infection was reported. These data were then temporally encoded and used to generate sequence files for each of the outcomes. Unsupervised temporal machine learning was then performed on these files to detect frequent subsequences of infections. The Python library scikit-learn was used to perform the unsupervised machine learning with k-means clustering. Results/Anticipated Results: The results of this study identify infections associated with the onset of IIM by analyzing temporal infection patterns. Frequent sequences of infections uncovered, with specific patterns linked to different cohorts, offer insights into the etiology of IIM. Common and cohort-specific infection sequences will help validate existing research and provide new avenues for exploring the disease mechanisms. The findings will highlight significant infection patterns, which will inform our understanding of IIM onset across various patient populations. Discussion/Significance of Impact: The results will provide key insights into pre-symptomatic infection sequences related to IIM onset, enhancing understanding of its etiology and pathogenesis. These findings may aid in developing more precise screening methods for early detection and confirm previous results from analyzing immune signatures of infections in IIM.
Publisher OA PDF DOI
In Silico Estimation of the Performance of Transcutaneous CO2 Sensors for Detecting Hypercapnia in Newly Admitted Inpatients
American Journal of Respiratory and Critical Care Medicine · 2025-05-01
article
Rationale: A method to reliably identify which patients have an elevated arterial partial pressure of CO2 (PaCO2) is required to rigorously study hypercapnic respiratory failure. Arterial blood gas (ABG) sampling is the reference standard, but it is painful, can cause complications, and is therefore not always obtained in usual care. Requiring ABG sampling may dissuade patients from participating in prospective studies, and may result in biased detection of hypercapnia in studies using passive detection. We sought to combine previously reported estimates of transcutaneous CO2 (TcCO2) sensor accuracy with the distribution of PaCO2 results to evaluate whether TcCO2 monitors might be accurate enough to identify hypercapnia among hospitalized adults. Methods: Inpatient encounters occurring Jan 1 to Dec 31, 2022, in which an ABG was drawn on the day of admission were requested from the TriNetX research network, which aggregates electronic health record data from 76 medical centers and roughly 115 million patients across the United States. We simulated a TcCO2 reading for each PaCO2 measurement using test agreement estimates from the meta-analysis by Conway et al. (Thorax, 2017) which estimated a mean bias of TcCO2 0.09 mmHg lower than PaCO2 and a population standard deviation (accounting for both within- and between-study variance) of 4.60 mmHg. Results were classified as true negatives (PaCO2 and TcCO2 < 45 mmHg), false positives (PaCO2 < 45 mmHg, TcCO2 ≥ 45 mmHg), true positives (PaCO2 and TcCO2 ≥ 45 mmHg), or false negatives (PaCO2 ≥ 45 mmHg, TcCO2 < 45 mmHg). Operating characteristics were subsequently calculated. Results: 158,228 ABGs were included (57.8% critical care; 54.9% male; 35.3% ventilated; 65.4% non-Hispanic white, 14.4% Black, 5.5% Hispanic; mean age 62.1 ± 16.4 years) showing a mean PaCO2 of 42.6 ± 17.3 mmHg. Hypercapnia was present in 47,995 (30.3%). 
Simulated TcCO2 measurements yielded the following operating characteristic estimates: sensitivity 84.2%, specificity 91.0%, negative predictive value 93.0%, and positive predictive value 80.4%. Conclusions: Our simulation suggests the accuracy of TcCO2 for binary classification of hypercapnia is likely to be high because many admitted patients have PaCO2 values sufficiently far from the threshold to make classification errors unlikely, given reported limits of agreement. Two limitations of this work are that patients receiving ABGs may have more extreme PaCO2 derangements than those without ABGs and disagreements between TcCO2-PaCO2 might be non-Gaussian. Nonetheless, TcCO2 may be a useful tool to capture the occurrence of hypercapnia more reliably among inpatients.
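The simulation described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the bias, standard deviation, and 45 mmHg threshold come from the abstract, while the Gaussian error model, the function names `simulate_tcco2` and `operating_characteristics`, and the sample PaCO2 values are hypothetical.

```python
import random

# Values below are taken from the abstract; everything else is illustrative.
MEAN_BIAS_MMHG = -0.09   # TcCO2 reads 0.09 mmHg lower than PaCO2 on average
ERROR_SD_MMHG = 4.60     # population SD of the TcCO2 vs. PaCO2 disagreement
HYPERCAPNIA_MMHG = 45.0  # a value >= 45 mmHg counts as hypercapnic


def simulate_tcco2(paco2_values, seed=0):
    """Return one simulated TcCO2 reading per PaCO2 value (Gaussian error assumed)."""
    rng = random.Random(seed)
    return [p + rng.gauss(MEAN_BIAS_MMHG, ERROR_SD_MMHG) for p in paco2_values]


def operating_characteristics(paco2_values, tcco2_values):
    """Cross-tabulate the reference (PaCO2) against the test (TcCO2) at the threshold."""
    tp = fp = tn = fn = 0
    for paco2, tcco2 in zip(paco2_values, tcco2_values):
        truth = paco2 >= HYPERCAPNIA_MMHG
        test = tcco2 >= HYPERCAPNIA_MMHG
        if truth and test:
            tp += 1
        elif truth:
            fn += 1
        elif test:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        # Avoid division by zero when a class is absent from the sample.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical PaCO2 values in mmHg, NOT data from the study.
    paco2 = [32.0, 38.5, 41.0, 44.0, 47.5, 52.0, 60.0, 75.0]
    print(operating_characteristics(paco2, simulate_tcco2(paco2, seed=1)))
```

Values far from 45 mmHg are rarely misclassified by a 4.60 mmHg error, which is the intuition behind the abstract's conclusion; readings near the threshold account for most of the simulated errors.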
Publisher DOI
Documentation of social determinants of health for patients with type 2 diabetes in Epic Cosmos
JAMIA Open · 2025-08-08 · 3 citations
article · Open access
Objectives: Type 2 diabetes (T2D) is a growing public health burden with persistent racial and ethnic disparities. This study assessed the completeness of social determinants of health (SDoH) data for patients with T2D in Epic Cosmos, a nationwide, cross-institutional electronic health record (EHR) database. Materials and Methods: The study included adults with T2D (ICD-10: E11.*) with encounters between 2022 and 2024. We analyzed 11 individual-level SDoH data elements across 5 domains (financial strain, food insecurity, housing instability, intimate partner violence, and transportation needs) and 4 components of the Social Vulnerability Index (SVI), representing neighborhood-level SDoH. Data completeness for each data element (ie, the proportion of individuals with non-missing values) was evaluated using generalized linear models, adjusting for source healthcare organization, sex, and age. Results: Among 12 031 927 individuals with T2D, adjusted completeness for individual-level SDoH data elements ranged from 11.2% to 31.5%, varying by data element and racial/ethnic group. American Indian or Alaska Native, Asian, Hispanic, and Native Hawaiian or Other Pacific Islander individuals had lower completeness for all individual-level SDoH compared to White individuals. In contrast, SVI data elements were available for nearly all patients since they are derived from patient addresses routinely collected in EHRs. Discussion: While SVI data elements were widely available, individual-level SDoH data elements had significant missingness, limiting their usability for secondary analyses. Racial/ethnic disparities in SDoH completeness further complicate their use. Conclusion: Standardized, equitable SDoH collection is critical to close documentation gaps, reduce disparities, and enable accurate, bias-resistant analyses in T2D care.
Publisher OA PDF DOI
Social vulnerability, lower broadband internet access, and rurality associated with lower telemedicine use in U.S. Counties
JAMIA Open · 2025-07-03 · 6 citations
article · Open access
Objective: Our objective was to determine how social vulnerabilities, broadband access, and rurality relate to telemedicine use across the United States through large-scale analysis of real-world telemedicine data. Materials and Methods: We conducted a retrospective, observational study of dyadic U.S. telemedicine sessions that occurred January 1, 2022 to December 31, 2022, linked to the 2020 Centers for Disease Control and Prevention Social Vulnerability Index (SVI) and the National Center for Health Statistics Urban-Rural Classification Scheme for Counties. We examined county-level telemedicine use rates (sessions per 1000 population) in relation to SVI indexes, broadband internet access, and rurality classifications using polynomial regression and data visualization. Results: We found a negative, nonlinear association between overall social and socioeconomic status vulnerabilities and telemedicine use. Telemedicine rates in urban counties exceeded those of rural counties. There was more variability in telemedicine use for the urban counties according to social vulnerability and broadband access. Discussion: Rurality and broadband access demonstrated a greater effect on telemedicine use than social vulnerability, and the relationship between social vulnerability, broadband access, and telemedicine use differed for rural versus urban areas. Conclusion: This observational study of nearly 8 million U.S. telemedicine sessions showed that rurality and broadband access are key drivers of telemedicine use and may be more important than many social vulnerabilities in determining community-level telemedicine use. We also found nuanced differences in the relationship between social vulnerability and telemedicine use between rural and urban counties, and at different levels of broadband access.
Publisher DOI

Recent grants

NIH Grant U54EB021973
NIH · $7.1M · 2020

Frequent coauthors

Julio C. Facelli
106 shared
Randy Madsen
University of Utah
41 shared
Katherine Sward
University of Utah
34 shared
Mollie Cummins
28 shared
Mohammad A. S. Masoum
Utah Valley University
24 shared
Ryan Butcher
Charleston Area Medical Center
23 shared
Peter Mo
University of Utah
18 shared
Sneha Kumar Kasera
18 shared

Education

M.S.
Arizona State University
2009
M.D.
MGR Medical University, India

Awards & honors

FURTHeR: An infrastructure for clinical, translational and c…
An Informatics Architecture for an Exposome, AMIA 2016 Joint…
A Framework for Metadata Management and Automated Discovery…
Streamlining Study Design and Statistical Analysis for Quali…
Molecular Mimicry Impact of the COVID-19 Pandemic: Sequence…
