Ravi B. Parikh

Verified

University of Pennsylvania · Rehabilitation Medicine

Active 2011–2026

h-index 34

Citations 5.5k

Papers 305 · 243 last 5y

Funding—

Research topics

Computer Science
Medicine
Political Science
Business
Medical physics
Internal medicine
Public relations
Database

Selected publications

Utah’s Prescription-Renewal Pilot Program — Autonomous AI Managing Patient Care
New England Journal of Medicine · 2026-04-18
article
Abstract PS3-04-06: Benchmarking Large Language Models for Clinical Decision Support in Breast Cancer Care: A Multi-Institutional Expert Evaluation
Clinical Cancer Research · 2026-02-17
article
Abstract Background: Artificial intelligence (AI) and large language models (LLMs) are increasingly explored as tools to support clinical decision-making in oncology. However, evidence validating their performance in complex breast cancer clinical scenarios (BCCS) remains limited. Given breast cancer’s diverse subtypes, evolving standards of care, and the need for nuanced, personalized treatment, we compared three LLMs for treatment decision-making to assess their capabilities and determine their readiness for integration into real-world breast oncology clinics. Methods: Ten breast cancer cases mimicking real-world scenarios were posed to three LLMs: ChatGPT-4o (GPT), DeepSeek-R1 (DS), and OpenEvidence (OE). Sixteen breast medical oncologists (BMOs; assistant to full professors) from 9 academic centers graded responses using a 5-point Likert scale (1 = poor, 5 = excellent) for clinical accuracy, clarity, relevance, and usability. Questions (Q) 1-9 assessed treatment decision-making cases; Q10 tested multimodal image interpretation (MMI) skills. Repeated measures ANOVA evaluated model differences, followed by Tukey’s post hoc comparisons. Results: On BCCS Q1-9 spanning all major breast cancer subtypes and treatment settings, including early-stage, metastatic, neoadjuvant, and adjuvant, OE achieved the highest mean score (3.91 ± 0.48; 2.57-4.43), significantly outperforming both GPT (3.19 ± 0.67; 2.36-3.93) and DS (2.93 ± 0.54; 1.50-3.86) in overall performance (p < 0.0001), with large effect sizes (Cohen’s d = 1.12 vs. GPT; d = 2.84 vs. DS). Repeated measures ANOVA identified significant differences among models in 5/9 (56%) Q (p < 0.05), as shown in the Table. Pairwise comparisons showed OE outperformed GPT in 4/5 (80%) and DS in 5/5 (100%) significant BCCS. GPT modestly outperformed DS in 2/5 (40%) (Cohen’s d = 0.41). GPT showed the highest inter-reviewer variability (SD = 0.67 vs. 0.48-0.54), indicating less agreement among BMOs on its responses. 
OE provided well-supported treatment recommendations and recurrence risk assessments with citations, though it lacked MMI. DS relied on optical character recognition in Q10 (inflammatory breast cancer image; 2.93 ± 1.33), limiting its utility in image-based BCCS. GPT, the only vision-enabled model, scored highest (4.29 ± 1.14), highlighting its strong potential for MMI integration in breast cancer diagnostic workflows. Conclusions: This is the first comparison of GPT, DS, and OE in BCCS. OE generated the most guideline-concordant treatment choices across BCCS, showing strong potential as a clinical decision support tool, though its verbosity may require streamlining. GPT showed moderate performance, while DS lagged in clinical relevance and accuracy. These findings highlight the promise of LLMs in breast oncology and the need for further refinement to ensure reliability and real-world applicability. Citation Format: Z. Shah, S. S. Afridi, M. Ombada, A. M. Roy, A. LeVee, S. Premji, V. Gupta, N. M. Lopetegui, D. M. Quiroga, R. L. Sacks, S. Shaikh, Y. Abdou, H. Yu, A. Madabhushi, R. Parikh, L. N. Chaudhary, E. Levine, M. Lambertini, K. Kalinsky, S. Kabraji, S. Gandhi. Benchmarking Large Language Models for Clinical Decision Support in Breast Cancer Care: A Multi-Institutional Expert Evaluation [abstract]. In: Proceedings of the San Antonio Breast Cancer Symposium 2025; 2025 Dec 9-12; San Antonio, TX. Philadelphia (PA): AACR; Clin Cancer Res 2026;32(4 Suppl):Abstract nr PS3-04-06.
Machine learning model to forecast patient availability for oncology clinical trials.
Journal of Clinical Oncology · 2025-05-28
article · Senior author
1566 Background: Automating eligibility criteria assessment for oncology clinical trials is an emerging application of machine learning (ML). However, machine learning applications to predict patient availability – the likelihood of a patient beginning a new treatment (time to next treatment) in a prespecified time window – are not well described. We used a large clinicogenomic database of patients diagnosed with solid cancer indications to train an ML model to predict patient availability for clinical trials. Methods: This was a retrospective study based on data drawn from the ConcertAI Oncology Research database, enriched by key variables derived from unstructured data. Line of therapy was derived from expert rules applied to structured medications data. Our cohort consisted of patients with a confirmed diagnosis of solid cancers without a second malignancy. The patient follow-up period started on the date of diagnosis of metastasis and ended on the earlier of the last date of activity and the date of death. A random observation date was set between the start and end dates to label patients. Patients administered a new treatment after the random observation date were labelled event, else censored (no new treatment began). The label date is the start of the new treatment for event cases and the end date for censored cases. The time to event (TTE) was defined as the duration between the random observation and the label dates. In the event cases, this duration is the time to next treatment (TTNT). Over 2000 features based on variables broadly grouped as tumor-specific biomarkers (PTEN, KRAS, etc.), ECOG, staging, disease status, medications, and imaging (evidence of image, not report) were employed to build multiple ML models. Temporal validation of the models was performed by setting up a simulated index date and predicting the probability of a patient beginning a new treatment within 60 days of the simulated index date. Patients receiving new treatment within the 60 days were true positives. 
Results: TTE models were trained on a cohort comprised of 90K patients across 12 cancer indications, with 54% of patients starting a new treatment. Median age and overall survival (OS) of the cohort were 73 years and 703 days respectively. Temporal validation was performed on 25K patients with similar demographics/OS and 58% of patients starting new treatment. Multiple ML methods were used to train models, with a gradient-boosted model demonstrating the highest c-index of 0.73 based on 87 features. Temporal validation demonstrated AUC and weighted F1 of 87% and 67% respectively. True positive cases were assigned high predicted probability in 75% of the cases. Conclusions: AI models supporting 12 solid cancer indications accurately predicted patient availability. These models can be integrated into real-time clinical workflows alongside patient eligibility models to provide clinicians and patients visibility in ascertaining a patient’s likelihood of being eligible for a clinical trial.
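The labelling scheme described in the Methods of this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `draw_observation_date` and `label_from_observation`, the uniform random draw, and the example dates are all hypothetical.

```python
import random
from datetime import date, timedelta
from typing import List, Tuple


def draw_observation_date(start: date, end: date, rng: random.Random) -> date:
    """Pick a random observation date in [start, end]; a uniform draw is an assumption."""
    return start + timedelta(days=rng.randint(0, (end - start).days))


def label_from_observation(
    obs: date, end: date, treatment_starts: List[date]
) -> Tuple[str, date, int]:
    """Return (label, label_date, time_to_event_days) for one patient.

    A patient is an "event" if a new treatment starts after the observation
    date and within follow-up; otherwise the patient is "censored" at the
    end of follow-up.
    """
    later = sorted(d for d in treatment_starts if obs < d <= end)
    label_date = later[0] if later else end
    label = "event" if later else "censored"
    return label, label_date, (label_date - obs).days


# Hypothetical follow-up window and treatment start date.
follow_up_start, follow_up_end = date(2020, 1, 1), date(2020, 12, 31)
obs = draw_observation_date(follow_up_start, follow_up_end, random.Random(0))
print(label_from_observation(obs, follow_up_end, [date(2020, 6, 1)]))
```

For event cases the returned duration corresponds to the time to next treatment (TTNT) as defined in the abstract; for censored cases it is the remaining follow-up.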
Multisite validation of biomechanical computed tomography for osteoporosis assessment and fracture prediction in patients with high-risk or metastatic prostate cancer.
Journal of Clinical Oncology · 2025-05-28
article · Senior author
12044 Background: Long-term androgen deprivation therapy (ADT) among men with high-risk localized or metastatic prostate cancer (PCa) results in a 10-20% risk of significant bone fracture at 10 years. While guidelines recommend routine bone mineral density (BMD) screening for men with PCa receiving ADT, most do not undergo it. We studied whether Biomechanical Computed Tomography (BCT), a radiomic technique to opportunistically measure femoral and vertebral bone strength and BMD from CT scans performed for routine staging, can predict incident fractures among men with PCa beginning ADT. Methods: In this retrospective cohort study among 2 academic cancer centers and 2 Veterans Affairs facilities, we identified 711 men with de novo high-risk localized or metastatic PCa diagnosed between 2010-2018. CT scans at PCa diagnosis were analyzed by BCT. Incident fractures were ascertained by trained radiologists and defined as any new fractures occurring after the initial CT scan. We used Cox proportional hazards models to estimate associations of fragile femur bone strength (≤3500N), fragile vertebral bone strength (≤6500N), or osteoporosis (femoral neck areal BMD T-score ≤-2.5 or trabecular volumetric BMD ≤80 mg/cm3), with incident fracture, adjusted for age, BMI, & race/ethnicity. Results: 673 men were eligible (mean age 69.8±9.3, 49.3% white, 35.2% received prior ADT, 42.3% had prior BMD screening). 198 (29.4%) incident fractures were observed. Using BCT, 123 men (18.3%) had fragile femur bone strength or osteoporosis, and 105 men (15.6%) had fragile vertebral bone strength or osteoporosis by volumetric BMD. BCT-derived fragile femur or vertebral bone strength (adjusted HR 1.81, 95% CI 1.27-2.59) and osteoporosis by BMD criteria (aHR 2.20, 95% CI 1.32-3.62) were associated with incident fracture (Table). 
Conclusions: In men with high-risk PCa, opportunistic BCT analysis of routine staging CT scans improves osteoporosis detection and fracture prediction, identifying men who may not otherwise have qualified for antiresorptive treatment. BCT is a novel approach to risk-stratify men with PCa for early fracture risk mitigation.
Table: Association between BMD and bone strength with incident fracture. Values are HR (95% CI); p-value.
Femur Bone Strength
  Normal (≥5000N): 1 (reference)
  Low (3500-5000N): 1.80 (1.25–2.58); p = 0.002
  Fragile (≤3500N): 2.50 (1.59–3.93); p < 0.001
Vertebral Bone Strength
  Normal (≥8500N): 1 (reference)
  Low (6500–8500N): 1.09 (0.60–1.99); p = 0.78
  Fragile (≤6500N): 1.71 (0.96–3.25); p = 0.07
Femoral Neck BMD (T-score)
  Normal (≥ –1.0): 1 (reference)
  Low Bone Density/Osteopenia (–1.0 to –2.5): 1.92 (1.38–2.67); p < 0.001
  Osteoporosis (≤ –2.5): 2.20 (1.32–3.62); p = 0.003
Vertebral Trabecular BMD
  Normal (≥120 mg/cm³): 1 (reference)
  Low Bone Density/Osteopenia (80-120 mg/cm³): 0.99 (0.53–1.84); p = 0.98
  Osteoporosis …
Effect of broad-based genomic sequencing on survival outcomes in advanced non-small cell lung cancer: A national cohort study, 2011-2023.
Journal of Clinical Oncology · 2025-05-28
article · Senior author
1520 Background: Use of broad-based genomic sequencing (BGS) for advanced non-small cell lung cancer (aNSCLC) is rising. While older studies have found no survival benefit with BGS, its impact on survival outcomes in the era of modern targeted therapy is unknown. Methods: In this retrospective cohort study, the 2011-2023 Flatiron Health Database—a nationally representative database of electronic health records from > 280 US cancer clinics—was queried for patients with Stage IIIB-IV NSCLC who received at least one line of systemic therapy with ≥12 months follow-up. Primary exposure was receipt of BGS vs. “Focused” biomarker testing (i.e., ALK FISH, EGFR PCR) within 90 days of first- and second-line therapy start. To address baseline confounding, we used 1:1 nearest-neighbor propensity score matching based on age at initial diagnosis, sex, self-reported race/ethnicity, histology (squamous vs. non-squamous), insurance status, smoking status, ECOG performance status, practice type (academic vs. community), stage at diagnosis, advanced diagnosis year, and practice rate of BGS. Adjusted Cox proportional hazards models compared median progression-free survival (mPFS) and median overall survival (mOS) between groups. Sensitivity analyses adjusted for biomarker status and used an instrumental variable approach. Results: Our initial unmatched cohort consisted of 35,060 patients (BGS, n = 14,192; Focused, n = 20,868; 52% female, 3.5% Asian, 9.3% Black, 3.8% Hispanic, 79% community practice). In the propensity-matched first-line therapy cohort (BGS vs. Focused, n = 10,008 in each group; all standardized mean differences < 0.1), BGS was associated with greater mPFS (6.4 vs. 6.0 months; adjusted HR [95%CI], 0.96 [0.92-1.0], p = 0.046) and mOS (16 vs. 14 months; 0.91 [0.86-0.95], p < 0.001). Sensitivity analyses were consistent with primary results. Patients receiving BGS had higher rates of ALK/EGFR positivity (18% vs. 13%) and receipt of targeted therapy (20% vs. 17%). 
Upon adjustment for biomarker status, however, BGS remained associated with improved OS (0.93 [0.88-0.98], p = 0.004) but not PFS (0.98 [0.94-1.03], p = 0.4). In the propensity-matched second-line therapy cohort, no associations between BGS and survival outcomes were observed. Conclusions: This is the first national analysis of survival outcomes in aNSCLC to demonstrate a survival benefit with BGS. These findings support guideline endorsement and payer coverage of BGS prior to first-line therapy.
Table: Median survival in months (95% CI), univariate HR (95% CI), and adjusted HR (95% CI); Focused testing is the reference group.
First Line, PFS
  Focused: median 6.0 (5.8-6.1); reference
  BGS: median 6.4 (6.2-6.5); univariate HR 0.95 (0.91-0.99); adjusted HR 0.96 (0.92-1.00)
First Line, OS
  Focused: median 14 (14-15); reference
  BGS: median 16 (16-17); univariate HR 0.90 (0.86-0.95); adjusted HR 0.91 (0.86-0.95)
Second Line, PFS
  Focused: median 3.8 (3.5-4.1); reference
  BGS: median 4.2 (4.0-4.5); univariate HR 0.91 (0.82-1.01); adjusted HR 0.92 (0.83-1.02)
Second Line, OS
  Focused: median 13 (12-14); reference
  BGS: median 12 (12-14); univariate HR 1.02 (0.91-1.15); adjusted HR 1.01 (0.89-1.14)
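The 1:1 nearest-neighbor propensity score matching named in the Methods of this abstract can be illustrated with a short Python sketch. It assumes propensity scores have already been estimated, and it uses greedy matching without replacement; the abstract does not state either detail, so both are assumptions, as are the function name `match_nearest` and the example scores.

```python
from typing import Dict, List, Tuple


def match_nearest(
    treated: Dict[str, float], control: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on propensity score, without replacement.

    `treated` and `control` map a patient id to an already-estimated
    propensity score. Each treated patient is paired with the closest
    remaining control; ties are broken by control id for determinism.
    """
    available = dict(control)
    pairs: List[Tuple[str, str]] = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break  # no controls left to match
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement
    return pairs


# Hypothetical scores for two BGS patients and three Focused-testing patients.
bgs = {"t1": 0.30, "t2": 0.70}
focused = {"c1": 0.25, "c2": 0.72, "c3": 0.50}
print(match_nearest(bgs, focused))
```

In practice, balance after matching would be checked with standardized mean differences, as the abstract reports (all below 0.1).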
Association of deep learning CT response assessment and interpretable components with overall survival in advanced NSCLC: Validation in a trial of sasanlimab and a real-world dataset.
Journal of Clinical Oncology · 2025-05-28
article
1559 Background: Identifying advanced non–small cell lung cancer (aNSCLC) patients who derive long-term benefit from immune checkpoint inhibitors (ICIs) remains a significant challenge. Radiomic analyses, particularly leveraging deep learning, hold promise for improving prognostic accuracy beyond tumor size metrics. We developed serialCTRS, a novel biomarker using deep learning to quantify thoracic CT changes from baseline to 3 months post-treatment, predicting overall survival (OS) in patients receiving PD-(L)1 inhibitors. Methods: SerialCTRS was previously trained and validated on a multi-institutional Real-World Dataset (RWD) (training: 1,171 aNSCLC patients, 14,424 CT scans; validation: 612 patients; Sako et al. SITC, 2024). For this study, we retrospectively validated serialCTRS in two distinct cohorts of aNSCLC patients: (1) a clinical trial (N = 52) treated with the PD-1 inhibitor sasanlimab in the second or later line and (2) a fully blinded RWD from Baylor Scott & White Health system (N = 147), an institution not used for training. The pipeline—spanning image quality control, preprocessing, feature extraction, and survival modeling—operated without manual annotations. To enhance interpretability, we developed 3D submodels for prognostic signals related to (i) tumor burden, (ii) body composition, and (iii) lung vasculature. Predictive performance was compared to RECIST 1.1 using concordance index (c-index) and ROC-AUC for 24-month OS (OS24 AUC). Results: SerialCTRS outperformed RECIST in OS prediction and remained a significant predictor after multivariate adjustments with other known predictors including age, sex, PD-L1 TPS, and NLR across both validation cohorts. In the sasanlimab cohort, serialCTRS achieved a c-index of 0.77, surpassing RECIST (0.72), with an OS24 AUC of 0.86 (95% CI: 0.74–0.98). In the Baylor cohort, serialCTRS demonstrated a c-index of 0.68 vs. RECIST (0.62) and an OS24 AUC of 0.76 (0.67–0.86). 
Submodels targeting individual components achieved c-indices of 0.65 (tumor burden), 0.61 (body composition), and 0.61 (vasculature) in the sasanlimab cohort, and 0.63, 0.61, and 0.59, respectively, in the Baylor cohort. Combining the submodels improved c-indices to 0.69 (sasanlimab) and 0.66 (Baylor), demonstrating complementary signal among radiographic features. Conclusions: SerialCTRS outperformed RECIST 1.1 in predicting OS in independent clinical trial and RWD datasets. Interpretable submodels highlighted the prognostic value of tumor burden, body composition, and vasculature changes. SerialCTRS offers a promising tool for personalizing therapy and accelerating drug development in aNSCLC, with a fully automated pipeline for robust and scalable clinical use. Future work will focus on larger, more diverse cohorts to validate utility in guiding precision oncology.
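The concordance index (c-index) used in this abstract to compare serialCTRS with RECIST can be computed as in the sketch below. This is a plain Harrell-style implementation for illustration only; the abstract does not say which estimator or software was used, and the function name and toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[bool], risk_scores: Sequence[float]
) -> float:
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter time than patient j. It is concordant when i also has
    the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival time in months, event observed (True) or censored (False).
times = [5, 10, 15]
events = [True, True, False]
print(concordance_index(times, events, [3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values (for example 0.77 vs. 0.72) should be read.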
Estimating Misreporting in the Presence of Genuine Modification: A Causal Perspective
ArXiv.org · 2025-05-29
preprint · Open access
In settings where ML models are used to inform the allocation of resources, agents affected by the allocation decisions might have an incentive to strategically change their features to secure better outcomes. While prior work has studied strategic responses broadly, disentangling misreporting from genuine modification remains a fundamental challenge. In this paper, we propose a causally-motivated approach to identify and quantify how much an agent misreports on average by distinguishing deceptive changes in their features from genuine modification. Our key insight is that, unlike genuine modification, misreported features do not causally affect downstream variables (i.e., causal descendants). We exploit this asymmetry by comparing the causal effect of misreported features on their causal descendants as derived from manipulated datasets against those from unmanipulated datasets. We formally prove identifiability of the misreporting rate and characterize the variance of our estimator. We empirically validate our theoretical results using a semi-synthetic and real Medicare dataset with misreported data, demonstrating that our approach can be employed to identify misreporting in real-world scenarios.
Derivation and External Validation of Objective Performance Status Among Patients With Metastatic Cancer
JCO Oncology Practice · 2025-07-25 · 1 citation
article · Senior author
PURPOSE Performance status (PS) assessment is used to determine clinical trial eligibility among patients with cancer, but may be inaccurately assessed by oncology clinicians. Wearable accelerometers may allow objective assessment of physical activity, a proxy for PS. In this analysis of two prospective studies, we derive and externally validate objective PS (OPS) by measuring the association between daily physical activity and overall survival among patients with metastatic cancer. MATERIALS AND METHODS For the derivation cohort, we prospectively measured daily physical activity using a wearable accelerometer among patients with metastatic cancer during the screening period for a phase 1 clinical trial in Spain. We used univariable survival analysis, AUCs, and Youden's index to derive an OPS cutoff in mean daily distance walked. We used a multivariable Cox model to calculate the association between OPS and 180-day mortality. We subsequently externally validated OPS in a separate prospective trial of patients with metastatic lung and GI cancers receiving chemotherapy at a large academic health center in the United States. RESULTS Full data were available for 123 patients (70 derivation; 53 validation). In the derivation cohort, we defined an OPS cutoff at 1,200 m walked per day. Poor OPS was associated with higher mortality than good OPS in the derivation (180-day mortality, 81.6% v 38.4%; adjusted hazard ratio [aHR], 6.82 [95% CI, 3.44 to 13.5]; P < .001) and external validation cohorts (180-day mortality, 36% v 8%; aHR, 7.07 [95% CI, 1.37 to 36.6]; P = .02). CONCLUSION OPS is an independent, externally validated prognostic indicator and could serve as an objective surrogate for traditional methods of PS assessment in clinical trials and choice of therapy for patients with cancer.
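Deriving a cutoff with Youden's index, as mentioned in the Methods of this abstract, can be sketched as follows. The sketch simplifies the outcome to a binary 180-day death indicator and treats lower walking distance as the positive (poor OPS) prediction; the function name `youden_cutoff` and the distances are hypothetical and are not the study data.

```python
from typing import List, Sequence, Tuple


def youden_cutoff(distances: Sequence[float], died: Sequence[bool]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A patient is predicted "poor" when mean daily distance <= cutoff.
    The first cutoff reaching the maximum J is returned.
    """
    positives = sum(1 for y in died if y)
    negatives = len(died) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one death and one survivor")
    best_cut, best_j = 0.0, -1.0
    for cut in sorted(set(distances)):
        true_pos = sum(1 for d, y in zip(distances, died) if y and d <= cut)
        true_neg = sum(1 for d, y in zip(distances, died) if not y and d > cut)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical mean daily distances (metres) and 180-day death indicators.
distances: List[float] = [500, 800, 1000, 1500, 2000, 3000]
died = [True, True, True, False, False, False]
print(youden_cutoff(distances, died))
```

The study itself combined univariable survival analysis and AUCs with Youden's index, so this binary version illustrates only the cutoff-selection step.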
1118 Deep learning serial CT imaging biomarker for predicting overall survival: a real-world validation in multiple indications of advanced solid tumors treated with immune checkpoint inhibitors
Regular and Young Investigator Award Abstracts · 2025-11-01
article · Open access
Figure 1 Kaplan-Meier OS plots stratified by 12-week Serial CTRS for all indications (A), metastatic RCC and kidney cancers (B), metastatic melanoma and skin cancers (C), and extensive-stage SCLC (D). Predetermined thresholds were uniformly applied to all data.
Global Health in the Age of AI: Charting a Course for Ethical Implementation and Societal Benefit
Minds and Machines · 2025-07-02 · 4 citations
article · Open access

Frequent coauthors

Justin E. Bekelman
University of Pennsylvania
113 shared
Lawrence N. Shulman
eHealth Initiative
85 shared
Mitesh S. Patel
Ascension
83 shared
Christopher R. Manz
Dana-Farber Cancer Institute
78 shared
Amol S. Navathe
University of Pennsylvania
72 shared
Lynn M. Schuchter
University of Pennsylvania
60 shared
Samuel U Takvorian
University of Pennsylvania
46 shared
Jinbo Chen
45 shared
