
Jon Steingrimsson
Associate Professor of Biostatistics, Director of the NextGen Graduate Program in Biostatistics · Brown University · Biostatistics
Active 2014–2026
About
Jon A. Steingrimsson is an Associate Professor of Biostatistics at Brown University. He obtained his PhD in statistics from Cornell University in 2015 and his MS from Cornell University in 2010. Prior to joining Brown, he was a postdoctoral fellow in the Department of Biostatistics at Johns Hopkins University. His research interests include generalizing or transporting treatment effects and risk prediction models to different populations, machine learning algorithms for risk prediction with censored data, and identifying treatment effect heterogeneity. Dr. Steingrimsson is an active member of the ECOG-ACRIN cancer research group. He teaches courses related to probability, statistics, and machine learning, including topics such as analysis of lifetime data, linear models, and statistical inference.
Research topics
- Computer Science
- Artificial Intelligence
- Data Mining
- Machine Learning
- Mathematics
- Econometrics
- Medicine
- Statistics
Selected publications
Clinical Cancer Research · 2026-02-17
Article
Abstract. Background: The ECOG-ACRIN Tomosynthesis Mammographic Imaging Screening Trial is a clinical trial being conducted to determine if tomosynthesis (TM) should replace digital mammography (DM) for breast cancer screening based on the impact of the use of each technology on the number of advanced breast cancers in the population of women being screened over several years. If TM is better at finding the types of cancers that are most likely to lead to mortality, the population screened with TM would be expected to have fewer of these types of cancers over time compared with the population screened with DM. TMIST also includes secondary aims in the areas of imaging assessment, medical physics, breast biology and pathology, long-term follow-up, and health care utilization. Besides collection of demographic and clinical information, the study also has a collection of TM and DM screening mammograms, blood and buccal samples, and pathology diagnostic slides and tissue blocks. Methods: Asymptomatic women ages 45 to 74 were enrolled into TMIST across 133 sites located in the United States, Canada, Argentina, Italy, Peru, Chile, South Korea, Spain, Taiwan, and Thailand. At the time of enrollment, women were randomized to undergo screening mammography with either TM or DM annually or biennially based on risk factors for their first 5 years on the study with up to 3 additional years of long-term follow-up under their routine breast screening protocols. Pathology materials for all enrolled women who undergo biopsy or surgery are being collected. Tissue blocks collected will undergo PAM-50 analysis, plus an immune signature. TMIST participants also voluntarily can contribute blood and/or buccal smears to the TMIST biorepository. Update: TMIST met the study accrual goal of 108,508 in December 2024. 21% of the United States participants are African American, with 49% Hispanic participation worldwide. 68.1% of TMIST participants have opted to provide buccal smears, and 66.5% have opted to provide blood to the biorepository. The study will complete follow-up in December 2027. Two ancillary studies not dependent on study endpoints with separate external funding are underway. The first active ancillary project is a study aimed at increasing participation of African American TMIST participants in the biospecimen sub-study of TMIST. New educational materials were developed to provide more information about the purpose of biospecimen collection within TMIST and the process of collecting blood and buccal samples. We are tracking the number of women approached with the new information and providing a patient incentive for those participants who re-consent and provide the samples in this sub-study. Three sites are currently participating that have large numbers of African American TMIST participants who initially declined to participate in the biospecimen collection sub-study. The first site began approaching eligible TMIST participants in April 2025. The sub-study continues through 2026. The second active ancillary project utilizes a case-control design to determine whether lower compression pressure applied during screening mammography correlates with interval cancers. The study is the first in TMIST to utilize selected imaging data and limited clinical information in a protected enclave. No images or data leave EA systems, with an external algorithm being applied to images by staff managing the TMIST imaging archive and statistical analysis being performed by the EA TMIST statistical team. There are multiple additional ancillary projects that are in development that are using TMIST data to 1) estimate the value of AI algorithms that predict short-term risk of breast cancer, 2) assess the rate of overdiagnosis, and 3) aid in the development of individualized screening recommendations (which also will use data from All of Us), and others.
Citation Format: E. Pisano, C. Gatsonis, M. D. Schnall, M. A. Troester, E. Cole, J. Cormack, J. Steingrimsson, I. F. Gareen, M. Yaffe, L. C. Collins, A. Curtis, R. Carlos, K. D. Miller, C. Comstock. ECOG-ACRIN tomosynthesis mammographic imaging screening trial (TMIST): Update for 2025 [abstract]. In: Proceedings of the San Antonio Breast Cancer Symposium 2025; 2025 Dec 9-12; San Antonio, TX. Philadelphia (PA): AACR; Clin Cancer Res 2026;32(4 Suppl):Abstract nr PS5-07-19.
Alzheimer's & Dementia: Behavior & Socioeconomics of Aging · 2026-01-29
Article · Open access
Abstract. INTRODUCTION Relationships between Alzheimer's disease neuropathology, residential neighborhood, and cognitive impairment remain incompletely understood. METHODS We examined whether residence within a disadvantaged neighborhood was associated with amyloid positron emission tomography (PET) positivity. We used data from the observational, multisite, Imaging Dementia–Evidence for Amyloid Scanning study that included cognitively impaired Medicare beneficiaries. Our secondary analysis examined multivariable‐adjusted associations between neighborhood disadvantage (measured by Area Deprivation Index [ADI] deciles 1–90 vs. 91–100 representing greatest disadvantage) and amyloid PET positivity. RESULTS Among 15,346 White, 829 Latino, 637 Black/African American, and 321 Asian individuals, 51% were female, mean age was 75.7 years, 535 (3.8%) resided in ADI 91 to 100 decile, and 61.6% were amyloid PET positive. The ADI 91‐100 decile was associated with lower odds of PET positivity by visual interpretation (odds ratio [OR] 0.80, 95% confidence interval [CI] 0.67–0.96, p ≤ .001) but not PET Centiloid value ≥ 40 versus ≤ 10 (OR 0.81, 95% CI 0.66–1.01, p = 0.060). CONCLUSION Residence in the most disadvantaged neighborhoods may be associated with lower amyloid pathology in cognitively impaired individuals.
Newly Diagnosed Individuals in Molecular HIV-1 Clusters in Rhode Island Over 3 Decades
The Journal of Infectious Diseases · 2025-06-12
Article · Open access
BACKGROUND: Characterizing clustering rates of people with HIV in high-risk populations can offer insights on the HIV epidemic, enhancing efforts to control its spread. METHODS: We investigated longitudinal dynamics of clustering rates among individuals newly diagnosed with HIV-1. Data were extracted from the medical records of all people with HIV in Rhode Island with available viral sequences. Partial pol sequences were grouped by HIV diagnosis year, and clusters were identified in annual phylogenies. Clustering trends were estimated within 11 sociodemographic variables with the Mann-Kendall statistic. Associations with clustering propensity and changes over time were tested via generalized linear mixed effects models. RESULTS: HIV-1 sequences from 2630 individuals representing the statewide epidemic were analyzed across 33 annual datasets (1991-2023). Over this period, a continuous increase in clustering rates among newly diagnosed individuals was observed despite decreasing diagnoses over the last decade. Significant upward trends in clustering were seen among newly diagnosed men who have sex with men, males, the 21- to 40-year age group, non-Hispanic or Latino people, White persons, those with subtype B, and US-born individuals but not among people who inject drugs, females, and incarcerated individuals. Analyses of relative associations between groups within variables corroborated these results. CONCLUSIONS: Analyses focusing on molecular HIV clusters among newly diagnosed people in a statewide epidemic over 3 decades revealed significant evolving trends among those at highest risk of HIV transmission, patterns not seen in the overall population. These findings inform the design and development of targeted public health interventions aimed at high-risk populations to curb HIV spread.
Epidemiology · 2025-06-05
Article · Open access
We discuss generalizability analyses under a partially nested trial design, where part of the trial is nested within a cohort of trial-eligible individuals, while the rest of the trial is not nested. This design arises, for example, when only some centers participating in a trial are able to collect data on non-randomized individuals, or when data on non-randomized individuals cannot be collected for the full duration of the trial. Our work is motivated by the Necrotizing Enterocolitis Surgery Trial, which compared initial laparotomy versus peritoneal drain for infants with necrotizing enterocolitis or spontaneous intestinal perforation. During the first phase of the study, data were collected from randomized individuals as well as consenting non-randomized individuals; during the second phase of the study, however, data were only collected from randomized individuals, resulting in a partially nested trial design. We propose methods for generalizability analyses with partially nested trial designs. We describe identification conditions and propose estimators for causal estimands in the target population of all trial-eligible individuals, both randomized and non-randomized, in the part of the data where the trial is nested while using trial information spanning both parts. We evaluate the estimators in a simulation study and provide an illustration using the Necrotizing Enterocolitis Surgery Trial study.
BMC Global and Public Health · 2025-06-25 · 1 citation
Article · Open access
BACKGROUND: Social and economic factors have considerable influence on the lives of people living with HIV (PLHIV). These factors shape their health behaviors, willingness to engage with other members of their communities for support, and ability to seek appropriate and timely treatment options. Evidence has shown that microfinance initiatives, by providing access to credit and social networks, have the potential to help PLHIV overcome some of these barriers. The objective of this study was to understand the association between microfinance membership and viral load suppression among HIV patients. METHODS: We used data from the Academic Model Providing Access to Healthcare (AMPATH)-Kenya's Group Integrated Savings for Health Empowerment (GISHE), a microfinance initiative (MFI), to study the association between GISHE participation and viral load suppression. Our longitudinal dataset consisted of a matched group of 3609 HIV patients. We examined the association between GISHE membership and viral load suppression by addressing the missing data problem with respect to the viral load count via multiple imputation. RESULTS: Our study revealed that GISHE membership was associated with increased viral load suppression (adjusted odds ratio (AOR) = 1.15; 95% confidence interval (CI), 1.03-1.29). Further, the study found that male patients were less likely to be virally suppressed (AOR = 0.85; 95% CI, 0.74-0.97), as were the patients in the most advanced disease stage (AOR = 0.71; 95% CI, 0.52-0.95). The finding that GISHE participation was associated with a greater likelihood of viral load suppression held even after addressing the missing data problem. CONCLUSIONS: We conclude that GISHE-type programs hold promise as scalable interventions to combat HIV/AIDS in Kenya and other countries where the disease is a generalized epidemic.
Multi-source analyses of average treatment effects with failure time outcomes
Lifetime Data Analysis · 2025-07-04
Article
Tree-based methods for estimating heterogeneous model performance and model combining
ArXiv.org · 2025-06-02
Preprint · Open access · Senior author
Model performance is frequently reported only for the overall population under consideration. However, due to heterogeneity, overall performance measures often do not accurately represent model performance within specific subgroups. We develop tree-based methods for the data-driven identification of subgroups with differential model performance, where splitting decisions are made to maximize heterogeneity in performance between subgroups. We extend these methods to tree ensembles, including both random forests and gradient boosting. Lastly, we illustrate how these ensembles can be used for model combination. We evaluate the methods through simulations and apply them to lung cancer screening data.
Statistical Methods in Medical Research · 2025-10-21
Article
In multicenter randomized trials, when effect modifiers have a different distribution across centers, comparisons between treatment groups that average (standardize) effects over centers may not apply to any of the populations underlying the individual centers. In the presence of such heterogeneity, interpreting the evidence produced by a multicenter trial in the context of the local population underlying each center may be necessary. Here, we identify center-specific effects under conditions that are largely supported by the study design and are weaker than those underlying popular methods for the analysis of multicenter studies, in the presence of associations between center membership and the outcome ("center-outcome associations" conditional on baseline covariates and treatment). We then consider an additional testable condition of "no center-outcome associations," given baseline covariates and treatment. We propose methods for estimating center-specific average treatment effects, when center-outcome associations are present and when they are absent. When center-outcome associations are absent, we discuss how the proposed methods are often more efficient and rely on weaker conditions than related transportability methods applied to multicenter trials. We evaluate the performance of the methods in a simulation study and illustrate their implementation using data from the Hepatitis C Antiviral Long-Term Treatment Against Cirrhosis trial.
Diagnostic and Prognostic Research · 2025-10-01
Article · Open access · Senior author
BACKGROUND: When a machine learning model is developed and evaluated in a setting where the treatment assignment process differs from the setting of intended model deployment, failure to account for this difference can lead to suboptimal model development and biased estimates of model performance. METHODS: We consider the setting where data from a randomized trial and an observational study emulating the trial are available for machine learning model development and evaluation. We provide two approaches for estimating the model and assessing model performance under a hypothetical treatment strategy in the target population underlying the observational study. The first approach uses counterfactual predictions from the observational study only and relies on the assumption of conditional exchangeability between treated and untreated individuals (no unmeasured confounding). The second approach leverages the exchangeability between treatment groups in the trial (supported by study design) to "transport" estimates from the trial to the population underlying the observational study, relying on an additional assumption of conditional exchangeability between the populations underlying the observational study and the randomized trial. RESULTS: We examine the assumptions underlying both approaches for fitting the model and estimating performance in the target population and provide estimators for both objectives. We then develop a joint estimation strategy that combines data from the trial and the observational study, and discuss benchmarking of the trial and observational results. CONCLUSIONS: Both the observational and transportability analyses can be used to fit a model and estimate performance under a counterfactual treatment strategy in the population underlying the observational data, but they rely on different assumptions. In either case, the assumptions are untestable, and deciding which method is more appropriate requires careful contextual consideration. If all assumptions hold, then combining the data from the observational study and the randomized trial can be used for more efficient estimation.
Radiology · 2025-04-01 · 2 citations
Article · Open access
Models using clinical and MRI-based radiomic features identified from ductal carcinoma in situ lesions improved prediction of disease upstaging at surgery compared with standard clinical information alone.
Frequent coauthors
- Helga S. Marques · Cancer Research Center · 104 shared
- Patrick J. Bolan · 104 shared
- Savannah C. Partridge · Memorial Sloan Kettering Cancer Center · 103 shared
- Michael A. Boss · American College of Radiology · 102 shared
- Nola M. Hylton · University of California, San Francisco · 102 shared
- Thomas L. Chenevert · University of Michigan–Ann Arbor · 102 shared
- Michael Hirano · University of California, San Francisco · 101 shared
- Anum S. Kazerouni · University of Washington · 101 shared
Education
- 2015 · Ph.D. · Cornell University
- 2010 · M.S. · Cornell University