
Rifat Atun
Professor of Global Health Systems · Harvard University · General Management
Active 1998–2026
About
Dr Rifat Atun is Professor of Global Health Systems at Harvard University and Faculty Chair of the Harvard Ministerial Leadership Program. His research focuses on two major areas: (1) the performance of health systems and how the design and implementation of health system reforms affect health outcomes, and (2) the adoption and diffusion of innovations within health systems. These innovations include health technologies, disease control programs, primary healthcare reforms, and innovative financing in global health.
Research topics
- Computer Science
- Political Science
- Medicine
- Environmental health
- Information Retrieval
- Sociology
- Data science
- Demography
- Virology
- Mathematics
- World Wide Web
- Gerontology
- Internal medicine
- Geography
- Psychology
- Knowledge management
- Medical education
- Public relations
Selected publications
University of Birmingham Research Portal (University of Birmingham) · 2026-03-26
article · Open access
BMJ Public Health · 2026-01-01 · 3 citations
article · Open access
Background: Chronic obstructive pulmonary disease (COPD) remains a major global health challenge, contributing significantly to morbidity and mortality. This study aims to provide a comprehensive analysis of the burden of COPD by age, sex and Sociodemographic Index (SDI), in addition to its attributable risk factors, across 204 countries and territories from 1990 to 2021. Methods: This study is a systematic analysis of data from the Global Burden of Disease (GBD) 2021 for 204 countries and territories from 1990 to 2021. The study calculated age-standardised rates (ASRs) for prevalence, deaths and disability-adjusted life-years (DALYs) by adjusting rates to a global age distribution, and computed estimated annual percentage changes (EAPCs) for these ASRs and the relative COPD burden, while also exploring the relationship between the SDI and age-standardised DALYs per 1000 population via linear regression. Results: In 2021, there were an estimated 213.4 million prevalent COPD cases globally, with an ASR of 2512.9 per 100 000. From 1990 to 2021, the EAPC for ASRs in prevalence was -0.044%, while the EAPC for percentage in prevalence was 1.224%. COPD caused 3.7 million deaths, with an ASR of 45.2 per 100 000, and 79.8 million DALYs, with an ASR of 940.7 per 100 000. The leading risk factor for COPD globally was particulate matter pollution, which accounted for 41.7% of global DALYs. Appreciable geographical and demographic variations were observed, with North America exhibiting the greatest ASRs for prevalence and South Asia showing the greatest ASRs for deaths. Conclusions: The study highlights the persistent and evolving global burden of COPD, emphasising the significant impact of environmental factors such as particulate matter pollution. It underscores the need for targeted public health interventions and resource allocation, particularly in low-income and middle-income countries, to mitigate the growing COPD challenge. To enhance COPD management, the recommendations include implementing regional plans to mitigate particulate pollution, strengthening surveillance of air quality and health outcomes, developing integrated health strategies and supporting a global framework for air quality improvement.
medRxiv · 2025-09-09
preprint · Open access · Senior author
Introduction: Persistent socioeconomic and caste inequalities in India drive disparities in healthcare access. Machine learning (ML) models offer promise for forecasting service use and unmet needs, but may perpetuate algorithmic bias against disadvantaged groups. We evaluated both the performance and fairness of several ML algorithms across diverse caste and socioeconomic subgroups. Methods: We used nationally representative data from India to develop ML models predicting outpatient care, hospitalization, and unmet healthcare need among older adults. We trained logistic regression, random forest, XGBoost, and LightGBM models using demographic, social, and health-related predictors. Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance. We assessed model performance using AUROC and evaluated fairness across caste and income subgroups. Fairness strategies included removing sensitive features (neutral models) and training stratified models within subgroups. We used SHapley Additive exPlanations (SHAP) to identify the most influential predictors across outcomes. Results: Among 55,962 older adults in India, 53.4% had at least one outpatient visit, 6.5% were hospitalized, and 7.9% reported unmet healthcare needs. Model performance varied across outcomes and groups. The best-performing model (LightGBM) achieved AUROCs of 0.78 for unmet need, 0.76 for outpatient care, and 0.70 for hospitalization. Predictive accuracy was higher in the lowest socioeconomic group (MPCE 1, AUROC = 0.79) than in the highest (MPCE 5, AUROC = 0.75). Removing sensitive predictors such as caste or income had minimal impact (change in AUROC <0.02), and subgroup-specific models led to mixed results, with only marginal improvement for Scheduled Castes (AUROC from 0.78 to 0.80). Including social and health determinants substantially improved model performance (e.g., hospitalization AUROC increased from 0.57 to 0.70). Top predictors included self-rated health, region, grip strength, and socioeconomic status. Balancing techniques like SMOTE did not meaningfully enhance performance. Conclusions: Machine learning models can effectively predict healthcare use and unmet needs among older adults in India. Incorporating social and health determinants improves model accuracy, but eliminating bias requires structural changes beyond technical adjustments. Fairness-aware model development and deployment are essential to ensure predictive tools contribute to more equitable healthcare systems.
What is already known on this topic: Machine learning (ML) has shown promise in predicting healthcare use and unmet need, particularly in high-income settings. Structural inequalities such as caste and income may influence healthcare access, but few ML studies have evaluated how these social factors affect model performance. Fairness concerns in ML are increasingly recognized, yet methods to assess or address them in low- and middle-income country (LMIC) settings remain limited. What this study adds: This study evaluates ML model performance across caste and income subgroups using nationally representative data from older adults in India. It shows that model accuracy varies by subgroup, with better performance among Scheduled Castes and lower-income groups for predicting unmet need. Fairness interventions such as removing sensitive features or training stratified models offer limited benefit and do not fully resolve performance disparities. SHAP analysis identifies social and health determinants, especially self-rated health, caste, region, and income, as key drivers of predictions. How this study might affect research, practice or policy: This study encourages routine subgroup evaluation to ensure ML models do not exacerbate existing health inequities. It challenges the assumption that removing sensitive variables like caste or income improves fairness, emphasizing the need to address structural drivers directly. It supports the integration of social determinants into model development to enhance equity, transparency, and relevance in public health applications.
PLOS Digital Health · 2025-11-26
article · Open access · Senior author
Machine learning (ML) models are increasingly applied to predict body mass index (BMI) and related outcomes, yet their fairness across socioeconomic and caste groups remains uncertain, particularly in contexts of structural inequality. Using nationally representative data from more than 55,000 adults aged 45 years and older in the Longitudinal Ageing Study in India (LASI), we evaluated the accuracy and fairness of multiple ML algorithms (Random Forest, XGBoost, Gradient Boosting, LightGBM, Deep Neural Networks, and Deep Cross Networks) alongside logistic regression for predicting underweight, overweight, and central adiposity. Models were trained on 80% of the data and tested on 20%, with performance assessed using AUROC, accuracy, sensitivity, specificity, and precision. Fairness was evaluated through subgroup analyses across socioeconomic and caste groups and equity-based metrics such as Equalized Odds and Demographic Parity. Feature importance was examined using SHAP values, and bias-mitigation methods were implemented at pre-processing, in-processing, and post-processing stages. Tree-based models, particularly LightGBM and Gradient Boosting, achieved the highest AUROC values (0.79-0.84). Incorporating socioeconomic and health-related variables improved prediction, but fairness gaps persisted: performance declined for scheduled tribes and lower socioeconomic groups. SHAP analyses identified grip strength, gender, and residence as key drivers of prediction differences. Among mitigation strategies, Reject Option Classification and Equalized Odds Post-processing moderately reduced subgroup disparities but sometimes decreased overall performance, whereas other approaches yielded minimal gains. ML models can effectively predict obesity and adiposity risk in India, but addressing bias is essential for equitable application. Continued refinement of fairness-aware ML methods is needed to support inclusive and effective public-health decision-making.
EClinicalMedicine · 2025-09-17 · 2 citations
article · Open access
Background: Estimates of maternal mortality are important for informing policy and resource allocation, both globally and for individual countries, and to track progress towards Sustainable Development Goals. The Global Maternal Health (GMatH) model was developed for policy analysis and produces global and country-level estimates of maternal mortality. Estimates are also produced by models from the United Nations (UN) and Global Burden of Disease (GBD). Methods: We compared country-level estimates for 2020 of maternal deaths and the maternal mortality ratio (MMR) across the UN (v2023), GBD (v2021), and GMatH (v2023) models. We summarized the differences, assessed model convergence, and characterized the available empirical mortality data for countries with large differences to shed light on potential reasons for these differences. Findings: On average, the GMatH estimates of country-level maternal deaths in 2020 were 272 larger (43% higher) than the UN estimates, and 728 larger (49% higher) than the GBD estimates. Country-level MMRs were on average 22.3 higher (19% higher) than the UN estimates and 48.1 higher (22% higher) than the GBD estimates. Overall, 87.9% of the UN country-level MMR estimates were convergent with the GMatH model, and 82.8% of the GBD MMR estimates were convergent, but large differences were found for some countries. Among countries with the largest differences across models, survey-based estimates of the pregnancy mortality ratio were usually the only empirical mortality data available. Interpretation: Although estimates of maternal mortality are similar across the GMatH, UN, and GBD models for most countries, there are also large differences. Our structural modelling approach leverages multiple types of data across the reproductive life course, including pregnancy mortality ratios, allowing for more robust estimation of maternal health indicators. Comparing results across models helps to build confidence in estimates where they are similar and sheds light on potential reasons for differences where they diverge to help refine estimates and guide policies to reduce maternal mortality. Funding: John D. and Catherine T. MacArthur Foundation, 10-97002-000-INP.
medRxiv · 2025-07-07 · 1 citation
preprint · Open access · Senior author
Background: Machine learning (ML) models are widely used to predict body mass index (BMI), yet their fairness across socioeconomic and caste groups remains uncertain, especially in countries with structural inequalities. This study evaluated the accuracy and fairness of ML models in predicting underweight, overweight, and central adiposity, examined the impact of socioeconomic and household factors, identified key predictive features, and assessed the effect of bias mitigation techniques on model performance. Methods: This study analysed data from the nationally representative Longitudinal Ageing Study in India (LASI), covering over 55,000 individuals aged 45 and older. We applied ML models (Random Forest, XGBoost, Gradient Boosting, LightGBM, DNN, DCN) alongside logistic regression. Models were trained on 80% of the data and tested on 20%, and evaluated using AUROC, accuracy, sensitivity, specificity, and precision. Fairness assessment included subgroup analyses across socioeconomic and caste groups and equity-based fairness metrics (e.g. Equalized Odds, Demographic Parity). Feature importance was examined using SHAP values. Bias mitigation techniques were applied at three stages: pre-processing (Disparate Impact Remover, Reweighting), in-processing (Exponentiated Gradient Reduction), and post-processing (Calibrated Equalized Odds, Reject Option Classification). Prediction density analysis assessed class separability across subgroups. Results: Tree-based models, especially LightGBM and Gradient Boosting, along with logistic regression, consistently delivered the highest AUROC scores in predicting underweight, overweight, and high waist circumference outcomes (AUROC = 0.79-0.84). Incorporating socioeconomic and health-related variables gradually enhanced model performance; for example, the AUROC for underweight prediction increased from 0.74 to 0.78. However, our analysis revealed notable fairness issues: models performed worse for scheduled tribes and lower socioeconomic groups, as evidenced by reduced sensitivity and specificity in these subgroups. Feature importance analysis using SHAP values indicated that variables such as grip strength, gender, and residence were the key drivers of prediction differences; specifically, lower grip strength and rural residence were linked to underweight, whereas higher grip strength, urban residence, and female gender were associated with overweight and central adiposity. Regarding bias mitigation, techniques like Reject Option Classification and Equalized Odds Post-processing showed some potential for reducing subgroup disparities by aligning the performance of low- and high-performing groups. Nevertheless, these adjustments sometimes came with trade-offs, and other methods, such as Exponentiated Gradient Reduction and Adversarial Debiasing, resulted in substantial declines in overall performance. While approaches like Disparate Impact Remover, Reweighting, and the Stratified Subgroup Best Model produced only modest changes relative to the unmitigated model, our findings highlight persistent fairness challenges. Conclusions: ML models can effectively predict obesity and adiposity risks in India, but addressing biases is critical for equitable application. Further refinement of fairness-aware ML approaches in public health is needed, particularly in the context of India’s diverse population, to support more inclusive and effective policy decisions.
Author summary: India now faces the paradox of widespread undernutrition alongside a rising tide of obesity among its older population. We asked whether state-of-the-art machine-learning models could accurately identify individuals at highest risk of underweight, overweight or obesity, and central adiposity while treating all social groups equitably. Using nationally representative data on more than 55,000 adults aged 45 years and above, we compared gradient-boosted decision trees, random forests, logistic regression, and other approaches with conventional regression techniques. Overall, the modern algorithms produced the strongest predictions. Yet a closer look revealed systematic shortfalls for scheduled tribes, scheduled castes, and the lowest income quintile, even when the models achieved excellent accuracy in the population as a whole. We then applied several well-established bias-mitigation strategies, such as re-weighting the training data and post-processing the decision thresholds. These interventions reduced the performance gap for disadvantaged groups, albeit at a modest cost to overall accuracy. By combining careful fairness audits with Shapley-based interpretation of feature importance, we illuminate how socioeconomic and caste-related factors shape both nutritional risk and prediction error. Our findings underscore that fair, trustworthy decision support systems in public health must be designed explicitly with equity objectives, rather than assuming that technical excellence alone will guarantee just outcomes.
Evaluation of performance of generative large language models for stroke care
npj Digital Medicine · 2025-07-29 · 10 citations
article · Open access · Senior author
Stroke is a leading cause of global morbidity and mortality, disproportionately impacting lower socioeconomic groups. In this study, we evaluated three generative large language models (LLMs), GPT, Claude, and Gemini, across four stages of stroke care: prevention, diagnosis, treatment, and rehabilitation. We applied three prompt engineering techniques, Zero-Shot Learning (ZSL), Chain of Thought (COT), and Talking Out Your Thoughts (TOT), to realistic stroke scenarios. Clinical experts assessed the outputs across five domains: (1) accuracy; (2) hallucinations; (3) specificity; (4) empathy; and (5) actionability, based on clinical competency benchmarks. Overall, the LLMs demonstrated suboptimal performance with inconsistent scores across domains. Each prompt engineering method showed strengths in specific areas: TOT performed well in empathy and actionability, COT was strong in structured reasoning during diagnosis, and ZSL provided concise, accurate responses with fewer hallucinations, especially in the treatment stage. However, none consistently met high clinical standards across all stroke care stages.
medRxiv · 2025-10-28 · 2 citations
preprint · Open access · Senior author
Economic evaluations of artificial intelligence (AI) in healthcare are expanding rapidly, yet the underlying costing methods remain heterogeneous and frequently incomplete for health technology assessment (HTA) and policy decision-making. In our systematic review of 55 studies published between 2010 and 2025, we found that fewer than half of the studies reported explicit costing methods; most pricing analyses failed to describe the basis of fees, subscription terms, or duration of coverage; and few analyses distinguished between average and incremental costs or accounted for economies of scale. Lifecycle expenditures, including development, validation, integration, maintenance, retraining, and decommissioning, were largely omitted, while electricity consumption, data hosting, and cloud infrastructure costs were almost never considered. Sensitivity analysis was the exception rather than the norm, and reporting of cost offsets such as reduced hospital admissions or workforce time savings was inconsistent. To address these gaps, we propose a 20-item reporting checklist to standardise the costing and pricing of AI interventions. The checklist complements existing HTA frameworks while capturing features unique to AI, such as continuous retraining, reliance on data infrastructure, and recurrent maintenance. We also introduce an AI Costing Inventory and Calculator that operationalises a lifecycle approach, enabling systematic recording of resource use, unit costs, inflation adjustments, and total and incremental costs, including offsets. These tools extend the emerging CHEERS-AI reporting framework by embedding a lifecycle perspective into costing, thereby enabling consistent estimation of resource and cost components and strengthening the methodological foundations of AI economic evaluation for policy use.
Journal of Medical Internet Research · 2025-05-30
article · Open access
[No abstract available]
The Lancet Oncology · 2025-06-03
article · Senior author
Frequent coauthors
- 296 shared
Till Bärnighausen
University Hospital Heidelberg
- 172 shared
Jennifer Manne‐Goehler
Brigham and Women's Hospital
- 154 shared
Justine Davies
- 127 shared
Pascal Geldsetzer
Stanford University
- 100 shared
Michaela Theilmann
Heidelberg University
- 92 shared
Sebastián Vollmer
- 88 shared
John Tayu Lee
Australian National University
- 83 shared
Felícia Marie Knaul
University of Miami
Education
- 1994
Ph.D., Public Health
Harvard University
- 1991
Other, Public Health
Harvard University
- 1986
B.S., Medicine
University of London