Jaideep Srivastava

Verified

University of Minnesota · Computer Science and Engineering

Active 1986–2026

h-index 51

Citations 19.2k

Papers 588 · 105 last 5y

Funding $124k

About

Jaideep Srivastava is a professor in the Department of Computer Science & Engineering at the University of Minnesota Twin Cities, where he also serves as Director of Undergraduate Studies for Data Science. He holds a B.Tech. in Computer Science from the Indian Institute of Technology Kanpur and an M.S. and Ph.D. in Computer Science from the University of California, Berkeley, and he joined the department in 1988 after completing his Ph.D. His research interests span databases, data mining, and multimedia systems, and his work in data science and machine learning focuses on developing predictive models and analyzing large-scale social networks. He is the co-founder and president of Ninja Metrics, Inc., a startup specializing in analytics for social systems and social media. He has served as Research Director and Chief Scientist at Qatar Computing Research Institute and has held industry positions including Chief Technology Officer at Persistent Systems, Director of Data Analytics at Yodlee, Inc., and data mining architect at Amazon. His honors include IEEE Fellow and the PAKDD Distinguished Contributions Award, and he has been involved in numerous research projects related to health, social influence, and large-scale social network analysis.

Research topics

Political Science
Computer Science
Computer Security
Business
Medicine
Psychiatry
Engineering
Human–computer interaction
Internet privacy
Physical therapy
Law
Internal medicine
Commerce
Pediatrics
Marketing
Advertising

Selected publications

Automated Auditing of Hospital Discharge Summaries for Care Transitions
ArXiv.org · 2026-04-07
article · Open access · Senior author
Incomplete or inconsistent discharge documentation is a primary driver of care fragmentation and avoidable readmissions. Despite its critical role in patient safety, auditing discharge summaries relies heavily on manual review and is difficult to scale. We propose an automated framework for large-scale auditing of discharge summaries using locally deployed Large Language Models (LLMs). Our approach operationalizes core transition-of-care requirements (such as follow-up instructions, medication history and changes, patient information, and clinical course) into a structured validation checklist of questions based on the DISCHARGED framework. Using adult inpatient summaries from the MIMIC-IV database, we utilize a privacy-preserving LLM to identify the presence, absence, or ambiguity of key documentation elements. This work demonstrates the feasibility of scalable, automated clinical auditing and provides a foundation for systematic quality improvement in electronic health record documentation.
BiMind: A Dual-Head Reasoning Model with Attention-Geometry Adapter for Incorrect Information Detection
ArXiv.org · 2026-04-07
article · Open access · Senior author
Incorrect information poses significant challenges by disrupting content veracity and integrity, yet most detection approaches struggle to jointly balance textual content verification with external knowledge modification under collapsed attention geometries. To address this issue, we propose a dual-head reasoning framework, BiMind, which disentangles content-internal reasoning from knowledge-augmented reasoning. In BiMind, we introduce three core innovations: (i) an attention geometry adapter that reshapes attention logits via token-conditioned offsets and mitigates attention collapse; (ii) a self-retrieval knowledge mechanism, which constructs an in-domain semantic memory through kNN retrieval and injects retrieved neighbors via feature-wise linear modulation; (iii) uncertainty-aware fusion strategies, including entropy-gated fusion and a trainable agreement head, stabilized by a symmetric Kullback-Leibler agreement regularizer. To quantify the knowledge contributions, we define a novel metric, Value-of-eXperience (VoX), to measure instance-wise logit gains from knowledge-augmented reasoning. Experimental results on public datasets demonstrate that our BiMind model outperforms advanced detection approaches and provides interpretable diagnostics on when and why knowledge matters.
Evaluating Reasoning-Based Scaffolds for Human-AI Co-Annotation: The ReasonAlign Annotation Protocol
arXiv (Cornell University) · 2026-03-22
article · Open access · Senior author
Human annotation is central to NLP evaluation, yet subjective tasks often exhibit substantial variability across annotators. While large language models (LLMs) can provide structured reasoning to support annotation, their influence on human annotation behavior remains unclear. We introduce ReasonAlign, a reasoning-based annotation scaffold that exposes LLM-generated explanations while withholding predicted labels. We frame this as a controlled study of how reasoning affects human annotation behavior, rather than a full evaluation of annotation accuracy. Using a two-pass protocol inspired by Delphi-style revision, annotators first label instances independently and then revise their decisions after viewing model-generated reasoning. We evaluate the approach on sentiment classification and opinion detection tasks, analyzing changes in inter-annotator agreement and revision behavior. To quantify these effects, we introduce the Annotator Effort Proxy (AEP), a metric capturing the proportion of labels revised after exposure to reasoning. Our results show that exposure to reasoning is associated with increased agreement alongside minimal revision, suggesting that reasoning primarily helps resolve ambiguous cases without inducing widespread changes. These findings provide insight into how reasoning explanations shape annotation consistency and highlight reasoning-based scaffolds as a practical mechanism for supporting human-AI annotation workflows.
Data Skeleton Learning: Scalable active clustering with sparse graph structures
Pattern Recognition · 2025-12-29
article · Senior author
The Psychology of Falsehood: A Human-Centric Survey of Misinformation Detection
2025-01-01
article · Open access
Arghodeep Nandi, Megha Sundriyal, Euna Mehnaz Khan, Jikai Sun, Emily K. Vraga, Jaideep Srivastava, Tanmoy Chakraborty. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
Time-Adaptive Machine Learning Models for Predicting the Severity of Heart Failure with Reduced Ejection Fraction
Diagnostics · 2025-03-13 · 5 citations
article · Open access · Senior author
Background: Heart failure with reduced ejection fraction is a complex condition that necessitates adaptive, patient-specific management strategies. This study aimed to evaluate the effectiveness of a time-adaptive machine learning model, the Passive-Aggressive classifier, in predicting heart failure with reduced ejection fraction severity and capturing individualized disease progression. Methods: A time-adaptive Passive-Aggressive classifier was employed, using clinical data and Brain Natriuretic Peptide levels as class designators for heart failure with reduced ejection fraction severity. The model was personalized for individual patients by sequentially incorporating clinical visit data from 0–9 visits. The model’s adaptability and effectiveness in capturing individual health trajectories were assessed using accuracy and reliability metrics as more data were added. Results: With the progressive introduction of patient-specific data, the model demonstrated significant improvements in predictive capabilities. By incorporating data from nine visits, significant gains in accuracy and reliability were achieved, with the One-Versus-Rest AUC increasing from 0.4884 with no personalization (zero visits) to 0.8253 (nine visits). This demonstrates the model’s ability to handle diverse patient presentations and the dynamic nature of disease progression. Conclusions: The findings show the potential of time-adaptive machine learning models, particularly the Passive-Aggressive classifier, in managing heart failure with reduced ejection fraction and other chronic diseases. By enabling precise, patient-specific predictions, these approaches support early detection, tailored interventions, and improved long-term outcomes. This study highlights the feasibility of integrating adaptive models into clinical workflows to enhance the management of heart failure with reduced ejection fraction and similar chronic conditions.
Status and opportunities of machine learning applications in obstructive sleep apnea: A narrative review
Computational and Structural Biotechnology Journal · 2025-01-01 · 5 citations
review · Open access
Background: Obstructive sleep apnea (OSA) is a prevalent and potentially severe sleep disorder characterized by repeated interruptions in breathing during sleep. Machine learning models have been increasingly applied in various aspects of OSA research, including diagnosis, treatment optimization, and developing biomarkers for endotypes and disease mechanisms. Objective: This narrative review evaluates the application of machine learning in OSA research, focusing on model performance, dataset characteristics, demographic representation, and validation strategies. We aim to identify trends and gaps to guide future research and improve clinical decision-making that leverages machine learning. Methods: This narrative review examines data extracted from 254 scientific publications published in the PubMed database between January 2018 and March 2023. Studies were categorized by machine learning applications, models, tasks, validation metrics, data sources, and demographics. Results: Our analysis revealed that most machine learning applications focused on OSA classification and diagnosis, utilizing various data sources such as polysomnography, electrocardiogram data, and wearable devices. We also found that study cohorts were predominantly overweight males, with an underrepresentation of women, younger obese adults, individuals over 60 years old, and diverse racial groups. Many studies had small sample sizes and limited use of robust model validation. Conclusion: Our findings highlight the need for more inclusive research approaches, starting with adequate data collection in terms of sample size and bias mitigation for better generalizability of machine learning models in OSA research. Addressing these demographic gaps and methodological opportunities is critical for ensuring more robust and equitable applications of artificial intelligence in healthcare.
Status and Opportunities of Machine Learning Applications in Obstructive Sleep Apnea: A Narrative Review
medRxiv · 2025-03-01 · 3 citations
review · Open access

Recent grants

EAGER: Collaborative Research: Some Assembly Required: Understanding the Emergence of Teams and Ecosystems of Teams
NSF · $124k · 2012–2016

Frequent coauthors

Ee‐Peng Lim
Singapore Management University
672 shared
P. Krishna Reddy
Martin College
651 shared
Hiroshi Motoda
Osaka University
650 shared
Kyu-Young Whang
Thammasat University
650 shared
Graham Williams
Australian National University
649 shared
Jian Pei
Duke University
649 shared
Huan Liu
648 shared
Enhong Chen
University of Science and Technology of China
648 shared

Labs

Jaideep Srivastava · PI

Education

PhD, Electrical Engineering and Computer Science
University of California Berkeley
1988
B Tech, Computer Science
Indian Institute of Technology Kanpur
1983

Awards & honors

2013: PAKDD Distinguished Contributions Award
2004: IEEE Fellow
2002: IBM Faculty Award
