
Carl F Kesselman
William H. Keck Professor of Engineering and Professor of Industrial and Systems Engineering, Computer Science, Population and Public Health Sciences, and Biomedical Sciences · University of Southern California · Daniel J. Epstein Department of Industrial and Systems Engineering
Active 1984–2026
About
Carl Kesselman leads ISI's Informatics Systems Research division. The division was created to understand how to build informatics systems that can help tackle the hardest problems of great societal impact. Its work spans grid computing, information security, service-oriented architectures, sociotechnical systems, and reproducibility.
Research topics
- Computer Science
- Biology
- Database
- Data science
- Data Mining
- World Wide Web
- Programming language
- Genetics
- Computational biology
Selected publications
Reproducibility Beyond Artifacts: Interactional Support for Collaborative Machine Learning
2026-04-13
article · Senior author
From Data to Decision: Data-Centric Infrastructure for Reproducible ML in Collaborative eScience
2025-09-15 · 1 citation
article
Reproducibility remains a central challenge in machine learning (ML), especially in collaborative eScience projects where teams iterate over data, features, and models. Current ML workflows are often dynamic yet fragmented, relying on informal data sharing, ad hoc scripts, and loosely connected tools. This fragmentation impedes transparency, reproducibility, and the adaptability of experiments over time. This paper introduces a data-centric framework for lifecycle-aware reproducibility, centered around six structured artifacts: Dataset, Feature, Workflow, Execution, Asset, and Controlled Vocabulary. These artifacts formalize the relationships between data, code, and decisions, enabling ML experiments to be versioned, interpretable, and traceable over time. The approach is demonstrated through a clinical ML use case of glaucoma detection, illustrating how the system supports iterative exploration, improves reproducibility, and preserves the provenance of collaborative decisions across the ML lifecycle.
IHMValidation: Assessment of Integrative Structure Models Deposited to the Protein Data Bank
Journal of Molecular Biology · 2025-12-01
article · Open access
PDB-IHM is a branch of the Protein Data Bank (PDB), a Worldwide Protein Data Bank (wwPDB) Core Archive, that expands its scope by allowing for additional biomolecular structure representations and types of experimental information (i.e., integrative/hybrid structure models). As of October 2025, PDB-IHM contained 374 entries, benefitting from multi-scale and multi-state representations and 17 types of experimental data. These structure models are assigned PDB accession codes and are archived alongside other experimental structures in the PDB. Rigorous interpretation of a structure model requires assessment of underlying data quality, consistency with the input data, and estimates of positional uncertainty of its components. Herein, we present the IHMValidation pipeline (https://validate.pdb-ihm.org; https://github.com/salilab/IHMValidation) based on recommendations from the wwPDB Integrative Methods Task Force plus the small-angle scattering (SAS), chemical crosslinking mass spectrometry (crosslinking-MS), and cryo-electron microscopy and tomography (3DEM) communities. The IHMValidation report (available in both PDF and HTML formats) comprises six sections: (i) overview; (ii) model details; (iii) data quality assessments; (iv) local geometry assessments (i.e., model quality); (v) fit of the model to the data used to generate it; and (vi) fit of the model to the data used for validation. Future expansions of the IHMValidation pipeline will: (i) reflect recommendations coming from additional experimental communities, including Förster resonance energy transfer (FRET) and hydrogen/deuterium exchange MS (HDX-MS); (ii) include other validation criteria, such as Bayesian likelihoods for the data; and (iii) represent estimates of structure model uncertainty based on the variation among alternative models satisfying input data.
medRxiv · 2025-08-24
preprint · Open access
Purpose: To compare the performance of a foundation model and a supervised learning-based model for detecting referable glaucoma from fundus photographs. Design: Evaluation of diagnostic technology. Participants: 6,116 participants from the Los Angeles County Department of Health Services Teleretinal Screening Program. Methods: Fundus photographs were labeled for referable glaucoma (cup-to-disc ratio ≥ 0.6) by certified optometrists. Four deep learning models were trained on cropped and uncropped images (Training N = 8,996; Validation N = 3,002) using two architectures: a vision transformer with self-supervised pretraining on fundus photographs (RETFound) and a convolutional neural network (VGG-19). Models were evaluated on a held-out test set (N = 1,000) labeled by glaucoma specialists and an external test set (N = 300) from University of Southern California clinics. Performance was assessed while varying training set size and stratifying by demographic factors. xRAI was used for saliency mapping. Main Outcome Measures: Area under the receiver operating characteristic curve (AUC-ROC) and threshold-specific metrics. Results: The cropped image VGG-19 model achieved the highest AUC-ROC (0.924 [0.907-0.940]), which was comparable (p = 0.07) to the cropped image RETFound model (0.911 [0.892-0.930]), which achieved the highest Youden-optimal performance (sensitivity 82.6%, specificity 88.2%) and F1 score (0.801). Cropped image models outperformed their uncropped counterparts within each architecture (p < 0.001 for AUC-ROC comparisons). RETFound models had a performance advantage when trained on smaller datasets (N < 2000 images), and the uncropped image RETFound model performed best on external data (p < 0.001 for AUC-ROC comparisons). The cropped image RETFound model performed consistently across ethnic groups (p = 0.20), while the others did not (p < 0.04); performance did not vary by age or gender. Saliency maps for both architectures consistently included the optic nerve. Conclusion: While both RETFound and VGG-19 models performed well for classification of referable glaucoma, foundation models may be preferable when training data is limited and when domain shift is expected. Training models using images cropped to the region of the optic nerve improves performance regardless of architecture but may reduce model generalizability.
Ophthalmology Science · 2025-02-25 · 8 citations
article · Open access
Purpose: Develop and test a deep learning (DL) algorithm for detecting referable glaucoma. Design: Retrospective cohort study. Participants: A total of 6116 patients from the Los Angeles County (LAC) Department of Health Services (DHS) were included. Methods: Fundus photographs and patient-level labels of referable glaucoma (cup-to-disc ratio ≥0.6) provided by 21 certified optometrists. A DL algorithm based on the Visual Geometry Group-19 architecture was trained using patient-level labels generalized to images from both eyes. Area under the receiver operating curve (AUROC), sensitivity, and specificity were calculated to assess algorithm performance using an independent test set that was also graded by 13 clinicians with 0 to 10 years of experience. Algorithm performance was tested using reference labels provided by either LAC DHS optometrists or an expert panel of 3 glaucoma specialists. Main Outcome Measures: Area under the receiver operating curve, sensitivity, and specificity. Results: The DL algorithm was trained using 12 998 images from 5616 patients (2086 referable glaucoma, 3530 nonglaucoma). In this data set, the mean age was 56.8 ± 10.5 years with 54.8% women, 68.2% Latinos, 8.9% Blacks, 6.0% Asians, and 2.7% Whites. One thousand images from 500 patients (250 referable glaucoma, 250 nonglaucoma) with similar demographics (P ≥ 0.57) were used to test the algorithm. Algorithm performance matched or exceeded that of all independent clinician graders in detecting patient-level referable glaucoma based on LAC DHS optometrist (AUROC = 0.92) or expert panel (AUROC = 0.93) reference labels. Clinician grader sensitivity (range, 0.33–0.99) and specificity (range, 0.68–0.98) ranged widely and did not correlate with years of experience (P ≥ 0.49). Algorithm performance (AUROC = 0.93) also matched or exceeded the sensitivity (range, 0.78–1.00) and specificity (range, 0.32–0.87) of 6 certified LAC DHS optometrists in the subsets of the test data set they graded. Conclusions: A DL algorithm for detecting referable glaucoma trained using patient-level data provided by certified LAC DHS optometrists approximates or exceeds performance by ophthalmologists and optometrists, who exhibit variable sensitivity and specificity unrelated to experience level. Implementation of this algorithm in screening workflows could help reallocate resources and provide more reproducible and timely glaucoma care. Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
Ophthalmology Science · 2025-11-15
article · Open access
Purpose: To compare the performance of a vision transformer-based foundation model (RETFound) and a supervised convolutional neural network (VGG-19) for detecting referable glaucoma from fundus photographs. Design: An evaluation of diagnostic technology. Participants: Six thousand one hundred sixteen participants from the Los Angeles County Department of Health Services Teleretinal Screening Program. Methods: Fundus photographs were labeled for referable glaucoma (cup-to-disc ratio ≥0.6) by certified optometrists. Four deep learning models were trained on cropped and uncropped images (training N = 8996; validation N = 3002) using 2 architectures: RETFound, a vision transformer with self-supervised pretraining on fundus photographs, and VGG-19. Models were evaluated on a held-out test set (N = 1000) labeled by glaucoma specialists and an external test set (N = 300) from University of Southern California clinics. Performance was assessed while varying training set size and stratifying by demographic factors. xRAI was used for saliency mapping. Main Outcome Measures: Area under the receiver operating characteristic curve (AUC-ROC) and threshold-specific metrics. Results: < 0.04). Performance did not vary by age or gender. Saliency maps for both architectures consistently included the optic nerve. Conclusions: Although both RETFound and VGG-19 models performed well for classification of referable glaucoma, foundation models may be preferable when training data are limited and when domain shift is expected. Training models using images cropped to the region of the optic nerve improves performance regardless of architecture but may reduce model generalizability. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PDB-IHM: A System for Deposition, Curation, Validation, and Dissemination of Integrative Structures
Journal of Molecular Biology · 2025-01-27 · 11 citations
article · Open access
- Structures determined by integrative/hybrid methods (IHM) are archived alongside experimental structures in PDB as PDB-IHM.
- These structures are processed by PDB-IHM in parallel to the wwPDB OneDep system and disseminated synchronously with PDB.
- Integrative structures are assigned PDB accession codes and digital object identifiers.
- PDB-IHM provides an API for depositing collections of structures and the workflows include generation of validation reports.
- Integrative structures can be accessed from the PDB archive and from the PDB-IHM website.
Structures of many large biomolecular assemblies are now being determined using integrative approaches. In these approaches, information derived from multiple experimental and computational methods is combined to compute three-dimensional structures of multi-protein complexes and other macromolecular machines. A standalone prototype data resource for integrative structures called PDB-Dev was built, based on recommendations of the Integrative and Hybrid Methods (IHM) Task Force of the Worldwide Protein Data Bank (wwPDB). This effort included developing data standards and software tools for collecting, curating, validating, visualizing, archiving, and disseminating integrative structures that span diverse spatiotemporal scales and conformational states. Mechanisms have been created to validate integrative structures based on the experimental data underpinning them. Building upon this foundational framework, PDB-Dev has been further expanded to handle large dynamic macromolecular systems and integrative structures that combine, for example, experimental restraints with atomic coordinates computed by machine learning algorithms. Data standards and supporting tools have also been extended to capture information about biomolecular dynamics, such as conformational transitions and related kinetic data derived from biophysical methods. Recently, PDB-Dev was unified with the PDB archive and rebranded as PDB-IHM (pdb-ihm.org), further promoting FAIR (Findable, Accessible, Interoperable, and Reusable) principles of data stewardship for integrative structural biology.
Acute Angle Closure Incidence in a Large Countywide Safety Net Teleretinal Screening Program
JAMA Ophthalmology · 2025-09-18 · 1 citation
article · Open access
Importance: Pharmacologic pupillary dilation is vital for eye disease screening but is often avoided due to concerns about triggering acute angle closure (AAC), a sight-threatening ophthalmic emergency. Objective: To assess AAC incidence after dilation and validate the use of International Classification of Diseases (ICD) codes for identifying AAC cases. Design, Setting, and Participants: This retrospective cohort study used data from a primary care-based teleretinal diabetic retinopathy screening (TDRS) program. Eligible participants were Los Angeles County Department of Health Services patients who underwent teleretinal screening by dilated fundus photography between August 23, 2013, and March 1, 2024. Potential AAC cases were identified using ICD codes for angle closure, including AAC glaucoma, primary angle-closure glaucoma, and anatomical narrow angle, within 3 months of dilation. All urgent care, emergency department, and eye clinic encounters within the next calendar day after TDRS and encounters with Current Procedural Terminology codes for iridectomy/iridotomy or lens extraction within 14 calendar days of TDRS were also identified. Manual medical record review was conducted to verify AAC cases and extract clinical information. Data were analyzed from July 2024 to June 2025. Exposures: Dilation with tropicamide, 1.0%, or tropicamide, 0.5%. Main Outcomes and Measures: Cumulative incidence of AAC after dilation. Results: Of 84 008 included patients, 46 255 (55.1%) were female, and the mean (SD) age was 55.4 (10.7) years. There were a total of 168 796 dilations, with a mean (SD) of 2.01 (1.50) dilations per patient. Manual medical record review confirmed 4 AAC cases after dilation: 3 coded as AAC glaucoma and 1 as anatomical narrow angle. The AAC risk was 2.4 (95% CI, 0.05-4.69) per 100 000 dilations (0.002%) or 4.8 (95% CI, 0.10-9.43) per 100 000 patients (0.005%). All 4 AACs occurred in female patients, had narrow angles in the nonpresenting eye on gonioscopy, and presented within 1 day with AAC symptoms, including eye pain and blurry vision. Conclusions and Relevance: AAC risk was less than 1 in 40 000 per dilation in a high-volume TDRS program serving a diverse safety net population, supporting the overall safety of dilation in this setting. Further discussion about AAC risk as a contraindication to dilation is warranted.
2025-09-15
article · 1st author · Corresponding
Artificial Intelligence and Machine Learning have emerged as a promising approach to scientific investigations, but there is a persistent shortage of high-quality, properly annotated datasets suitable for training models. Here, we outline some of the widely reported characteristics for making AI-ready data and compare that with FAIR data. We discuss the limitations of traditional data repositories and the challenges associated with establishing a data repository that can grow with scientific communities and accommodate rapid evolution in research priorities. Finally, we introduce the SCALE principles for repository design that offer a proven framework for creating sustainable, scalable data repositories that can adapt to new data models and methodologies, ensuring that software infrastructure serves research needs rather than constraining them.
IHMValidation: Assessment of Integrative Structure Models Deposited to the Protein Data Bank
SSRN Electronic Journal · 2025-01-01
preprint · Open access
Recent grants
NIH · $5.8M · 2016
NIH · $20.5M · 2013
Collaborative Research: Personalizing Recommendations in a Large-scale Education Analytics Pipeline
NSF · $100k · 2015–2017
Dynamic mapping of the complete synaptome using recombinant probes
NIH · $9.7M · 2014–2020
Bio-Informatics Research Network Coordinating Center (BIRN-CC)
NIH · $8.3M · 2008–2014
Frequent coauthors
- 188 shared
Ian Foster
University of Illinois Chicago
- 73 shared
Ewa Deelman
- 54 shared
Ann Chervenak
- 46 shared
Karl Czajkowski
University of Southern California
- 39 shared
Steven Tuecke
University of Chicago
- 36 shared
Robert Schuler
University of Southern California
- 30 shared
Laura Pearlman
- 26 shared
Gurmeet Singh
Indian Institute of Wheat and Barley Research
Education
- 1991
Ph.D., Computer Science
University of California Los Angeles
- 1984
Master of Science, Electrical Engineering
University of Southern California
- 1982
BSEE, Electrical Engineering
University at Buffalo - The State University of New York
Awards & honors
- Lovelace Medal from the British Computer Society
- Goode Memorial Award from the IEEE Computer Society