Anthony Gitter

Associate Professor; Investigator, Morgridge Institute for Research · Verified

University of Wisconsin-Madison · Biostatistics and Medical Informatics

Active 1967–2026

h-index: 32
Citations: 7.2k
Papers: 196 (90 in the last 5 years)
Funding: $3.0M
About

Anthony Gitter is an associate professor in the Department of Biostatistics and Medical Informatics and affiliate faculty in the Department of Computer Sciences at the University of Wisconsin-Madison. He also holds the Jeanne M. Rowe Chair at the Morgridge Institute for Research in the John W. and Jeanne M. Rowe Center for Research in Virology and Research Computing. Additionally, he is an affiliate of the Data Science Institute and the Center for Genomic Science Innovation, and a member of the UW Carbone Cancer Center Cancer Genetic and Epigenetic Mechanisms Scientific Program. His research group focuses on using network modeling to integrate genomic, transcriptomic, and proteomic data to provide a cohesive view of biological processes, with a special emphasis on virology and oncology. They also explore machine learning applications in biochemistry, including computationally-guided chemical screening and protein engineering. Anthony Gitter received his Ph.D. in Computer Science from Carnegie Mellon University and completed a joint postdoctoral fellowship at Microsoft Research New England and the Massachusetts Institute of Technology.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Data Mining
  • Genetics
  • Software engineering
  • Biology
  • Computational biology
  • Database
  • Statistics
  • Mathematics
  • Mathematics education
  • Psychology

Selected publications

  • A Computational Community Blind Challenge on Pan-Coronavirus Drug Discovery Data

    ChemRxiv · 2026-01-06

    article · Open access

    Computational blind challenges offer critical, unbiased assessment opportunities to assess and accelerate scientific progress, as demonstrated by a breadth of breakthroughs over the last decade. We report the outcomes and key insights from an open science community blind challenge focused on computational methods in drug discovery, using lead optimization data from the AI-driven Structure-enabled Antiviral Platform (ASAP) Discovery Consortium’s pan-coronavirus antiviral discovery program, in partnership with Polaris and the OpenADMET project. This collaborative initiative invited global participants from both academia and industry to develop and apply computational methods to predict the biochemical potency and crystallographic ligand poses of small molecules against key coronavirus targets, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) main protease (Mpro), as well as multiple ADMET assay endpoints, using previously undisclosed comprehensive experimental drug discovery datasets as benchmarks. By evaluating submissions across multiple tasks and compounds, we established performance leaderboards and conducted meta-analyses to assess methodological strengths, common pitfalls, and areas for improvement. This analysis provides a foundation for best practices in real-world machine learning evaluation, grounded in community-driven benchmarking. We also highlight how next-generation platforms, such as Polaris, enable rigorous challenge design, embedded evaluation frameworks, and broad community engagement. This paper reports the collective findings of the challenge, offering a high-level overview of the data, evaluation infrastructure, and top-performing strategies. We further provide context and support for the accompanying papers authored by the challenge participants in this special issue, which explore individual approaches in greater depth.
    Together, these contributions aim to advance reproducible, trustworthy, and high-impact computational methods in drug discovery, and to explore best practices and pitfalls in future blind challenge design and execution, including planned initiatives for the OpenADMET project.

  • HaihuaWang-hub/2020-workflows-paper: 2020 workflow paper

    Zenodo (CERN European Organization for Nuclear Research) · 2026-01-20

    other · Open access

    2020 workflow paper

  • A Computational Community Blind Challenge on Pan-Coronavirus Drug Discovery Data.

    Apollo (University of Cambridge) · 2026-03-23

    article · Open access

    Computational blind challenges offer critical, unbiased opportunities to assess and accelerate scientific progress, as demonstrated by a breadth of breakthroughs over the past decade. We report the outcomes and key insights from an open science community blind challenge focused on computational methods in drug discovery, using lead optimization data from the AI-driven Structure-enabled Antiviral Platform Discovery Consortium's pan-coronavirus antiviral discovery program, in partnership with Polaris and the OpenADMET project. This collaborative initiative invited global participants from both academia and industry to develop and apply computational methods to predict the biochemical potency and crystallographic ligand poses of small molecules against key coronavirus targets, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) main protease (Mpro), as well as multiple ADMET assay end points, using previously undisclosed comprehensive experimental drug discovery data sets as benchmarks. By evaluating submissions across multiple tasks and compounds, we established performance leaderboards and conducted meta-analyses to assess methodological strengths, common pitfalls, and areas for improvement. This analysis provides a foundation for best practices in real-world machine learning evaluation, grounded in community-driven benchmarking. We also highlight how next-generation platforms, such as Polaris, enable rigorous challenge design, embedded evaluation frameworks, and broad community engagement. This paper reports the collective findings of the challenge, offering a high-level overview of the data, evaluation infrastructure, and top-performing strategies. We further provide context and support for the accompanying papers authored by the challenge participants in this special issue, which explore individual approaches in greater depth. 
    Together, these contributions aim to advance reproducible, trustworthy, and high-impact computational methods in drug discovery, and to explore best practices and pitfalls in future blind challenge design and execution, including planned initiatives for the OpenADMET project.

  • slolab/aac-manuscript: First preprint version

    Open MIND · 2026-02-16

    other

    The first version of the preprint for the AAC manuscript.

  • seandavi/awesome-single-cell: 2026-02-02

    Zenodo (CERN European Organization for Nuclear Research) · 2026-02-02

    other · Open access

    Monthly release for 2026-02-02. This release was automatically created by GitHub Actions. This release triggers an update to the Zenodo record: https://zenodo.org/records/1169173

  • Chemical Language Model Linker: Blending Text and Molecules with Modular Adapters

    Journal of Chemical Information and Modeling · 2025-08-21 · 6 citations

    article · Open access · Senior author · Corresponding

    The development of large language models and multimodal models has enabled the appealing idea of generating novel molecules from text descriptions. Generative modeling would shift the paradigm from relying on large-scale chemical screening to find molecules with desired properties to directly generating those molecules. However, multimodal models combining text and molecules are often trained from scratch, without leveraging existing high-quality pretrained models. Training from scratch consumes more computational resources and prohibits model scaling. In contrast, we propose a lightweight adapter-based strategy named Chemical Language Model Linker (ChemLML). ChemLML blends the two single domain models and obtains conditional molecular generation from text descriptions while still operating in the specialized embedding spaces of the molecular domain. ChemLML can tailor diverse pretrained text models for molecule generation by training relatively few adapter parameters. We find that the choice of molecular representation used within ChemLML, SMILES versus SELFIES, has a strong influence on conditional molecular generation performance. SMILES is often preferable despite not guaranteeing valid molecules. We raise issues in using the entire PubChem data set of molecules and their associated descriptions for evaluating molecule generation and provide a filtered version of the data set as a generation test set. To demonstrate how ChemLML could be used in practice, we generate candidate protein inhibitors and use docking to assess their quality and also generate candidate membrane permeable molecules.

  • Protein Set Transformer: a protein-based genome language model to power high-diversity viromics

    Nature Communications · 2025-11-23 · 2 citations

    article · Open access

    Exponential increases in microbial and viral genomic data demand transformational advances in scalable, generalizable frameworks for their interpretation. Standard homology-based functional analyses are hindered by the rapid divergence of microbial and especially viral genomes and proteins that significantly decreases the volume of usable data. Here, we present Protein Set Transformer (PST), a protein-based genome language model that models genomes as sets of proteins without considering sparsely available functional labels. Trained on >100k viruses, PST outperforms other homology- and language model-based approaches for relating viral genomes based on shared protein content. Further, PST demonstrates protein structural and functional awareness by clustering capsid-fold-containing proteins with known capsid proteins and uniquely clustering late gene proteins within related viruses. Our data establish PST as a valuable method for diverse viral genomics, ecology, and evolutionary applications. We posit that the PST framework can be a foundation model for microbial genomics when trained on suitable data.

  • Responsible Biodesign Workshop: AI, Protein Design, and the Biosecurity Landscape – Recommended Actions

    2025-06-04

    preprint · Open access

    This report presents Recommended Actions from the January 2025 Responsible Biodesign Workshop, which convened leading experts across AI-enabled biomolecular design and biosecurity policy. Building on existing community commitments for the Responsible Development of AI for Protein Design, the Recommended Actions aim to guide scientists, policy practitioners, and funding bodies in ensuring safe and beneficial development of AI-enabled biomolecular design tools. The Recommended Actions focus on advancing AI-Resilient nucleic acid synthesis security screening, assessing the risk-benefit landscape of biomolecular design capabilities, and building fora for sustained engagement between scientists and policy practitioners.

  • Biophysics-based protein language models for protein engineering

    Nature Methods · 2025-09-01 · 28 citations

    article · Open access

    Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose mutational effect transfer learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure and energetics. We fine-tune METL on experimental sequence-function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL's ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.

Frequent coauthors

  • Casey S. Greene

    152 shared
  • Halie M. Rando

    Smith College

    88 shared
  • Simina M. Boca

    AstraZeneca (Brazil)

    59 shared
  • Alexandra Lee

    59 shared
  • Ronan Lordan

    University of Pennsylvania

    56 shared
  • Nils Wellhausen

    University of Pennsylvania

    56 shared
  • Shengchao Liu

    47 shared
  • Moayad Alnammi

    King Fahd University of Petroleum and Minerals

    37 shared

Labs

  • Gitter Lab · PI

    Computational biology research group at the University of Wisconsin-Madison and Morgridge Institute

Education

  • Ph.D., Biostatistics

    University of Wisconsin–Madison

    2007
  • M.S., Biostatistics

    University of Wisconsin–Madison

    2003
  • B.S., Mathematics

    University of Wisconsin–Madison

    2001