Alexander Tropsha

KH Lee Distinguished Professor and Associate Dean · Verified

University of North Carolina at Chapel Hill · Toxicology

Active 1991–2025

h-index: 93
Citations: 36.5k
Papers: 558 (164 in the last 5 years)
Funding: $24.0M (1 active)
About

Alexander Tropsha is the KH Lee Distinguished Professor and Associate Dean at the University of North Carolina Eshelman School of Pharmacy. His major research area is Biomolecular Informatics, which involves understanding the relationships between molecular structures—both organic and macromolecular—and their properties, such as activity or function. He focuses on building validated and predictive quantitative models that relate molecular structure to biological function, utilizing statistical and machine learning approaches. These models are exploited to make verifiable predictions about the putative functions of untested molecules.

Research signals

Five dimensions sourced from public faculty and publication signals.

Research topics

  • Computer Science
  • Machine Learning
  • Medicine
  • Pharmacology
  • Bioinformatics
  • Virology
  • Data science
  • Biology

Selected publications

  • Protein–ligand data at scale to support machine learning

    Nature Reviews Chemistry · 2025-07-23 · 10 citations

    review
  • Activity prediction and identification of mis‐annotated chemical compounds using extreme descriptors

    UNC Libraries · 2025-10-25

    article · Open access · 1st author · Corresponding

    Data pre‐processing that includes removal of descriptors with low variance is a standard first step in quantitative structure–activity relationship modeling. In this paper, we study low‐variance descriptors and show that some of them contain significant amounts of useful information. In particular, we define the notion of extreme descriptors (those variables that have the same value for almost all compounds and only a few values that are different from the common median). We show that extreme descriptors can be helpful for activity prediction in a standard binary classification setting. Moreover, we demonstrate using two case studies (M2 muscarinic receptors and skin sensitization) that extreme descriptors can be used for the identification of possibly mislabeled compounds. Because of these previously unknown, but important, properties, extreme descriptors should be considered in quantitative structure–activity relationship modeling studies. Copyright © 2016 John Wiley & Sons, Ltd.

  • Challenges of broad-spectrum antiviral drug discovery and development for emerging pathogens

    Drug Discovery Today · 2025-09-25

    review · Senior author · Corresponding
  • Conserved Filovirus Proteins as Targets of Broad-Spectrum Antivirals

    bioRxiv (Cold Spring Harbor Laboratory) · 2025-09-28 · 1 citation

    preprint · Open access

    Filoviruses are enveloped, non-segmented, negative-strand RNA viruses belonging to the Filoviridae family, which includes five genera: Ebolavirus, Marburgvirus, Cuevavirus, Striavirus, and Thamnovirus. Members of this family cause severe and, often, fatal hemorrhagic fevers in humans and non-human primates, with high mortality rates. To date, only two filoviruses, Ebola virus (EBOV) and Marburg virus (MARV), are known to infect humans and are listed as priority pathogens by the World Health Organization due to their potential for re-emergence and the current lack of effective vaccines and antiviral treatments. In this study, we identify and characterize conserved binding sites within key filoviral proteins to support the development of broad-spectrum, direct-acting antiviral agents. We validated the significance of these conserved regions for drug discovery using existing experimental data. Our analysis revealed notably high sequence similarity among proteins from filoviruses capable of infecting humans (EBOV, TAFV, BDBV, SUDV, MARV, and RAVV) compared to those from non-zoonotic species, with the highest conservation observed in the L and VP40 proteins, both critical for viral genome transcription and replication. Furthermore, we compiled and analyzed available experimental data on known antiviral compounds targeting these proteins, identifying several agents with cross-filovirus activity, including Galidesivir, Remdesivir, and Favipiravir. The integrated approach described here, combining sequence and structural conservation analysis with chemical structure and antiviral activity data, demonstrates a strategy that could be extended to the development of broad-spectrum therapeutics across multiple viral families.

    Highlights: Conserved filovirus sites targeted for broad-spectrum antivirals. Structural modeling identifies key antiviral binding sites. Viral internal proteins are crucial targets for inhibition. Remdesivir validates conserved polymerase as a druggable target. Study highlights need for pan-filovirus drug screening.

  • Oy Vey! A Comment on “Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships Outperforming Animal Test Reproducibility”

    UNC Libraries · 2025-09-05

    article · Open access

    A study by (Luechtefeld et al., 2018) described the development of a suite of in silico models, termed read-across structure activity relationships (RASAR), that have “balanced accuracies in the 80%–95% range across 9 health hazards with no constraints on tested compounds.” This work can be considered groundbreaking for apparently exceeding the most optimistic expectations for quantitative structure-activity relationship (QSAR) modeling accuracy, especially without restrictions on model applicability domains. Predictive in silico models have been facilitating replacement and reduction of animal testing in toxicology (Wold et al., 1985); however, it is also recognized that “these methods are not always reliable and must be assessed on their individual merit for the compound and context in question” (Cronin et al., 2017). It is widely acknowledged that QSAR and other in silico models should be subject to rigorous testing and validation (Dearden et al., 2009; Fourches et al., 2010, 2015; OECD, 2014; Tropsha, 2010). Thus, we were curious to understand what technological advances have enabled the RASAR models to achieve accuracy that, for the first time in the history of QSAR, was “outperforming animal tests reproducibility” (Luechtefeld et al., 2018).

  • Modeling Protein–Protein and Protein–Ligand Interactions by the ClusPro Team in CASP16

    Proteins: Structure, Function, and Bioinformatics · 2025-10-20 · 12 citations

    article · Open access

    In the CASP16 experiment, our team employed hybrid computational strategies to predict both protein-protein and protein-ligand complex structures. For protein-protein docking, we combined physics-based sampling (using ClusPro FFT docking and molecular dynamics) with AlphaFold (AF)-based sampling, followed by AF-based refinement. Our method produced numerous high-accuracy complex models, including cases where AF alone failed, underscoring the critical role of physics-based sampling alongside deep learning-based refinement. For protein-ligand docking, we integrated the ClusPro LigTBM template-based approach with a machine learning-based confidence model for rescoring. The method preserves conserved interaction fragments derived from homologous complexes, followed by local resampling using physics-based sampling and a diffusion model. Our template-based strategy achieved a mean lDDT-PLI of 0.69 across 233 targets, which was highly competitive. These results demonstrate that combining physics-based modeling with AI-driven refinement can significantly enhance the accuracy of both protein-protein and protein-ligand structure predictions.

  • In silico Drug Discovery: Bridging the Gaps in Preclinical Translation

    Drug Discovery Today · 2025-12-03 · 2 citations

    article · Open access
  • Small molecule antiviral compound collection (SMACC): A comprehensive, highly curated database to support the discovery of broad-spectrum antiviral drug molecules

    UNC Libraries · 2025-06-25

    article · Open access
  • Machine Learning Models and a Web Portal for Predicting Cytochrome P450 Activity

    ChemRxiv · 2025-10-15

    article

    The cytochrome P450 (CYP) family of enzymes plays an integral role in drug metabolism and excretion. This application note describes the development of a novel computational CYP profiler (CYP-Pro) as a drug development tool. To enable new model development, we integrated and curated what is, to the best of our knowledge, the largest dataset of its kind, comprising 26,587 entries, including both inhibitors and substrates of CYP2D6, CYP3A4, and CYP2C9. We have built and externally validated Quantitative Structure-Activity Relationship (QSAR) models that can accurately predict whether molecules of interest are expected to be inhibitors or substrates. The models were assessed mainly by Positive Predictive Value (PPV), which ranged between 0.14 and 0.92. CYP-Pro showed the highest accuracy in predicting compounds selectively metabolized by CYP3A4 alone or by CYP2D6 and CYP2C9 without CYP3A4 involvement. All models are incorporated into the previously developed PhaKinPro portal (https://phakinpro.mml.unc.edu). CYP-Pro is unique in that it provides separate models for predicting CYP inhibitors vs. substrates, prioritizes high PPV as a pragmatic metric of accuracy to support the experimental testing of a small number of predicted substrates and inhibitors, enhances interpretability with fragment maps, and ensures reliability through strict applicability domain control. We expect that this new tool will aid researchers in early identification of compounds with favorable metabolic profiles, reducing the risks of drug-drug interactions and improving the efficiency of drug development efforts.

  • Medicines, Diseases, Indications, and Contraindications (MeDIC): a foundational resource to support drug repurposing

    Nucleic Acids Research · 2025-12-12

    article · Open access · Senior author

    Drug databases typically aim to provide reference information on medications and their uses but often lack strict definitions of the terms drug (e.g. approved or a clinical candidate) or disease, and do not focus on any specific context of use. The recent emergence of biomedical knowledge graphs, which integrate diverse biomedical data into a contiguous, harmonized knowledge network, has enabled innovation in drug repurposing (identification of novel uses of existing drugs). This objective has created a new set of requirements and challenges for drug databases to be used for generating high-confidence, testable drug repurposing hypotheses. To address this challenge, we have developed MeDIC as an open, foundational database built from government regulatory sources only, which comprises highly curated lists of drugs (including combination therapies), diseases, indications (i.e. drug approvals to treat specific diseases), contraindications, and additional metadata. MeDIC allows for easy maintainability, open-source adaptability, and ongoing updates concordant with updates of primary sources. To facilitate downstream use, MeDIC is provided in a tabulated format, and each drug, disease, indication, or contraindication entry is mapped to multiple ontologies. We offer MeDIC as a web-based, freely accessible (https://medic.renci.org), downloadable (including lists and source code), searchable, and machine learning-friendly resource for patients, providers, and researchers.

Frequent coauthors

  • Eugene Muratov

    227 shared
  • Denis Fourches

    North Carolina State University

    129 shared
  • Vinícius M. Alves

    101 shared
  • Alexander Golbraikh

    University of North Carolina at Chapel Hill

    84 shared
  • Igor V. Tetko

    64 shared
  • Stephen J. Capuzzi

    University of North Carolina at Chapel Hill

    58 shared
  • Alexandre Varnek

    Centre National de la Recherche Scientifique

    51 shared
  • Hao Zhu

    Obstetrics and Gynecology Hospital of Fudan University

    49 shared

Education

  • Ph.D., Toxicology

    University of North Carolina at Chapel Hill

    1993
  • M.S., Toxicology

    University of North Carolina at Chapel Hill

    1989
  • B.S., Chemistry

    University of Belgrade

    1984