Barnabas Poczos

Associate Professor

Carnegie Mellon University · Machine Learning Department

Active 2002–2026

h-index: 58

Citations: 12.3k

Papers: 375 (61 in the last 5 years)

Funding: $1.1M

Research topics

Computer Science
Artificial Intelligence
Physics
Engineering
Engineering drawing

Selected publications

Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates
arXiv (Cornell University) · 2026-04-13
preprint · Open access · Senior author
We study learning to learn for regression problems through the lens of hyperparameter tuning. We propose the Langevin Gradient Descent Algorithm (LGD), which approximates the mean of the posterior distribution defined by the loss function and regularizer of a convex regression task. We prove the existence of an optimal hyperparameter configuration for which the LGD algorithm achieves the Bayes-optimal solution for squared loss. Subsequently, we study generalization guarantees on meta-learning optimal hyperparameters for the LGD algorithm from a given set of tasks in the data-driven setting. For a number of parameters $d$ and hyperparameter dimension $h$, we show a pseudo-dimension bound of $O(dh)$, up to logarithmic terms, under mild assumptions on LGD. This matches the dimensional dependence of the bounds obtained in prior work for the elastic net, which only allows for $h=2$ hyperparameters, and extends their bounds to regression on convex loss. Finally, we show empirical evidence of the success of LGD and the meta-learning procedure for few-shot learning on linear regression using a few synthetically created datasets.
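The abstract above describes approximating a posterior mean with noisy gradient updates. A minimal sketch of that general idea follows, assuming a generic unadjusted Langevin update on a one-dimensional ridge regression objective. It is not the paper's LGD algorithm: the function name `langevin_posterior_mean` and the hyperparameters `lam`, `step`, and `beta` are hypothetical choices made for illustration.

```python
import random


def langevin_posterior_mean(xs, ys, lam=1.0, step=0.01, beta=10.0,
                            n_steps=25000, burn_in=5000, seed=0):
    """Average the iterates of noisy gradient descent on a 1-D ridge objective.

    Objective (assumed for this sketch): 0.5 * sum((w*x - y)^2) + 0.5 * lam * w^2.
    beta is an inverse temperature; larger beta means less injected noise.
    """
    rng = random.Random(seed)
    w = 0.0
    total = 0.0
    noise_scale = (2.0 * step / beta) ** 0.5
    for t in range(n_steps):
        # Gradient of the ridge objective with respect to w.
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) + lam * w
        # Langevin update: a gradient step plus Gaussian noise.
        w = w - step * grad + noise_scale * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            total += w
    # The running average of post-burn-in iterates estimates the posterior mean.
    return total / (n_steps - burn_in)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
estimate = langevin_posterior_mean(xs, ys)
# For this Gaussian posterior the mean equals the ridge solution.
closed_form = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + 1.0)
print(estimate, closed_form)
```

For a quadratic objective the posterior is Gaussian, so the averaged iterates should land near the closed-form ridge solution. In the abstract's setting, quantities such as the step size and regularization weight would be the hyperparameters tuned across tasks.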
Efficient Convexification of Kolmogorov-Arnold Networks with Polynomial Functional Forms Via a Continuous Graham Scan Approach
arXiv (Cornell University) · 2026-04-04
article · Open access
Deterministic global optimization of nonlinear models is important in many scientific and engineering applications. This framework typically involves repeatedly solving convex relaxations of the nonconvex problem, meaning that the strength of the relaxations and the cost of computing them directly determine overall efficiency and solution quality. In this work, we develop a tailored continuous convexification framework for Kolmogorov-Arnold Networks in which the univariate components are polynomial functions. By exploiting the additive separable structure of this architecture, the relaxation problem reduces to computing tight convex envelopes of univariate polynomials. We propose a continuous variant of the classical Graham Scan that constructs these envelopes exactly by identifying the bitangents of the polynomial convex hull without discretization or factorable reformulations. We establish the correctness of the algorithm and characterize its computational complexity, and show how these envelopes can be combined to construct strong convex relaxations for polynomial KANs. Computational results demonstrate that the proposed relaxations are both strong and robust, often producing bounds that are comparable to, or even orders of magnitude tighter than, those of state-of-the-art global optimization solvers while remaining computationally efficient.
SenSet defines cell-type specific senescence signatures in the aged human lung
The EMBO Journal · 2026-04-10
article · Open access
Cellular senescence is defined as an irreversible growth arrest observed when cells are exposed to a variety of stressors, including DNA damage, oxidative stress, or nutrient deprivation. Although senescence is a well-established driver of aging and age-related diseases, it is a highly heterogeneous process with significant variations across organisms, tissues, and cell types. The relatively low abundance of senescent cells in healthy aged tissues poses a major challenge to the longitudinal study of senescence in specific organs, including the human lung. To overcome this limitation, we developed a positive-unlabeled learning framework to generate a comprehensive list of senescence marker genes in human lungs (termed SenSet) using the largest publicly available single-cell lung dataset, the Human Lung Cell Atlas (HLCA). We validated SenSet in a highly complex ex vivo human 3D lung tissue culture model subjected to the senescence inducers bleomycin, doxorubicin, or irradiation, and established its sensitivity and accuracy in characterizing senescence. Using SenSet, we identified and validated cell-type-specific senescence signatures in distinct lung cell populations upon aging and environmental exposure. Our study provides a comprehensive analysis of senescent cells in the healthy aging lung, presenting fundamental implications for our understanding of major lung diseases, including cancer, fibrosis, chronic obstructive pulmonary disease, or asthma.
AmpLyze: A Deep Learning Model for Predicting the Hemolytic Concentration
ArXiv.org · 2025-07-10 · 1 citation
preprint · Open access · Senior author
Red-blood-cell lysis (HC50) is the principal safety barrier for antimicrobial-peptide (AMP) therapeutics, yet existing models only say "toxic" or "non-toxic." AmpLyze closes this gap by predicting the actual HC50 value from sequence alone and explaining the residues that drive toxicity. The model couples residue-level ProtT5/ESM2 embeddings with sequence-level descriptors in dual local and global branches, aligned by a cross-attention module and trained with log-cosh loss for robustness to assay noise. The optimal AmpLyze model reaches a PCC of 0.756 and an MSE of 0.987, outperforming classical regressors and the state-of-the-art. Ablations confirm that both branches are essential, and cross-attention adds a further 1% PCC and 3% MSE improvement. Expected-Gradients attributions reveal known toxicity hotspots and suggest safer substitutions. By turning hemolysis assessment into a quantitative, sequence-based, and interpretable prediction, AmpLyze facilitates AMP design and offers a practical tool for early-stage toxicity screening.
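The abstract above mentions training with a log-cosh loss for robustness to assay noise. As a minimal sketch of that loss in its standard textbook form (not the AmpLyze training code; the function names `log_cosh` and `log_cosh_loss` are hypothetical), the following computes it in a numerically stable way:

```python
import math


def log_cosh(x):
    """Stable log(cosh(x)), using |x| + log(1 + exp(-2|x|)) - log(2)."""
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_loss(preds, targets):
    """Mean log-cosh of the residuals between predictions and targets."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("preds and targets must be non-empty and equal in length")
    return sum(log_cosh(p - t) for p, t in zip(preds, targets)) / len(preds)


# Small residuals are penalized roughly quadratically (x^2 / 2),
# large residuals roughly linearly (|x| - log 2).
print(log_cosh_loss([1.0, 2.0, 10.0], [1.1, 2.0, 3.0]))
```

The loss behaves like squared error near zero and like absolute error for large residuals, which is why it is less sensitive to outlying measurements than mean squared error.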
Reconstructing Galaxy Cluster Mass Maps using Score-based Generative Modeling
The Open Journal of Astrophysics · 2025-07-14 · 1 citation
article · Open access · Senior author
We present a novel approach to reconstruct gas and dark matter projected density maps of galaxy clusters using score-based generative modeling. Our diffusion model takes in mock SZ and X-ray images as conditional inputs, and generates realizations of corresponding gas and dark matter maps by sampling from a learned data posterior. We train and validate the performance of our model by using mock data from a hydrodynamical cosmological simulation. The model accurately reconstructs both the mean and spread of the radial density profiles in the spatial domain, indicating that the model is able to distinguish between clusters of different mass sizes. In the spectral domain, the model achieves close-to-unity values for the bias and cross-correlation coefficients, indicating that the model can accurately probe cluster structures on both large and small scales. Our experiments demonstrate the ability of score models to learn a strong, nonlinear, and unbiased mapping between input observables and fundamental density distributions of galaxy clusters. These diffusion models can be further fine-tuned and generalized to not only take in additional observables as inputs, but also real observations and predict unknown density distributions of galaxy clusters.
Pharmacophore-Conditioned Diffusion Model for Ligand-Based De Novo Drug Design
ArXiv.org · 2025-05-15
preprint · Open access
Developing bioactive molecules remains a central, time- and cost-heavy challenge in drug discovery, particularly for novel targets lacking structural or functional data. Pharmacophore modeling presents an alternative for capturing the key features required for molecular bioactivity against a biological target. In this work, we present PharmaDiff, a pharmacophore-conditioned diffusion model for 3D molecular generation. PharmaDiff employs a transformer-based architecture to integrate an atom-based representation of the 3D pharmacophore into the generative process, enabling the precise generation of 3D molecular graphs that align with predefined pharmacophore hypotheses. Through comprehensive testing, PharmaDiff demonstrates superior performance in matching 3D pharmacophore constraints compared to ligand-based drug design methods. Additionally, it achieves higher docking scores across a range of proteins in structure-based drug design, without the need for target protein structures. By integrating pharmacophore modeling with 3D generative techniques, PharmaDiff offers a powerful and flexible framework for rational drug design.
Recovering time-varying networks from single-cell data
Bioinformatics · 2025-07-01 · 2 citations
article · Open access
MOTIVATION: Gene regulation is a dynamic process that underlies all aspects of human development, disease response, and other biological processes. The reconstruction of temporal gene regulatory networks has conventionally relied on regression analysis, graphical models, or other types of relevance networks. With the large increase in time series single-cell data, new approaches are needed to address the unique scale and nature of these data for reconstructing such networks. RESULTS: Here, we develop a deep neural network, Marlene, to infer dynamic graphs from time series single-cell gene expression data. Marlene constructs directed gene networks using a self-attention mechanism where the weights evolve over time using recurrent units. By employing meta learning, the model is able to recover accurate temporal networks even for rare cell types. In addition, Marlene can identify gene interactions relevant to specific biological responses, including COVID-19 immune response, fibrosis, and aging, paving the way for potential treatments. AVAILABILITY AND IMPLEMENTATION: The code used to train Marlene is available at https://github.com/euxhenh/Marlene.
Learning from B Cell Evolution: Adaptive Multi-Expert Diffusion for Antibody Design via Online Optimization
bioRxiv (Cold Spring Harbor Laboratory) · 2025-08-03 · 3 citations
preprint · Open access · Senior author · Corresponding
Recent advances in diffusion models have shown remarkable potential for antibody design, yet existing approaches apply uniform generation strategies that cannot adapt to each antigen's unique requirements. Inspired by B cell affinity maturation, in which antibodies evolve through multi-objective optimization balancing affinity, stability, and self-avoidance, we propose the first biologically motivated framework that leverages physics-based domain knowledge within an online meta-learning system. Our method employs multiple specialized experts (van der Waals, molecular recognition, energy balance, and interface geometry) whose parameters evolve during generation based on iterative feedback, mimicking natural antibody refinement cycles. Instead of fixed protocols, this adaptive guidance discovers personalized optimization strategies for each target. Our experiments demonstrate that this approach: (1) discovers optimal SE(3)-equivariant guidance strategies for different antigen classes without pre-training, preserving molecular symmetries throughout optimization; (2) significantly enhances hotspot coverage and interface quality through target-specific adaptation, achieving balanced multi-objective optimization characteristic of therapeutic antibodies; (3) establishes a paradigm for iterative refinement where each antibody-antigen system learns its unique optimization profile through online evaluation; (4) generalizes effectively across diverse design challenges, from small epitopes to large protein interfaces, enabling precision-focused campaigns for individual targets.

Recent grants

RI: III: Medium: Scalable Machine Learning for Automating Scientific Discovery in Astrophysics
NSF · $1.1M · 2016–2020

Frequent coauthors

Jeff Schneider
103 shared
Kirthevasan Kandasamy
59 shared
Junier B. Oliva
46 shared
András Lőrincz
38 shared
Chun‐Liang Li
33 shared
Siamak Ravanbakhsh
McGill University
30 shared
Zoltán Szabó
Brno University of Technology
29 shared
Sashank J. Reddi
29 shared

Labs

Barnabas Poczos's Lab · PI
Not provided
