Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Constantine Caramanis

Constantine Caramanis

· ProfessorVerified

University of Texas at Austin · Electrical and Computer Engineering

Active 1999–2026

h-index40
Citations9.4k
Papers28066 last 5y
Funding$760k
See your match with Constantine Caramanis — sign in to PhdFit.Sign in

About

Constantine Caramanis is a Professor and holds the Chandra Family Endowed Distinguished Professorship in Electrical and Computer Engineering at The University of Texas at Austin. He joined the UT Electrical and Computer Engineering department in the Fall of 2006. Dr. Caramanis received his A.B. in Mathematics from Harvard University, and his M.S. and Ph.D. in Electrical Engineering and Computer Science from MIT. His research interests center on decision-making in large-scale complex systems, with a focus on learning and computation. Specifically, he is interested in robust and adaptable optimization, high dimensional statistics and machine learning, and applications to large-scale networks, including social networks, wireless networks, transportation networks, and energy networks. He also works on applications of machine learning and optimization to computer-aided design.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Algorithm
  • Mathematics
  • Statistics
  • Mathematical optimization

Selected publications

  • Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant Search

    arXiv (Cornell University) · 2026-01-08

    preprintOpen accessSenior author

    Large-scale multi-tenant retrieval systems generate extensive query logs but lack curated relevance labels for effective domain adaptation, resulting in substantial underutilized "dark data". This challenge is compounded by the high cost of model updates, as jointly fine-tuning query and document encoders requires full corpus re-indexing, which is impractical in multi-tenant settings with thousands of isolated indices. We introduce DevRev-Search, a passage retrieval benchmark for technical customer support built via a fully automated pipeline. Candidate generation uses fusion across diverse sparse and dense retrievers, followed by an LLM-as-a-Judge for consistency filtering and relevance labeling. We further propose an Index-Preserving Adaptation strategy that fine-tunes only the query encoder, achieving strong performance gains while keeping document indices fixed. Experiments on DevRev-Search, SciFact, and FiQA-2018 show that Parameter-Efficient Fine-Tuning (PEFT) of the query encoder delivers a remarkable quality-efficiency trade-off, enabling scalable and practical enterprise search adaptation.

  • Linear Regression with Unknown Truncation Beyond Gaussian Features

    ArXiv.org · 2026-02-13

    articleOpen accessSenior author

    In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history of study in Statistics and Machine Learning going back to the works of (Galton, 1897; Tobin, 1958) and more recently in, e.g., (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024). Despite this long history, however, most prior works are limited to the special case where $S^\star$ is precisely known. The more practically relevant case, where $S^\star$ is unknown and must be learned from data, remains open: indeed, here the only available algorithms require strong assumptions on the distribution of the feature vectors (e.g., Gaussianity) and, even then, have a $d^{\mathrm{poly} (1/\varepsilon)}$ run time for achieving $\varepsilon$ accuracy. In this work, we give the first algorithm for truncated linear regression with unknown survival set that runs in $\mathrm{poly} (d/\varepsilon)$ time, by only requiring that the feature vectors are sub-Gaussian. Our algorithm relies on a novel subroutine for efficiently learning unions of a bounded number of intervals using access to positive examples (without any negative examples) under a certain smoothness condition. This learning guarantee adds to the line of works on positive-only PAC learning and may be of independent interest.

  • Entropy Aware Reward Guidance for Diffusion Language Model Alignment

    Open MIND · 2026-02-04

    preprint

    Reward guidance, also known as posterior sampling, is a popular method for test-time adaptation and post-training in continuous diffusion models. In this paper, we study reward guidance for discrete diffusion language models; now, one cannot differentiate through the natural outputs of the model because they are discrete tokens. We introduce a novel mechanism called EntRGi (Entropy aware Reward Guidance) to address this issue. EntRGi dynamically interpolates between continuous token relaxations and sampled hard tokens, on a token-by-token basis, using the diffusion model's predictive entropy. We demonstrate that EntRGi maintains both reward model reliability and optimization accuracy, while existing approaches sacrifice one for the other. We empirically validate our approach on 7B-parameter diffusion language models across two settings: (1) test-time adaptation, and (2) RGRL (Reward Guided Reinforcement Learning), our recipe for post-training on reward-guided data, showing consistent improvements over state-of-the-art methods. Our code is available at https://atutej.github.io/entrgi-rgrl

  • EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models

    ArXiv.org · 2026-02-04

    articleOpen access

    Reward guidance has been applied to great success in the test-time adaptation of continuous diffusion models; it updates each denoising step using the gradients from a downstream reward model. We study reward guidance for discrete diffusion language models, where one cannot differentiate through the natural outputs of the model because they are discrete tokens. Existing approaches either replace these discrete tokens with continuous relaxations, or employ techniques like the straight-through estimator. In this work, we show the downsides of both these methods. The former degrades gradient feedback because the reward model has never been trained with continuous inputs. The latter involves incorrect optimization because the gradient evaluated at discrete tokens is used to update continuous logits. Our key innovation is to go beyond this tradeoff by introducing a novel mechanism called EntRGi: Entropy aware Reward Guidance that dynamically regulates the gradients from the reward model. By modulating the continuous relaxation using the model's confidence, our approach substantially improves reward guidance while providing reliable inputs to the reward model. We empirically validate our approach on a 7B-parameter diffusion language model across 3 diverse reward models and 3 multi-skill benchmarks, showing consistent improvements over state-of-the-art methods.

  • Linear Regression with Unknown Truncation Beyond Gaussian Features

    Open MIND · 2026-02-13

    preprintSenior author

    In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history of study in Statistics and Machine Learning going back to the works of (Galton, 1897; Tobin, 1958) and more recently in, e.g., (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024). Despite this long history, however, most prior works are limited to the special case where $S^\star$ is precisely known. The more practically relevant case, where $S^\star$ is unknown and must be learned from data, remains open: indeed, here the only available algorithms require strong assumptions on the distribution of the feature vectors (e.g., Gaussianity) and, even then, have a $d^{\mathrm{poly} (1/\varepsilon)}$ run time for achieving $\varepsilon$ accuracy. In this work, we give the first algorithm for truncated linear regression with unknown survival set that runs in $\mathrm{poly} (d/\varepsilon)$ time, by only requiring that the feature vectors are sub-Gaussian. Our algorithm relies on a novel subroutine for efficiently learning unions of a bounded number of intervals using access to positive examples (without any negative examples) under a certain smoothness condition. This learning guarantee adds to the line of works on positive-only PAC learning and may be of independent interest.

  • MaxSketch: Robust Distinct Counting in Streams via Random Projections

    arXiv (Cornell University) · 2026-05-15

    preprintOpen access

    Estimating the number of distinct elements in a data stream is well understood when repeated elements are identical. In modern settings, however, observations are high-dimensional and noisy, so repeated instances of the same object are only approximately similar -- for example, different images of the same individual may vary significantly at the pixel level. Classical sketches such as HyperLogLog rely on consistent hash values for identical elements and break down in this regime. Recent work on robust distinct counting in general metric spaces achieves $\widetildeΘ(\sqrt{n})$ memory, which is tight in the worst case. We show that substantially improved memory guarantees are possible under geometric structure common in learned representations. We introduce MaxSketch, a simple max-linear sketch built from random Gaussian projections, and prove that it succeeds in estimating the number of distinct latent objects. Concretely, we show that under this assumption $m = \widetilde{O} (\log n / \varepsilon^2)$ random projections (and hence $\widetilde{O} (\log n/\varepsilon^2)$ memory) suffice to recover the true distinct count within a $(1+\varepsilon)$ factor. Experiments on image streams confirm that MaxSketch accurately estimates distinct counts and generalizes beyond the training regime. Our results bridge classical streaming algorithms and modern representation learning, showing how geometric structure can fundamentally reduce the complexity of distinct counting.

  • Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant Search

    ArXiv.org · 2026-01-01

    articleOpen accessSenior author

    Large-scale multi-tenant retrieval systems generate extensive query logs but lack curated relevance labels for effective domain adaptation, resulting in substantial underutilized "dark data". This challenge is compounded by the high cost of model updates, as jointly fine-tuning query and document encoders requires full corpus re-indexing, which is impractical in multi-tenant settings with thousands of isolated indices. We introduce DevRev-Search, a passage retrieval benchmark for technical customer support built via a fully automated pipeline. Candidate generation uses fusion across diverse sparse and dense retrievers, followed by an LLM-as-a-Judge for consistency filtering and relevance labeling. We further propose an Index-Preserving Adaptation strategy that fine-tunes only the query encoder, achieving strong performance gains while keeping document indices fixed. Experiments on DevRev-Search, SciFact, and FiQA-2018 show that Parameter-Efficient Fine-Tuning (PEFT) of the query encoder delivers a remarkable quality-efficiency trade-off, enabling scalable and practical enterprise search adaptation.

  • Asymptotically-Optimal Gaussian Bandits with Side Observations

    ArXiv.org · 2025-05-15

    preprintOpen access

    We study the problem of Gaussian bandits with general side information, as first introduced by Wu, Szepesvari, and Gyorgy. In this setting, the play of an arm reveals information about other arms, according to an arbitrary a priori known side information matrix: each element of this matrix encodes the fidelity of the information that the ``row'' arm reveals about the ``column'' arm. In the case of Gaussian noise, this model subsumes standard bandits, full-feedback, and graph-structured feedback as special cases. In this work, we first construct an LP-based asymptotic instance-dependent lower bound on the regret. The LP optimizes the cost (regret) required to reliably estimate the suboptimality gap of each arm. This LP lower bound motivates our main contribution: the first known asymptotically optimal algorithm for this general setting.

  • Test-Time Anchoring for Discrete Diffusion Posterior Sampling

    ArXiv.org · 2025-10-02

    preprintOpen access

    While continuous diffusion models have achieved remarkable success, discrete diffusion offers a unified framework for jointly modeling text and images. Beyond unification, discrete diffusion provides faster inference, finer control, and principled training-free guidance, making it well-suited for posterior sampling. Existing approaches to posterior sampling using discrete diffusion face severe challenges: derivative-free guidance yields sparse signals, continuous relaxations limit applicability, and split Gibbs samplers suffer from the curse of dimensionality. To overcome these limitations, we introduce Anchored Posterior Sampling (APS), built on two key innovations: quantized expectation for gradient-like guidance in discrete embedding space, and anchored remasking for adaptive decoding. APS achieves state-of-the-art performance among discrete diffusion samplers on both linear and nonlinear inverse problems across the standard image benchmarks. We demonstrate the generality of APS through training-free stylization and text-guided editing. We further apply APS to a large-scale diffusion language model, showing consistent improvement in question answering.

  • Anchored Diffusion Language Model

    ArXiv.org · 2025-05-24

    preprintOpen access

    Diffusion Language Models (DLMs) promise parallel generation and bidirectional context, yet they underperform autoregressive (AR) models in both likelihood modeling and generated text quality. We identify that this performance gap arises when important tokens (e.g., key words or low-frequency words that anchor a sentence) are masked early in the forward process, limiting contextual information for accurate reconstruction. To address this, we introduce the Anchored Diffusion Language Model (ADLM), a novel two-stage framework that first predicts distributions over important tokens via an anchor network, and then predicts the likelihoods of missing tokens conditioned on the anchored predictions. ADLM significantly improves test perplexity on LM1B and OpenWebText, achieving up to 25.4% gains over prior DLMs, and narrows the gap with strong AR baselines. It also achieves state-of-the-art performance in zero-shot generalization across seven benchmarks and surpasses AR models in MAUVE score, which marks the first time a DLM generates better human-like text than an AR model. Theoretically, we derive an Anchored Negative Evidence Lower Bound (ANELBO) objective and show that anchoring improves sample complexity and likelihood modeling. Beyond diffusion, anchoring boosts performance in AR models and enhances reasoning in math and logic tasks, outperforming existing chain-of-thought approaches

Recent grants

Frequent coauthors

  • Shie Mannor

    58 shared
  • Sanjay Shakkottai

    45 shared
  • Sujay Sanghavi

    30 shared
  • Jeongyeol Kwon

    23 shared
  • Yudong Chen

    21 shared
  • Xinyang Yi

    The University of Texas at Austin

    20 shared
  • Robert W. Heath

    20 shared
  • Huan Xu

    China University of Mining and Technology

    18 shared

Labs

Education

  • Ph.D. , EECS

    Massachusetts Institute of Technology

    2006
  • M.S., EECS

    Massachusetts Institute of Technology

    2001
  • A.B., Mathematics

    Harvard University

    1999

Awards & honors

  • Chandra Family Endowed Distinguished Professorship in Electr…
  • Amazon AI PhD Fellowships (Four Texas ECE Students Awarded)
  • IEEE Fellow (Caramanis Elevated to IEEE Fellow)
  • UT NXP Foundation Fellowships (Students Rasha El-Jaroudi and…
  • NSF CAREER Award (2025)
  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Constantine Caramanis

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup