Rama Chellappa

Bloomberg Distinguished Professor · Verified

Johns Hopkins University · Radiology and Radiological Science

Active 1980–2026

h-index: 124
Citations: 73.7k
Papers: 1.4k (235 in the last 5 years)
Funding: $45.4M (1 active)

About

Professor Rama Chellappa is a Bloomberg Distinguished Professor at Johns Hopkins University (JHU), with joint appointments in the Departments of Electrical and Computer Engineering and Biomedical Engineering (School of Medicine). He joined JHU in August 2020 after serving as a Distinguished University Professor and Minta Martin Professor of Engineering at the University of Maryland (UMD), College Park.

Professor Chellappa received his B.E. (Hons.) degree in Electronics and Communication Engineering from the University of Madras, India in 1975, his M.E. (with Distinction) from the Indian Institute of Science, Bangalore, India in 1977, and his M.S.E.E. and Ph.D. degrees in Electrical Engineering from Purdue University in 1978 and 1981, respectively. From 1981 to 1991, he was a faculty member in the Department of Electrical Engineering-Systems at the University of Southern California (USC). Between 1991 and 2020, he was a Professor of Electrical and Computer Engineering and an affiliate Professor of Computer Science at UMD, where he was also affiliated with the Center for Automation Research, the Institute for Advanced Computer Studies, and the Applied Mathematics and Scientific Computing Program.

His current research interests span computer vision, machine learning, and artificial intelligence, with applications in face recognition, 3D modeling from video, image- and video-based recognition of objects, events, and activities, medical imaging, domain adaptation, and generalization. His career reflects a commitment to advancing electrical engineering and biomedical engineering through research and interdisciplinary collaboration.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Statistics
  • Mathematics

Selected publications

  • DiffRegCD: Integrated Registration and Change Detection with Diffusion Features

    2026-03-06

    article · Open access

    Change detection (CD) is critical in computer vision and remote sensing, with applications in monitoring, disaster response, and urban analysis. Most CD models assume co-registered inputs, but real imagery often suffers from parallax, viewpoint shifts, or long temporal gaps, leading to severe misalignment. Conventional register-then-detect pipelines and recent joint frameworks (e.g., BiFA, ChangeRD) remain limited: they rely on regression-only flow, global homographies, or synthetic perturbations that fail under large displacements. We propose DiffRegCD, an integrated framework that couples dense registration and change detection. DiffRegCD reformulates correspondence as a Gaussian-smoothed classification task, delivering sub-pixel accuracy and stable training. It builds on frozen multi-scale features from a pretrained denoising diffusion model, which provide invariance to viewpoint and illumination variation. Supervision is enabled by controlled affine perturbations applied to standard CD datasets, yielding paired ground truth for both flow and change detection without pseudo-labels. Experiments on aerial (LEVIR-CD, DSIFN-CD, WHU-CD, SYSU-CD) and ground-level (VL-CMU-CD) datasets show that DiffRegCD outperforms recent baselines and remains robust under wide temporal and viewpoint variation, establishing diffusion features and classification-based correspondence as a strong foundation for integrated CD. The code is available at GitHub.

  • Uncertainty-quantified Pulse Signal Recovery from Facial Video using Regularized Stochastic Interpolants

    arXiv (Cornell University) · 2026-04-12

    article · Open access

    Imaging Photoplethysmography (iPPG), an optical procedure which recovers a human's blood volume pulse (BVP) waveform using pixel readout from a camera, is an exciting research field with many researchers performing clinical studies of iPPG algorithms. While current algorithms to solve the iPPG task have shown outstanding performance on benchmark datasets, no state-of-the-art algorithm, to the best of our knowledge, performs test-time sampling of the solution space, precluding an uncertainty analysis that is critical for clinical applications. We address this deficiency through a new paradigm named Regularized Interpolants with Stochasticity for iPPG (RIS-iPPG). Modeling iPPG recovery as an inverse problem, we build probability paths that evolve the camera pixel distribution to the ground-truth signal distribution by predicting the instantaneous flow and score vectors of a time-dependent stochastic process; at test time, we sample the posterior distribution of the correct BVP waveform given the camera pixel intensity measurements by solving a stochastic differential equation. Given that physiological changes are slowly varying, we show that iPPG recovery can be improved through regularization that maximizes the correlation between the residual flow vector predictions of two adjacent time windows. Experimental results on three datasets show that RIS-iPPG provides superior reconstruction quality and uncertainty estimates of the reconstruction, a critical tool for the widespread adoption of iPPG algorithms in clinical and consumer settings.

  • Encoding of Demographic and Anatomical Information in Chest X-Ray-Based Severe Left Ventricular Hypertrophy Classifiers

    Biomedicines · 2025-09-02 · 1 citation

    article · Open access

    Background. Severe left ventricular hypertrophy (SLVH) is a high-risk structural cardiac abnormality associated with increased risk of heart failure. It is typically assessed using echocardiography or cardiac magnetic resonance imaging, but these modalities are limited by cost, accessibility, and workflow burden. We introduce a deep learning framework that classifies SLVH directly from chest radiographs, without intermediate anatomical estimation models or demographic inputs. A key contribution of this work lies in interpretability. We quantify how clinically relevant attributes are encoded within internal representations, enabling transparent model evaluation and integration into AI-assisted workflows. Methods. We construct class-balanced subsets from the CheXchoNet dataset with equal numbers of SLVH-positive and negative cases while preserving the original train, validation, and test proportions. ResNet-18 is fine-tuned from ImageNet weights, and a Vision Transformer (ViT) encoder is pretrained via masked autoencoding with a trainable classification head. No anatomical or demographic inputs are used during training. We apply Mutual Information Neural Estimation (MINE) to quantify dependence between learned features and five attributes: age, sex, interventricular septal diameter (IVSDd), posterior wall diameter (LVPWDd), and internal diameter (LVIDd). Results. ViT achieves an AUROC of 0.82 [95% CI: 0.78–0.85] and an AUPRC of 0.80 [95% CI: 0.76–0.85], indicating strong performance in SLVH detection from chest radiographs. MINE reveals clinically coherent attribute encoding in learned features: age > sex > IVSDd > LVPWDd > LVIDd. Conclusions. This study shows that SLVH can be accurately classified from chest radiographs alone. The framework combines diagnostic performance with quantitative interpretability, supporting reliable deployment in triage and decision support.

  • DiffProtect: Generative adversarial examples using diffusion models for facial privacy protection

    Pattern Recognition · 2025-11-24 · 1 citation

    article · Open access · Senior author

      • Diffusion model-based adversarial attacks for facial privacy protection with high visual quality.
      • Face semantics regularization module preserves visual identity during facial privacy protection.
      • Attack acceleration strategy significantly improves efficiency while maintaining performance.
      • 24.5% absolute improvement in attack success rate compared to state-of-the-art methods.
      • Real-world validation with commercial API and user study shows practical effectiveness.

    The increasingly pervasive facial recognition (FR) systems raise serious concerns about personal privacy, especially for billions of users who have publicly shared their photos on social media. To address this challenge, several adversarial attack methods have been proposed to protect individuals from being identified by unauthorized FR systems with perturbed facial images. However, these approaches suffer from poor visual quality or low attack success rates, which limit their practical utility. Recently, diffusion models have achieved tremendous success in image generation. In this work, we ask: can diffusion models be used to generate adversarial examples against FR systems to improve both visual quality and attack performance? We propose DiffProtect, a novel method leveraging a diffusion autoencoder to generate semantically meaningful perturbations on FR systems. Extensive experiments demonstrate that DiffProtect produces more natural-looking encrypted images than state-of-the-art methods while achieving significantly higher attack success rates, e.g., 24.5% and 25.1% absolute improvements on the CelebA-HQ and FFHQ datasets. We further evaluate the effectiveness of DiffProtect in the real world using a commercial FR API and validate its usefulness in practice through a user study. Our code is available at https://github.com/joellliu/DiffProtect.

  • Speedy MASt3R

    2025-08-04 · 1 citation

    article · Senior author

    MASt3R redefines image matching as a 3D task but suffers from high inference latency (198ms per image pair on an A40 GPU). We introduce Speedy MASt3R, a post-training optimization framework that achieves a 54% speedup (91ms per pair) without compromising accuracy. Our approach incorporates four key techniques: (1) FlashMatch, which leverages FlashAttention v2 for efficient attention computation; (2) GraphFusion, which optimizes the computation graph using TensorRT; (3) FastNN-Lite, which reduces complexity from quadratic to linear; and (4) HybridCast, which enables mixed-precision inference. Evaluations on five benchmarks (Aachen Day-Night, InLoc, 7-Scenes, ScanNet1500, MegaDepth1500) demonstrate consistent performance, highlighting real-time 3D understanding capabilities.

  • Distillation-Guided Representation Learning for Unconstrained Video Human Authentication

    IEEE Transactions on Biometrics Behavior and Identity Science · 2025-08-04

    article · Open access

    Human authentication is an important and challenging biometric task, particularly from unconstrained videos. While body recognition is a popular approach, gait recognition holds the promise of robustly identifying subjects based on walking patterns instead of appearance information. Previous gait-based approaches have performed well for curated indoor scenes; however, they tend to underperform in unconstrained situations. To address these challenges, we propose a framework, termed Holistic GAit DEtection and Recognition (H-GADER), for human authentication in challenging outdoor scenarios. Specifically, H-GADER leverages a Double Helical Signature to detect segments that contain human movement and builds discriminative features through a novel gait recognition method. To further enhance robustness, H-GADER encodes viewpoint information in its architecture and distills learned representations from an auxiliary RGB recognition model; this allows H-GADER to learn from the maximum amount of data at training time. At test time, H-GADER infers solely from the silhouette modality. Furthermore, we introduce a body recognition model through semantic, large-scale, self-supervised training to complement gait recognition. By conditionally fusing gait and body representations based on the presence or absence of gait information, as decided by the gait detection, we demonstrate significant improvements compared to when a single modality or a naive feature ensemble is used. We evaluate our method on multiple existing state-of-the-art (SoTA) gait baselines and demonstrate consistent improvements on indoor and outdoor datasets, especially on the BRIAR dataset, which features unconstrained, long-distance videos, achieving a 28.9% improvement.

  • Enrich and Detect: Video Temporal Grounding With Multimodal LLMs

    2025-10-19 · 1 citation

    article · Open access

    We introduce ED-VTG, a method for fine-grained video temporal grounding utilizing multimodal large language models. Our approach harnesses the capabilities of multimodal LLMs to jointly process text and video, in order to effectively localize natural language queries in videos through a two-stage process. Rather than being directly grounded, language queries are initially transformed into enriched sentences that incorporate missing details and cues to aid in grounding. In the second stage, these enriched queries are grounded using a lightweight decoder, which specializes in predicting accurate boundaries conditioned on contextualized representations of the enriched queries. To mitigate noise and reduce the impact of hallucinations, our model is trained with a multiple-instance-learning objective that dynamically selects the optimal version of the query for each training sample. We demonstrate state-of-the-art results across various benchmarks in temporal video grounding and paragraph grounding settings. Experiments reveal that our method significantly outperforms all previously proposed LLM-based temporal grounding approaches and is either superior or comparable to specialized models, while maintaining a clear advantage against them in zero-shot evaluation scenarios.

  • Innovation in geriatrics: what this series means for care

    Innovation in Aging · 2025-11-08

    article · Open access · 1st author · Corresponding

  • TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision

    2025-10-19

    preprint · Open access

    We address the problem of video question answering (video QA) with temporal grounding in a weakly supervised setup, without any temporal annotations. Given a video and a question, we generate an open-ended answer grounded with the start and end time. For this task, we propose TOGA: a vision-language model for Temporally Grounded Open-Ended Video QA with Weak Supervision. We instruct-tune TOGA to jointly generate the answer and the temporal grounding. We operate in a weakly supervised setup where the temporal grounding annotations are not available. We generate pseudo labels for temporal grounding and ensure the validity of these labels by imposing a consistency constraint between the question of a grounding response and the response generated by a question referring to the same temporal segment. We notice that jointly generating the answers with the grounding improves performance on question answering as well as grounding. We evaluate TOGA on grounded QA and open-ended QA tasks. For grounded QA, we consider the NExT-GQA benchmark which is designed to evaluate weakly supervised grounded question answering. For open-ended QA, we consider the MSVD-QA and ActivityNet-QA benchmarks. We achieve state-of-the-art performance for both tasks on these benchmarks.
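The Speedy MASt3R entry above reports latency falling from 198 ms to 91 ms per image pair and calls this a 54% speedup. The short sketch below checks that arithmetic using only the two figures quoted in the abstract. The function names are illustrative and not from the paper, and reading "54%" as a reduction in latency (rather than a gain in throughput) is an assumption about how the authors computed it.

```python
# Sanity check of the latency figures quoted in the Speedy MASt3R abstract.
# Function names are hypothetical; only the 198 ms and 91 ms values come
# from the source text.

def latency_reduction(before_ms: float, after_ms: float) -> float:
    """Fraction of the original per-pair latency that was removed."""
    return (before_ms - after_ms) / before_ms


def throughput_speedup(before_ms: float, after_ms: float) -> float:
    """How many times more image pairs can be processed per unit time."""
    return before_ms / after_ms


baseline_ms = 198.0   # MASt3R, per image pair on an A40 GPU (from the abstract)
optimized_ms = 91.0   # Speedy MASt3R, per image pair (from the abstract)

reduction = latency_reduction(baseline_ms, optimized_ms)
speedup = throughput_speedup(baseline_ms, optimized_ms)
pairs_per_second = 1000.0 / optimized_ms

print(f"latency reduction: {reduction:.1%}")          # 54.0%
print(f"throughput speedup: {speedup:.2f}x")          # 2.18x
print(f"pairs per second: {pairs_per_second:.1f}")    # 11.0
```

Under this reading, the quoted 54% matches the fraction of latency removed, which corresponds to roughly 2.18 times the throughput, or about 11 image pairs per second.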

Frequent coauthors

  • Anil K. Jain

    1688 shared
  • Joydeep Ghosh

    1682 shared
  • Josef Kittler

    1682 shared
  • Takeo Kanade

    1682 shared
  • Gennady Osipov

    1681 shared
  • Witold Russia

    Conference Board

    1681 shared
  • Madhu Sudan

    Harvard University Press

    1681 shared

Education

  • PhD, Electrical and Computer Engineering

    Purdue University

    1981
  • MSEE, Electrical and Computer Engineering

    Purdue University

    1978
  • Master of Engineering (Distinction), Electrical Communication Engineering

    Indian Institute of Science

    1977
  • Bachelor of Engineering (Hons.), Electronics and Communication Engineering

    Anna University Chennai College of Engineering Guindy

    1975

Awards & honors

  • 2020 Jack S. Kilby Signal Processing Medal
  • IEEE Life Fellow
  • Society Award from IEEE Signal Processing Society
  • IEEE Computer Society Technical Achievement Award
  • 2025 Azriel Rosenfeld Lifetime Achievement Award