Bryan Plummer

· Assistant ProfessorVerified

Boston University · Computer Science

Active 2002–2026

h-index19

Citations3.0k

Papers139101 last 5y

Funding$499k

Faculty page Lab page Website

See your match with Bryan Plummer — sign in to PhdFit.Sign in

About

Bryan Plummer is an Assistant Professor in the Department of Computer Science at Boston University. He previously worked at Boston University as a Postdoctoral Associate and Research Assistant Professor, and is a member of the IVC Group. He obtained his PhD in the computer vision group at the University of Illinois at Urbana-Champaign. His research interests fall within the umbrella of artificial intelligence, with a focus on visual recognition, scene understanding, interpretable machine learning, and understanding the relationship between vision and language.

Research topics

Artificial Intelligence
Natural Language Processing
Computer Science
Machine Learning
Theoretical computer science
Cognitive psychology
Psychology

Selected publications

FuTCR: Future-Targeted Contrast and Repulsion for Continual Panoptic Segmentation
ArXiv.org · 2026-05-12
articleOpen accessSenior author
Continual Panoptic Segmentation (CPS) requires methods that can quickly adapt to new categories over time. The nature of this dense prediction task means that training images may contain a mix of labeled and unlabeled objects. As nothing is known about these unlabeled objects a priori, existing methods often simply group any unlabeled pixel into a single "background" class during training. In effect, during training, they repeatedly tell the model that all the different background categories are the same (even when they aren't). This makes learning to identify different background categories as they are added challenging since these new categories may require using information the model was previously told was unimportant and ignored. Thus, we propose a Future-Targeted Contrastive and Repulsive (FuTCR) framework that addresses this limitation by restructuring representations before new classes are introduced. FuTCR first discovers confident future-like regions by grouping model-predicted masks whose pixels are consistently classified as background but exhibit non-background logits. Next, FuTCR applies pixel-to-region contrast to build coherent prototypes from these unlabeled regions, while simultaneously repelling background features away from known-class prototypes to explicitly reserve representational space for future categories. Experiments across six CPS settings and a range of dataset sizes show FuTCR improves relative new-class panoptic quality over the state-of-the-art by up to 28%, while preserving or improving base-class performance with gains up to 4%.
Publisher OA PDF
FuTCR: Future-Targeted Contrast and Repulsion for Continual Panoptic Segmentation
arXiv (Cornell University) · 2026-05-12
preprintOpen accessSenior author
Continual Panoptic Segmentation (CPS) requires methods that can quickly adapt to new categories over time. The nature of this dense prediction task means that training images may contain a mix of labeled and unlabeled objects. As nothing is known about these unlabeled objects a priori, existing methods often simply group any unlabeled pixel into a single "background" class during training. In effect, during training, they repeatedly tell the model that all the different background categories are the same (even when they aren't). This makes learning to identify different background categories as they are added challenging since these new categories may require using information the model was previously told was unimportant and ignored. Thus, we propose a Future-Targeted Contrastive and Repulsive (FuTCR) framework that addresses this limitation by restructuring representations before new classes are introduced. FuTCR first discovers confident future-like regions by grouping model-predicted masks whose pixels are consistently classified as background but exhibit non-background logits. Next, FuTCR applies pixel-to-region contrast to build coherent prototypes from these unlabeled regions, while simultaneously repelling background features away from known-class prototypes to explicitly reserve representational space for future categories. Experiments across six CPS settings and a range of dataset sizes show FuTCR improves relative new-class panoptic quality over the state-of-the-art by up to 28%, while preserving or improving base-class performance with gains up to 4%.
Publisher DOI
Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression
arXiv (Cornell University) · 2026-03-28
articleOpen accessSenior author
Parameter Recombination (PR) methods aim to efficiently compose the weights of a neural network for applications like Parameter-Efficient FineTuning (PEFT) and Model Compression (MC), among others. Most methods typically focus on one application of PR, which can make composing them challenging. For example, when deploying a large model you may wish to compress the model and also quickly adapt to new settings. However, PEFT methods often can still contain millions of parameters. This may be small compared to the original model size, but can be problematic in resource constrained deployments like edge devices, where they take a larger portion of the compressed model's parameters. To address this, we present Coefficient-gated weight Recombination by Interpolated Shared basis Projections (CRISP), a general approach that seamlessly integrates multiple PR tasks within the same framework. CRISP accomplishes this by factorizing pretrained weights into basis matrices and their component mixing projections. Sharing basis matrices across layers and adjusting its size enables us to perform MC, whereas the mixer weight's small size (fewer than 200 in some experiments) enables CRISP to support PEFT. Experiments show CRISP outperforms methods from prior work capable of dual-task applications by 4-5\% while also outperforming the state-of-the-art in PEFT by 1.5\% and PEFT+MC combinations by 1\%. Our code is available on the repository: https://github.com/appledora/CRISP-CVPR26.
Publisher OA PDF
Vision-language framework for multi-sequence brain magnetic resonance imaging
medRxiv · 2026-04-04
articleOpen access
Structural magnetic resonance imaging (MRI) is a cornerstone for diagnosing neurological disorders, yet automated interpretation of multi-sequence brain MRI remains limited by challenges in cross-sequence reasoning and protocol variability. Here we present ReMIND, a vision-language modeling framework tailored for comprehensive multi-sequence and multi-volumetric brain MRI analysis. Trained on over 73,000 deidentified patient visits encompassing more than 850,000 MRI sequences paired with radiology reports from diverse clinical and research cohorts, ReMIND combined large-scale instruction tuning on more than one million clinically grounded question-answer (QA) pairs with targeted supervised fine-tuning for radiology report generation. At inference, ReMIND employed modality-aware reranking and correction, a report-level decoding strategy that suppressed unsupported modality claims while preserving linguistic fluency and clinical coherence. Cross-cohort generalization was maintained on independent external datasets from different institutions. These findings represent an advance toward consistent and equitable brain MRI interpretation, meriting prospective evaluation to support diagnosis and management of neurological conditions.
Publisher OA PDF DOI
Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression
arXiv (Cornell University) · 2026-03-28
preprintOpen accessSenior author
Parameter Recombination (PR) methods aim to efficiently compose the weights of a neural network for applications like Parameter-Efficient FineTuning (PEFT) and Model Compression (MC), among others. Most methods typically focus on one application of PR, which can make composing them challenging. For example, when deploying a large model you may wish to compress the model and also quickly adapt to new settings. However, PEFT methods often can still contain millions of parameters. This may be small compared to the original model size, but can be problematic in resource constrained deployments like edge devices, where they take a larger portion of the compressed model's parameters. To address this, we present Coefficient-gated weight Recombination by Interpolated Shared basis Projections (CRISP), a general approach that seamlessly integrates multiple PR tasks within the same framework. CRISP accomplishes this by factorizing pretrained weights into basis matrices and their component mixing projections. Sharing basis matrices across layers and adjusting its size enables us to perform MC, whereas the mixer weight's small size (fewer than 200 in some experiments) enables CRISP to support PEFT. Experiments show CRISP outperforms methods from prior work capable of dual-task applications by 4-5\% while also outperforming the state-of-the-art in PEFT by 1.5\% and PEFT+MC combinations by 1\%. Our code is available on the repository: https://github.com/appledora/CRISP-CVPR26.
Publisher DOI
ERM++: An Improved Baseline for Domain Generalization
2025-02-26 · 8 citations
articleSenior author
Domain Generalization (DG) aims to develop classifiers that can generalize to new, unseen data distributions, a critical capability when collecting new domain-specific data is impractical. A common DG baseline minimizes the empirical risk on the source domains. Recent studies have shown that this approach, known as Empirical Risk Minimization (ERM), can outperform most more complex DG methods when properly tuned. However, these studies have primarily focused on a narrow set of hyperparameters, neglecting other factors that can enhance robustness and prevent overfitting and catastrophic forgetting, properties which are critical for strong DG performance. In our investigation of training data utilization (i.e., duration and setting validation splits), initialization, and additional regularizers, we find that tuning these previously overlooked factors significantly improves model generalization across diverse datasets without adding much complexity. We call this improved, yet simple baseline ERM++. Despite its ease of implementation, ERM++ improves DG performance by over 5% compared to prior ERM baselines on a standard benchmark of 5 datasets with a ResNet-50 and over 15% with a ViT-B/16. It also outperforms all state-of-the-art methods on DomainBed datasets with both architectures. Importantly, ERM++ is easy to integrate into existing frameworks like DomainBed, making it a practical and powerful tool for researchers and practitioners. Overall, ERM++ challenges the need for more complex DG methods by providing a stronger, more reliable baseline that maintains simplicity and ease of use. Code is available at https://github.com/piotr-teterwak/erm_plusplus
Publisher DOI
CHAMMI-75: Pre-training multi-channel models with heterogeneous microscopy images
arXiv (Cornell University) · 2025-12-23
preprintOpen access
Quantifying cell morphology using images and machine learning has proven to be a powerful tool to study the response of cells to treatments. However, models used to quantify cellular morphology are typically trained with a single microscopy imaging type. This results in specialized models that cannot be reused across biological studies because the technical specifications do not match (e.g., different number of channels). Here, we present CHAMMI-75, an open access dataset of heterogeneous, multi-channel microscopy images from 75 diverse biological studies. We curated this resource from publicly available sources to investigate cellular morphology models that are channel-adaptive and can process any microscopy image type. Our experiments show that training with CHAMMI-75 can improve performance in multi-channel bioimaging tasks primarily because of its high diversity in microscopy modalities. This work paves the way to create the next generation of cellular morphology models for biological studies.
Publisher DOI
Scaling Up Temporal Domain Generalization via Temporal Experts Averaging
2025-01-01
articleOpen accessSenior author
Temporal Domain Generalization (TDG) aims to generalize across temporal distribution shifts, e.g., lexical change over time.Prior work often addresses this by predicting future model weights.However, full model prediction is prohibitively expensive for even reasonably sized models.Thus, recent methods only predict the classifier layer, limiting generalization by failing to adjust other model components.To address this, we propose Temporal Experts Averaging (TEA), a novel and scalable TDG framework that updates the entire model using weight averaging to maximize generalization potential while minimizing computational costs.Our theoretical analysis guides us to two steps that enhance generalization to future domains.First, we create expert models with functional diversity yet parameter similarity by fine-tuning a domain-agnostic base model on individual temporal domains while constraining weight changes.Second, we optimize the bias-variance tradeoff through adaptive averaging coefficients derived from modeling temporal weight trajectories in a principal component subspace.Expert's contributions are based on their projected proximity to future domains.Extensive experiments across 7 TDG benchmarks, 5 models, and 2 TDG settings shows TEA outperforms prior TDG methods by up to 69% while being up to 60x more efficient 1 .
Publisher OA PDF DOI
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
arXiv (Cornell University) · 2025-01-08
preprintOpen accessSenior author
Given an isolated garment image in a canonical product view and a separate image of a person, the virtual try-on task aims to generate a new image of the person wearing the target garment. Prior virtual try-on works face two major challenges in achieving this goal: a) the paired (human, garment) training data has limited availability; b) generating textures on the human that perfectly match that of the prompted garment is difficult, often resulting in distorted text and faded textures. Our work explores ways to tackle these issues through both synthetic data as well as model refinement. We introduce a garment extraction model that generates (human, synthetic garment) pairs from a single image of a clothed individual. The synthetic pairs can then be used to augment the training of virtual try-on. We also propose an Error-Aware Refinement-based Schrödinger Bridge (EARSB) that surgically targets localized generation errors for correcting the output of a base virtual try-on model. To identify likely errors, we propose a weakly-supervised error classifier that localizes regions for refinement, subsequently augmenting the Schrödinger Bridge's noise schedule with its confidence heatmap. Experiments on VITON-HD and DressCode-Upper demonstrate that our synthetic data augmentation enhances the performance of prior work, while EARSB improves the overall image quality. In user studies, our model is preferred by the users in an average of 59% of cases.
Publisher OA PDF DOI
Walk and Read Less: Improving the Efficiency of Vision-and-Language Navigation via Tuning-Free Multimodal Token Pruning
2025-01-01
articleOpen access
Publisher OA PDF DOI

Recent grants

Collaborative Research: Image-based Readouts of Cellular State using Universal Morphology Embeddings
NSF · $499k · 2022–2025

Frequent coauthors

Kate Saenko
70 shared
Reuben Tan
21 shared
Svetlana Lazebnik
20 shared
Stan Sclaroff
Boston University
18 shared
Andrea Burns
17 shared
Donghyun Kim
14 shared
Ranjitha Kumar
University of Illinois Urbana-Champaign
12 shared
Piotr Teterwak
10 shared

Labs

Data Mining & Data ManagementPI

Resume-aware match score
Save to shortlist
AI-drafted outreach

See your match with Bryan Plummer

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

Join the waitlist How it works

Free to start
No credit card
30-second signup

Find professors who actually fit you