Peng Gao

Assistant Professor · Verified

Virginia Tech · Computer Science

Active 2005–2026

h-index: 13

Citations: 711

Papers: 69 (27 in the last 5 years)

Funding: not listed



About

Peng Gao is an Assistant Professor in the Department of Computer Science at Virginia Tech. He holds a Ph.D. (2019) and a master's degree (2015) in electrical engineering from Princeton University, and a B.E. in electrical and computer engineering from Shanghai Jiao Tong University (2013). His research interests include data analytics, information retrieval, machine learning, natural language processing, security, software engineering, and systems. He is based at Virginia Tech's Blacksburg campus, with additional affiliations at the Institute for Advanced Computing in Alexandria, VA, and the Virginia Tech Research Center in Arlington, VA. He can be reached by email at penggao@vt.edu or by phone at 540-231-9060.

Research topics

Computer Science
Computer Security
Data Mining
World Wide Web
Mathematics
Arithmetic
Data science
Discrete mathematics

Selected publications

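2026 - Petrogenesis and Geodynamic Significance of Post-Collisional Intrusive Rocks in the Himalaya and Lhasa - Geochemical Dataset - Sun Yat-sen University Master's Thesis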
Zenodo (CERN European Organization for Nuclear Research) · 2026-03-31
dataset · Open access · Senior author
This dataset includes whole-rock major-trace element and Sr-Nd isotope data for post-collisional intrusive rocks from the Himalayan orogen and the Lhasa terrane.
Perovskite lasers: From optical pumping to electrical pumping
FlexMat. · 2026-03-01
article · Open access
Integrated photonics and quantum information technologies demand compact, energy-efficient, and wavelength-tunable coherent light sources. Metal halide perovskites have recently emerged as a versatile class of gain media for photonic applications, offering exceptional optical gain, compositional flexibility, and defect tolerance. This review provides a comprehensive analysis of lasing mechanisms in metal halide perovskites, encompassing free-carrier, excitonic, and strong light-matter coupling regimes that lead to polaritonic lasing. We further discuss how crystallographic dimensionality and excitonic interactions govern the optical gain landscape and influence lasing performance. Recent advances in continuous-wave and electrically pumped perovskite lasers are also critically examined in terms of material composition, device architecture, and exciton dynamics. Finally, we highlight emerging strategies to suppress Auger recombination, carrier imbalance, and thermal degradation, paving the way toward stable, electrically pumped perovskite lasers for scalable on-chip photonic and quantum information systems.
Research on Design and Characterization of a Vibration Actuator Based on 0.655PMN-0.345PT Piezoelectric Ceramics
Ferroelectrics · 2026-03-20
article
Trace2Vec: Detecting complex multi-step attacks with explainable graph neural network
Pattern Recognition · 2025-01-18 · 9 citations
article · Corresponding author
GNNMF-ATAC: A Web Server-Based ATAC-seq Motif Finding and Annotation Analysis
2025-01-10
article · Open access
Transcription Factor Binding Sites (TFBSs) are specific DNA sequences that promote or inhibit gene expression via binding regions, thereby affecting protein synthesis. TFBSs play a crucial role in gene regulatory networks, complex diseases, and human growth. Assay for Transposase-Accessible Chromatin with high-throughput Sequencing (ATAC-seq) is a high-throughput sequencing technology that provides a new perspective for studying transcription factor binding patterns (motifs). Compared with ChIP-seq technologies, ATAC-seq is more efficient at detecting binding sites. Methods based on graph neural networks have succeeded at finding ATAC-seq motifs, but deploying these models takes researchers considerable effort. To address this issue, we developed a web server named GNNMF-ATAC. GNNMF-ATAC provides TFBS prediction, motif finding, and enrichment analysis through deployed tools. Moreover, GNNMF-ATAC provides motif finding results and visualizations for 200 human and 80 mouse ENCODE ATAC-seq datasets. GNNMF-ATAC is a valuable resource for experimental biologists seeking a user-friendly tool to find ATAC-seq motifs without programming expertise. GNNMF-ATAC is freely available at http://gnnmf.online.
Intelligent Multi-Agent Collaborative Phishing Detection Framework for Lightweight Edge Devices
2025-10-21
article
With the proliferation of lightweight edge mobile terminals such as smartwatches and smart glasses, phishing attacks targeting such devices are increasingly diversified. However, due to the computational and memory constraints of edge devices, traditional cloud-based centralized detection schemes face significant challenges, while localized detection schemes struggle to balance detection accuracy and execution efficiency due to conflicts between model complexity and device computational power. This paper proposes an Intelligent Adaptive Collaborative Phishing Detection (IACPD) architecture that deploys lightweight detection models using pruning strategies on edge endpoints like smartwatches for local preliminary screening. Suspicious URLs are encrypted and transmitted to collaborative intelligent mobile devices, such as smartphones, where high-precision models perform secondary validation, thereby forming a 'lightweight initial screening—precise re-screening' cascading architecture. Multi-agent reinforcement learning is employed to dynamically ascertain optimal task migration timing, thereby establishing a hierarchical intelligent system for phishing threat identification and response that achieves efficacious task processing whilst maintaining minimal energy consumption.
Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks
arXiv (Cornell University) · 2025-10-24
preprint · Open access
Large language models remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Defending against novel jailbreaks represents a critical challenge in AI safety. Adversarial training -- designed to make models robust against worst-case perturbations -- has been the dominant paradigm for adversarial robustness. However, due to optimization challenges and difficulties in defining realistic threat models, adversarial training methods often fail on newly developed jailbreaks in practice. This paper proposes a new paradigm for improving robustness against unseen jailbreaks, centered on the Adversarial Déjà Vu hypothesis: novel jailbreaks are not fundamentally new, but largely recombinations of adversarial skills from previous attacks. We study this hypothesis through a large-scale analysis of 32 attack papers published over two years. Using an automated pipeline, we extract and compress adversarial skills into a sparse dictionary of primitives, with LLMs generating human-readable descriptions. Our analysis reveals that unseen attacks can be effectively explained as sparse compositions of earlier skills, with explanatory power increasing monotonically as skill coverage grows. Guided by this insight, we introduce Adversarial Skill Compositional Training (ASCoT), which trains on diverse compositions of skill primitives rather than isolated attack instances. ASCoT substantially improves robustness to unseen attacks, including multi-turn jailbreaks, while maintaining low over-refusal rates. We also demonstrate that expanding adversarial skill coverage, not just data scale, is key to defending against novel attacks. Warning: This paper contains content that may be harmful or offensive in nature.
Text-Enhanced Method for LLMs Domain Adaptation in Cybersecurity
Journal of Computer Information Systems · 2025-09-25
article · Corresponding author
NCL-CIR: Noise-aware Contrastive Learning for Composed Image Retrieval
2025-03-12 · 1 citation
article · 1st author · Corresponding author
Composed Image Retrieval (CIR) seeks to find a target image using a multi-modal query, which combines an image with modification text to pinpoint the target. While recent CIR methods have shown promise, they mainly focus on exploring relationships between the query pairs (image and text) through data augmentation or model design. These methods often assume perfect alignment between queries and target images, an idealized scenario rarely encountered in practice. In reality, pairs are often partially or completely mismatched due to issues like inaccurate modification texts, low-quality target images, and annotation errors. Ignoring these mismatches leads to numerous False Positive Pairs (FPPs), denoted as noise pairs in the dataset, causing the model to overfit and ultimately reducing its performance. To address this problem, we propose Noise-aware Contrastive Learning for CIR (NCL-CIR), comprising two key components: the Weight Compensation Block (WCB) and the Noise-pair Filter Block (NFB). The WCB, coupled with diverse weight maps, can ensure more stable token representations of multi-modal queries and target images. Meanwhile, the NFB, in conjunction with a Gaussian Mixture Model (GMM), predicts noise pairs by evaluating loss distributions and generates soft labels correspondingly, allowing for the design of a soft-label based Noise Contrastive Estimation (NCE) loss function. Consequently, the overall architecture helps to mitigate the influence of mismatched and partially matched samples, with experimental results demonstrating that NCL-CIR achieves exceptional performance on the benchmark datasets.

Frequent coauthors

Xusheng Xiao · 20 shared
Prateek Mittal · 19 shared
Sanjeev R. Kulkarni · Princeton University · 18 shared
Fengyuan Xu · Nanjing University · 13 shared
Kangkook Jee · The University of Texas at Dallas · 12 shared
Zhaosheng Yang · Shijiazhuang University · 9 shared
Weiyong Yang · NARI Group (China) · 7 shared
Xingshen Wei · NARI Group (China) · 7 shared
