Peng Gao

Assistant Professor · Verified

Virginia Tech · Computer Science

Active 2005–2026

h-index: 13
Citations: 711
Papers: 69 (27 in the last 5 years)

About

Peng Gao is an Assistant Professor in the Department of Computer Science at Virginia Tech. He received a Ph.D. (2019) and a master's degree (2015) in electrical engineering from Princeton University, and a B.E. in electrical and computer engineering from Shanghai Jiao Tong University (2013). His research interests include data analytics, information retrieval, machine learning, natural language processing, security, software engineering, and systems. He is based at Virginia Tech's Blacksburg campus, with additional affiliations at the Institute for Advanced Computing in Alexandria, VA, and the Virginia Tech Research Center in Arlington, VA. Contact: penggao@vt.edu, 540-231-9060.

Research topics

  • Computer Science
  • Computer Security
  • Data Mining
  • World Wide Web
  • Mathematics
  • Arithmetic
  • Data science
  • Discrete mathematics

Selected publications

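  • 2026 - Petrogenesis and geodynamic significance of post-collisional intrusive rocks in the Himalaya and Lhasa - Geochemical dataset - Sun Yat-sen University master's thesis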

    Zenodo (CERN European Organization for Nuclear Research) · 2026-03-31

    dataset · Open access · Senior author

    This dataset includes whole-rock major-trace element and Sr-Nd isotope data for post-collisional intrusive rocks from the Himalayan orogen and the Lhasa terrane.

  • Perovskite lasers: From optical pumping to electrical pumping

    FlexMat. · 2026-03-01

    article · Open access

    Integrated photonics and quantum information technologies demand compact, energy‐efficient, and wavelength‐tunable coherent light sources. Metal halide perovskites have recently emerged as a versatile class of gain media for photonic applications, offering exceptional optical gain, compositional flexibility, and defect tolerance. This review provides a comprehensive analysis of lasing mechanisms in metal halide perovskites, encompassing free‐carrier, excitonic, and strong light‐matter coupling regimes that lead to polaritonic lasing. We further discuss how crystallographic dimensionality and excitonic interactions govern the optical gain landscape and influence lasing performance. Recent advances in continuous‐wave and electrically pumped perovskite lasers are also critically examined in terms of material composition, device architecture, and exciton dynamics. Finally, we highlight emerging strategies to suppress Auger recombination, carrier imbalance, and thermal degradation, paving the way toward stable, electrically pumped perovskite lasers for scalable on‐chip photonic and quantum information systems.

  • Research on Design and Characterization of a Vibration Actuator Based on 0.655PMN-0.345PT Piezoelectric Ceramics

    Ferroelectrics · 2026-03-20

    article

  • Trace2Vec: Detecting complex multi-step attacks with explainable graph neural network

    Pattern Recognition · 2025-01-18 · 9 citations

    article · Corresponding author

  • GNNMF-ATAC: A Web Server-Based ATAC-seq Motif Finding and Annotation Analysis

    2025-01-10

    article · Open access

    Transcription Factor Binding Sites (TFBSs) are specific DNA sequences that promote or inhibit gene expression via binding regions, thereby affecting protein synthesis. TFBSs play a crucial role in gene regulatory networks, complex diseases, and the growth of humans. Assay for Transposase-Accessible Chromatin with high-throughput Sequencing (ATAC-seq) is a high-throughput sequencing technology, providing a new perspective for studying transcription factor binding patterns (motifs). Different from sequencing ChIP-seq technologies, ATAC-seq is more efficient in detecting binding sites. Methods based on graph neural network have success on ATAC-seq motifs, but researchers make considerable effort to deploy these models. To address this issue, we developed a web server named GNNMF-ATAC for users. GNNMF-ATAC provides TFBSs prediction, motif finding, and enrichment analysis via deploying tools. Moreover, GNNMF-ATAC provides motif finding results and visualizations of 200 humans and 80 mouse ENCODE ATAC-seq datasets. GNNMF-ATAC is a valuable resource for experimental biologists seeking a user-friendly tool to find ATAC-seq motifs without programming expertise. GNNMF-ATAC is freely available at http://gnnmf.online.

  • Intelligent Multi-Agent Collaborative Phishing Detection Framework for Lightweight Edge Devices

    2025-10-21

    article

    With the proliferation of lightweight edge mobile terminals such as smartwatches and smart glasses, phishing attacks targeting such devices are increasingly diversified. However, due to the computational and memory constraints of edge devices, traditional cloud-based centralized detection schemes face significant challenges, while localized detection schemes struggle to balance detection accuracy and execution efficiency due to conflicts between model complexity and device computational power. This paper proposes an Intelligent Adaptive Collaborative Phishing Detection (IACPD) architecture that deploys lightweight detection models using pruning strategies on edge endpoints like smartwatches for local preliminary screening. Suspicious URLs are encrypted and transmitted to collaborative intelligent mobile devices, such as smartphones, where high-precision models perform secondary validation, thereby forming a 'lightweight initial screening—precise re-screening' cascading architecture. Multi-agent reinforcement learning is employed to dynamically ascertain optimal task migration timing, thereby establishing a hierarchical intelligent system for phishing threat identification and response that achieves efficacious task processing whilst maintaining minimal energy consumption.

  • Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks

    arXiv (Cornell University) · 2025-10-24

    preprint · Open access

    Large language models remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Defending against novel jailbreaks represents a critical challenge in AI safety. Adversarial training -- designed to make models robust against worst-case perturbations -- has been the dominant paradigm for adversarial robustness. However, due to optimization challenges and difficulties in defining realistic threat models, adversarial training methods often fail on newly developed jailbreaks in practice. This paper proposes a new paradigm for improving robustness against unseen jailbreaks, centered on the Adversarial Déjà Vu hypothesis: novel jailbreaks are not fundamentally new, but largely recombinations of adversarial skills from previous attacks. We study this hypothesis through a large-scale analysis of 32 attack papers published over two years. Using an automated pipeline, we extract and compress adversarial skills into a sparse dictionary of primitives, with LLMs generating human-readable descriptions. Our analysis reveals that unseen attacks can be effectively explained as sparse compositions of earlier skills, with explanatory power increasing monotonically as skill coverage grows. Guided by this insight, we introduce Adversarial Skill Compositional Training (ASCoT), which trains on diverse compositions of skill primitives rather than isolated attack instances. ASCoT substantially improves robustness to unseen attacks, including multi-turn jailbreaks, while maintaining low over-refusal rates. We also demonstrate that expanding adversarial skill coverage, not just data scale, is key to defending against novel attacks. Warning: This paper contains content that may be harmful or offensive in nature.

  • Text-Enhanced Method for LLMs Domain Adaptation in Cybersecurity

    Journal of Computer Information Systems · 2025-09-25

    article · Corresponding author

  • NCL-CIR: Noise-aware Contrastive Learning for Composed Image Retrieval

    2025-03-12 · 1 citation

    article · 1st author · Corresponding author

    Composed Image Retrieval (CIR) seeks to find a target image using a multi-modal query, which combines an image with modification text to pinpoint the target. While recent CIR methods have shown promise, they mainly focus on exploring relationships between the query pairs (image and text) through data augmentation or model design. These methods often assume perfect alignment between queries and target images, an idealized scenario rarely encountered in practice. In reality, pairs are often partially or completely mismatched due to issues like inaccurate modification texts, low-quality target images, and annotation errors. Ignoring these mismatches leads to numerous False Positive Pair (FFPs) denoted as noise pairs in the dataset, causing the model to overfit and ultimately reducing its performance. To address this problem, we propose the Noise-aware Contrastive Learning for CIR (NCL-CIR), comprising two key components: the Weight Compensation Block (WCB) and the Noise-pair Filter Block (NFB). The WCB coupled with diverse weight maps can ensure more stable token representations of multi-modal queries and target images. Meanwhile, the NFB, in conjunction with the Gaussian Mixture Model (GMM) predicts noise pairs by evaluating loss distributions, and generates soft labels correspondingly, allowing for the design of the soft-label based Noise Contrastive Estimation (NCE) loss function. Consequently, the overall architecture helps to mitigate the influence of mismatched and partially matched samples, with experimental results demonstrating that NCL-CIR achieves exceptional performance on the benchmark datasets.

Frequent coauthors

  • Xusheng Xiao

    20 shared
  • Prateek Mittal

    19 shared
  • Sanjeev R. Kulkarni

    Princeton University

    18 shared
  • Fengyuan Xu

    Nanjing University

    13 shared
  • Kangkook Jee

    The University of Texas at Dallas

    12 shared
  • Zhaosheng Yang

    Shijiazhuang University

    9 shared
  • Weiyong Yang

    NARI Group (China)

    7 shared
  • Xingshen Wei

    NARI Group (China)

    7 shared