Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Mengyang Gu

Associate Professor · Verified

University of California, Santa Barbara · Statistics and Applied Probability

Active 2010–2026

h-index: 13
Citations: 843
Papers: 73 (45 in last 5y)
Funding: $590k (1 active)

Research topics

  • Artificial Intelligence
  • Machine Learning
  • Computer Science
  • Algorithm
  • Mathematics
  • Chemistry
  • Thermodynamics
  • Chromatography
  • Physics
  • Chemical physics
  • Materials science
  • Statistics
  • Biological system

Selected publications

  • Unsupervised Cell Segmentation by Fast Gaussian Processes

    The New England Journal of Statistics in Data Science · 2026-01-01

    article · Open access · Senior author

    Cell boundary information is crucial for analyzing cell behaviors from time-lapse microscopy videos. Existing supervised cell segmentation tools, such as ImageJ, require tuning various parameters and rely on restrictive assumptions about the shape of the objects. While recent supervised segmentation tools based on convolutional neural networks enhance accuracy, they depend on high-quality labeled images, making them unsuitable for segmenting new types of objects not in the database. We developed a novel unsupervised cell segmentation algorithm based on fast Gaussian processes for noisy microscopy images without the need for parameter tuning or restrictive assumptions about the shape of the object. We derived robust thresholding criteria adaptive for heterogeneous images containing distinct brightness at different parts to separate objects from the background, and employed watershed segmentation to distinguish touching cell objects. Both simulated studies and real-data analysis of large microscopy images demonstrate the scalability and accuracy of our approach compared with the alternatives.

  • Handbook of Bayesian, Fiducial, and Frequentist Inference

    Journal of the American Statistical Association · 2025-01-17

    article · 1st author · Corresponding

  • The inverse Kalman filter

    Biometrika · 2025-01-01 · 1 citation

    article · Open access · Senior author

    We introduce the inverse Kalman filter, which enables exact matrix-vector multiplication between a covariance matrix from a dynamic linear model and any real-valued vector with linear computational cost. We integrate the inverse Kalman filter with the conjugate gradient algorithm, which substantially accelerates the computation of matrix inversion for a general form of covariance matrix, where other approximation approaches may not be directly applicable. We demonstrate the scalability and efficiency of the proposed approach through applications in nonparametric estimation of particle interaction functions, using both simulations and cell trajectories from microscopy data.

  • Neural operators for forward and inverse potential–density mappings in classical density functional theory

    The Journal of Chemical Physics · 2025-10-28 · 2 citations

    article

    Neural operators are capable of capturing nonlinear mappings between infinite-dimensional functional spaces, offering a data-driven approach to modeling complex functional relationships in classical density functional theory. In this work, we evaluate the performance of several neural operator architectures in learning the functional relationships between the one-body density profile ρ(x), the one-body direct correlation function c1(x), and the external potential Vext(x) of inhomogeneous one-dimensional hard-rod fluids, using training data generated from analytical solutions of the underlying statistical-mechanical model. Several variants of the Deep Operator Network (DeepONet) and the Fourier Neural Operator (FNO) were considered, each incorporating different machine-learning architectures, activation functions, and training strategies. These operator learning methods are benchmarked against a fully connected dense neural network, which serves as a baseline. We compared their performance in terms of the mean squared error loss in establishing the functional relationships as well as in predicting the excess free energy across two test sets: (1) a group test set generated via random cross-validation (CV) to assess interpolation capability and (2) a newly constructed dataset for leave-one-group CV to evaluate extrapolation performance. Our results show that FNO achieves the most accurate predictions of the excess free energy, with the squared ReLU activation function outperforming other activation choices. Among the DeepONet variants, the Residual Multiscale Convolutional Neural Network (RMSCNN) combined with a trainable Gaussian derivative kernel (GK-RMSCNN-DeepONet) demonstrates the best performance. 
    Additionally, we applied the trained models to solve for the density profiles at various external potentials and compared the results with those obtained from the direct mapping Vext ↦ ρ with neural operators, as well as with Gaussian process regression combined with active learning by error control, which has shown strong performance in previous studies. While the direct mapping from Vext ↦ ρ suffers from high extrapolation error and proves inefficient for out-of-distribution predictions, the neural-operator mapping ρ ↦ c1 can effectively be used to solve the density profile via the Euler-Lagrange equation or be integrated with other surrogate methods. Moreover, neural operators offer additional flexibility through specialized operations, such as significance-based predictions on uneven grids (as in GK-CNN-DeepONet) and adaptive grid resolution adjustment (as in FNO), both of which can enhance prediction accuracy.

  • Fast phase prediction of charged polymer blends by white-box machine learning surrogates

    ArXiv.org · 2025-09-08

    preprint · Open access · Senior author

    Compatibilized polymer blends are a complex, yet versatile and widespread category of material. When the components of a binary blend are immiscible, they are typically driven towards a macrophase-separated state, but with the introduction of electrostatic interactions, they can be either homogenized or shifted to microphase separation. However, both experimental and simulation approaches face significant challenges in efficiently exploring the vast design space of charge-compatibilized polymer blends, encompassing chemical interactions, architectural properties, and composition. In this work, we introduce a white-box machine learning approach integrated with polymer field theory to predict the phase behavior of these systems, which is significantly more accurate than conventional black-box machine learning approaches. The random phase approximation (RPA) calculation is used as a testbed to determine polymer phases. Instead of directly predicting the polymer phase output of RPA calculations from a large input space by a machine learning model, we build a parallel partial Gaussian process model to predict the most computationally intensive component of the RPA calculation that only involves polymer architecture parameters as inputs. This approach substantially reduces the computational cost of the RPA calculation across a vast input space with nearly 100% accuracy for out-of-sample prediction, enabling rapid screening of polymer blend charge-compatibilization designs. More broadly, the white-box machine learning strategy offers a promising approach for dramatic acceleration of polymer field-theoretic methods for mapping out polymer phase behavior.

  • Economic intellectual growth significance and youth unemployment: a cross-country analysis (2000-2023)

    International Journal of Intellectual Property Management · 2025-01-01

    article · 1st author · Corresponding

    The study presents a dynamic interaction between the pace of intellectual economic growth and youth unemployment in four countries: Brazil, South Africa, Germany, and the USA. The analysis spans 23 years, from 2000 to 2023. A deeper study of secondary data gathered from reliable worldwide sources is investigated in this research project using the quantitative research approach. According to the data, youth unemployment was predicted and influenced by economic growth at different times. The influence of major variables, such as the inflation rate, wage levels, and foreign direct investment (FDI), varied among the countries. Unexpectedly, South Africa's GDP growth rate correlates adversely with youth unemployment, while foreign direct investment correlates positively. The Brazil analysis identified no significant predictors, suggesting other variables may affect the model. The study employs regression models and Granger causality tests to find trends, correlations, and probable causal links between economic indicators and teenage labour market outcomes. Pre-processing and data extraction with STATA organise and purify data for research analysis.

  • Neural Operators for Forward and Inverse Potential-Density Mappings in Classical Density Functional Theory

    ArXiv.org · 2025-06-07

    preprint · Open access

    Neural operators are capable of capturing nonlinear mappings between infinite-dimensional functional spaces, offering a data-driven approach to modeling complex functional relationships in classical density functional theory (cDFT). In this work, we evaluate the performance of several neural operator architectures in learning the functional relationships between the one-body density profile $ρ(x)$, the one-body direct correlation function $c_1(x)$, and the external potential $V_{ext}(x)$ of inhomogeneous one-dimensional (1D) hard-rod fluids, using training data generated from analytical solutions of the underlying statistical-mechanical model. We compared their performance in terms of the Mean Squared Error (MSE) loss in establishing the functional relationships as well as in predicting the excess free energy across two test sets: (1) a group test set generated via random cross-validation (CV) to assess interpolation capability, and (2) a newly constructed dataset for leave-one-group CV to evaluate extrapolation performance. Our results show that FNO achieves the most accurate predictions of the excess free energy, with the squared ReLU activation function outperforming other activation choices. Among the DeepONet variants, the Residual Multiscale Convolutional Neural Network (RMSCNN) combined with a trainable Gaussian derivative kernel (GK-RMSCNN-DeepONet) demonstrates the best performance. Additionally, we applied the trained models to solve for the density profiles at various external potentials and compared the results with those obtained from the direct mapping $V_{ext} \mapsto ρ$ with neural operators, as well as with Gaussian Process Regression (GPR) combined with Active Learning by Error Control (ALEC), which has shown strong performance in previous studies.

  • Learning from Landmarks, Curves, Surfaces, and Shapes in Geomstats

    ACM Transactions on Mathematical Software · 2025-12-12 · 1 citation

    article

    We introduce the shape module of the Python package Geomstats to analyze shapes of objects represented as landmarks, curves, and surfaces across fields of natural sciences and engineering. The shape module first implements widely used shape spaces, such as the Kendall shape space, as well as elastic spaces of discrete curves and surfaces. The shape module further implements the abstract mathematical structures of group actions, fiber bundles, quotient spaces, and associated Riemannian metrics which allow users to build their own shape spaces. The Riemannian geometry tools enable users to compare, average, interpolate between shapes inside a given shape space. These essential operations can then be leveraged to perform statistics and machine learning on shape data. We present the object-oriented implementation of the shape module along with illustrative examples and show how it can be used to perform statistics and machine learning on shape spaces.

  • Synergizing chemical and AI communities for advancing laboratories of the future

    ArXiv.org · 2025-10-18

    preprint · Open access · Senior author

    The development of automated experimental facilities and the digitization of experimental data have introduced numerous opportunities to radically advance chemical laboratories. As many laboratory tasks involve predicting and understanding previously unknown chemical relationships, machine learning (ML) approaches trained on experimental data can substantially accelerate the conventional design-build-test-learn process. This outlook article aims to help chemists understand and begin to adopt ML predictive models for a variety of laboratory tasks, including experimental design, synthesis optimization, and materials characterization. Furthermore, this article introduces how artificial intelligence (AI) agents based on large language models can help researchers acquire background knowledge in chemical or data science and accelerate various aspects of the discovery process. We present three case studies in distinct areas to illustrate how ML models and AI agents can be leveraged to reduce time-consuming experiments and manual data analysis. Finally, we highlight existing challenges that require continued synergistic effort from both experimental and computational communities to address.

  • Universal Phase Identification of Block Copolymers From Physics‐Informed Machine Learning

    Journal of Polymer Science · 2025-01-25 · 11 citations

    article · Open access · Senior author · Corresponding

    Block copolymers play a vital role in materials science due to their diverse self‐assembly behavior. Traditionally, exploring the block copolymer self‐assembly and associated structure–property relationships involves iterative synthesis, characterization, and theory, which is labor‐intensive both experimentally and computationally. Here, we introduce a versatile, high‐throughput workflow toward materials discovery that integrates controlled polymerization and automated chromatographic separation with a novel physics‐informed machine‐learning algorithm for the rapid analysis of small‐angle X‐ray scattering data. Leveraging the expansive and high‐quality experimental data sets generated by fractionating polymers using automated chromatography, this machine‐learning method effectively reduces data dimensionality by extracting chemical‐independent features from SAXS data. This new approach allows for the rapid and accurate prediction of morphologies without repetitive and time‐consuming manual analysis, achieving out‐of‐sample predictive accuracy of around 95% for both novel and existing materials in the training data set. By focusing on a subset of samples with large predictive uncertainty, only a small fraction of the samples needs to be inspected to further improve accuracy. Collectively, the synergistic combination of controlled synthesis, automated chromatography, and data‐driven analysis creates a powerful workflow that markedly expedites the discovery of structure–property relationships in advanced soft materials.
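
Illustrative sketch for "The inverse Kalman filter" entry above: the abstract pairs a linear-cost covariance matrix-vector product with the conjugate gradient algorithm. The toy below shows only that general pattern and is not the paper's method. It assumes the simplest possible case, an AR(1)-type covariance with entries rho^|i-j|, where a well-known two-pass recursion gives the exact product in O(n). The names `ar1_cov_matvec` and `conjugate_gradient`, the value of `rho`, and the added noise term are all hypothetical choices made for illustration.

```python
from typing import Callable, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def ar1_cov_matvec(v: Vector, rho: float) -> Vector:
    """Exact product S @ v for S[i][j] = rho ** abs(i - j), in O(n) time.

    Toy stand-in (assumption): this recursion works for this one kernel only;
    it is not the inverse Kalman filter described in the abstract.
    """
    n = len(v)
    fwd = [0.0] * n  # fwd[i] = sum over j <= i of rho**(i - j) * v[j]
    bwd = [0.0] * n  # bwd[i] = sum over j > i of rho**(j - i) * v[j]
    acc = 0.0
    for i in range(n):
        acc = v[i] + rho * acc
        fwd[i] = acc
    acc = 0.0
    for i in range(n - 2, -1, -1):
        acc = rho * (v[i + 1] + acc)
        bwd[i] = acc
    return [f + g for f, g in zip(fwd, bwd)]


def conjugate_gradient(
    matvec: Callable[[Vector], Vector],
    b: Vector,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Vector:
    """Solve A x = b for symmetric positive definite A using only products A @ v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the starting guess x = 0
    p = list(r)  # current search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Usage: solve (S + noise * I) x = b without ever forming S.
rho, noise = 0.7, 0.5
b = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]


def noisy_matvec(u: Vector) -> Vector:
    return [s + noise * ui for s, ui in zip(ar1_cov_matvec(u, rho), u)]


x = conjugate_gradient(noisy_matvec, b)
```

Because the solver touches the covariance only through `matvec`, each iteration costs O(n) here instead of the O(n^2) of a dense product, which is the kind of saving the abstract attributes to combining a fast matrix-vector product with conjugate gradient.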

Recent grants

Frequent coauthors

  • James O. Berger

    14 shared
  • Jesús Palomo

    12 shared
  • Xiaojing Wang

    10 shared
  • K. R. Anderson

    United States Geological Survey

    9 shared
  • Oliver James

    Institute for Basic Science

    9 shared
  • Hanmo Li

    8 shared
  • Jianzhong Wu

    University of California, Riverside

    6 shared
  • Xinyi Fang

    Chinese Academy of Medical Sciences & Peking Union Medical College

    6 shared

Labs
