Yudong Chen
Associate Professor · University of Wisconsin-Madison · Computer Sciences
Active 1997–2026
About
Yudong Chen is an Associate Professor in the Department of Computer Sciences at the University of Wisconsin-Madison. His research interests include machine learning, reinforcement learning, optimization, and high-dimensional statistics. His recent work focuses on reinforcement learning theory, non-convex and nonsmooth learning problems, stochastic optimization, and approximation. His research has been recognized with awards such as the NSF CAREER Award, the Vilas Associates Award, and paper awards from ACM SIGMETRICS, INFORMS, and the Applied Probability Society. Prior to his current position, Yudong Chen was an associate professor with tenure at the School of Operations Research and Information Engineering at Cornell University. He also completed a postdoctoral fellowship in the EECS Department at the University of California, Berkeley. He holds a Ph.D. in Electrical and Computer Engineering from the University of Texas at Austin and obtained his B.S. and M.S. degrees in Automation from Tsinghua University.
Research topics
- Computer Science
- Artificial Intelligence
- Machine Learning
- Computer Security
- Operating system
- Mathematics
- Algorithm
- Distributed computing
- Telecommunications
- Statistics
- Computer network
Selected publications
Frontiers in Plant Science · 2026-03-23
Article · Open access
Introduction: Soil cadmium (Cd) contamination is considered to be one of the adverse stresses to which plants are subject. Research has demonstrated that exogenous calcium plays a crucial role in plant stress resistance. Methods: L.) exposed to Cd stress and supplied with either inorganic calcium or sorbitol-chelated calcium (SCC) at an equivalent calcium (Ca) concentration. This investigation was undertaken through integrated physiological, biochemical and transcriptomic analyses. Results: In the context of Cd stress, a marked inhibition in the growth parameters, photosynthetic activity, and root architecture of peanut seedlings was observed. This inhibition resulted in a significant accumulation of reactive oxygen species (ROS) within the plants. The application of exogenous calcium has been demonstrated to effectively alleviate Cd toxicity, with SCC exhibiting particularly notable efficacy in this regard. In comparison with Cd treatment, SCC significantly improved plant growth parameters and photosynthetic efficiency. Furthermore, SCC significantly enhanced superoxide dismutase (SOD) activity in tissues while concomitantly reducing malondialdehyde (MDA) and ROS levels, thereby mitigating membrane lipid oxidation. Concurrently, the analysis revealed that the SCC samples exhibited an upregulation of key genes, including AUX/IAA, GH3, SAUR, and JAZ. These genes have been implicated in promoting root growth and activating defence-related hormone pathways. Structural equation modelling further indicated that chlorophyll fluorescence exerted a significant positive influence on biomass accumulation, while excessive reactive oxygen species and osmotic regulators served as major inhibitory factors. Discussion: Consequently, SCC effectively mitigates Cd toxicity by stabilising photosynthetic systems, enhancing antioxidant defences, and regulating hormonal signalling, thereby promoting recovery of peanut seedling growth. The present study offers novel insights and a scientific basis for the efficient utilisation of Ca-containing fertilisers and the mitigation of heavy metal pollution in agricultural fields.
ArXiv.org · 2025-05-28
Preprint · Open access
We reveal that feedforward network (FFN) layers, rather than attention layers, are the primary contributors to Vision Transformer (ViT) inference latency, with their impact becoming more significant as model size increases. This finding highlights a critical opportunity for optimizing the efficiency of large-scale ViTs by focusing on FFN layers. In this work, we propose a novel channel idle mechanism that facilitates post-training structural reparameterization for efficient FFN layers during testing. Specifically, a set of feature channels remains idle and bypasses the nonlinear activation function in each FFN layer, thereby forming a linear pathway that enables structural reparameterization during inference. This mechanism results in a family of ReParameterizable Vision Transformers (RePaViTs), which achieve remarkable latency reductions with acceptable sacrifices (sometimes gains) in accuracy across various ViTs. The benefits of our method scale consistently with model sizes, demonstrating greater speed improvements and progressively narrowing accuracy gaps or even higher accuracies on larger models. In particular, RePa-ViT-Large and RePa-ViT-Huge enjoy 66.8% and 68.7% speed-ups with +1.7% and +1.1% higher top-1 accuracies under the same training strategy, respectively. To the best of our knowledge, RePaViT is the first to employ structural reparameterization on FFN layers to expedite ViTs, and we believe that it represents an auspicious direction for efficient ViTs. Source code is available at https://github.com/Ackesnal/RePaViT.
Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL
arXiv (Cornell University) · 2025-06-26
Preprint · Open access · Senior author
We study offline reinforcement learning in average-reward MDPs, which presents increased challenges from the perspectives of distribution shift and non-uniform coverage, and has been relatively underexamined from a theoretical perspective. While previous work obtains performance guarantees under single-policy data coverage assumptions, such guarantees utilize additional complexity measures which are uniform over all policies, such as the uniform mixing time. We develop sharp guarantees depending only on the target policy, specifically the bias span and a novel policy hitting radius, yielding the first fully single-policy sample complexity bound for average-reward offline RL. We are also the first to handle general weakly communicating MDPs, contrasting restrictive structural assumptions made in prior work. To achieve this, we introduce an algorithm based on pessimistic discounted value iteration enhanced by a novel quantile clipping technique, which enables the use of a sharper empirical-span-based penalty function. Our algorithm also does not require any prior parameter knowledge for its implementation. Remarkably, we show via hard examples that learning under our conditions requires coverage assumptions beyond the stationary distribution of the target policy, distinguishing single-policy complexity measures from previously examined cases. We also develop lower bounds nearly matching our main result.
ArXiv.org · 2025-02-03
Preprint · Open access · Senior author
This paper explores how theory can guide and enhance practical algorithms, using Low-Rank Adaptation (LoRA, Hu et al. 2022) in large language models as a case study. We rigorously prove that, under gradient descent, LoRA adapters align with specific singular subspaces of the one-step full fine-tuning gradient. This result suggests that, by properly initializing the adapters using the one-step full gradient, subspace alignment can be achieved immediately, and that this applies to both linear and nonlinear models. Building on our theory, we propose a theory-driven algorithm, LoRA-One, for which linear convergence (as well as generalization) is established and for which incorporating preconditioners theoretically helps mitigate the effects of ill-conditioning. In addition, our theory reveals connections between LoRA-One and other gradient-alignment-based methods, helping to clarify misconceptions in the design of such algorithms. LoRA-One achieves significant empirical improvements over LoRA and its variants across benchmarks in natural language understanding, mathematical reasoning, and code generation. Code is available at: https://github.com/YuanheZ/LoRA-One.
Microbial Ecology · 2025-07-02 · 2 citations
Article · Open access
The hyporheic zone (HZ) of treated sewage-dominated rivers serves as a critical biogeochemical hotspot for dissolved organic nitrogen (DON) transformation, yet the mechanisms linking DON chemodiversity to microbial community dynamics remain poorly resolved. This study integrated spectroscopic fingerprinting, machine learning, and partial least squares path modeling (PLS-PM) to unravel the interactions between redox-stratified DON fractions and microbial consortia in two effluent-impacted rivers (Xi'an, China). The results revealed that DOM spectral parameters associated with distinct DON characteristics had distinct effects on microbial communities, with the communities in oxic zones largely impacted by autobiogenic, aromatic, and protein-like DON, while the communities in suboxic zones were more intensely impacted by the humification degree of DON. Microbial communities exhibited redox-dependent niche differentiation; i.e., keystone taxa in oxic zones (e.g., Gamma-Proteobacteria) drove nitrogen assimilation, while suboxic taxa (e.g., Verrucomicrobia) prioritized stress-resistant D-amino acid metabolism. PLS-PM demonstrated that biomarkers exerted stronger control on nitrogen cycling (|path coefficients| > 0.6, P < 0.05) than keystone taxa, with summer communities showing higher model fit. Treated sewage-derived DON fostered specialized consortia through biochemical trade-offs, i.e., methionine recycling in oxic zones versus peptidoglycan modification in suboxic zones, thus highlighting the critical role of HZ in mitigating nitrogen pollution. These findings advance predictive modeling of DON-microbe interactions in anthropogenically perturbed aquatic ecosystems.
Resolving Recapture Dynamics of Rydberg Electrons via Laser-Driven Frustrated Tunneling Ionization
Physical Review Letters · 2025-03-26 · 5 citations
Article
By employing two-color counterrotating circularly polarized laser fields, we investigate the dynamics of electron recapture into Rydberg states under strong, ultrashort laser pulses, probed via coherent extreme-ultraviolet free-induction decay (XFID). Our study reveals significant distinctions between XFID and above-threshold high-order harmonic generation in terms of their ellipticity dependence on the driving-laser waveforms, yield variations with the laser-intensity ratios, and sensitivity to the driving-laser ellipticity. All these differences arise from the fundamentally distinct electron trajectories underlying the two processes. More importantly, our findings provide compelling evidence that Rydberg-electron recapture predominantly occurs at the end of the driving laser field, offering the first direct experimental confirmation of this long-proposed mechanism.
Cost-effective dynamic sampling in high dimensional online monitoring with advantage actor-critic
International Journal of Production Research · 2025-01-21 · 2 citations
Article
Antibacterial Naphthoquinones from a Nicotiana tabacum Derived Endophytic Fusarium solani
Chemistry of Natural Compounds · 2025-01-01 · 1 citation
Article · 1st author · Corresponding
Plants · 2025-02-04 · 1 citation
Article · Open access · Corresponding
Hydrochar (HC) incorporation affects soil nitrogen (N) transformation, which could further affect the N leaching loss. We conducted a soil lysimeter experiment to evaluate the responses in terms of N leaching and rice yield to HC applied at a low (0.5%) or high (1.5%) rate, while considering three N inputs, i.e., 240, 192, and 144 kg/ha (named N240, N192, and N144, respectively). The results showed that the rice grain yield was highest (124.3 g/pot) for N192, while being significantly reduced to the minimum yield achieved in the study (110.3 g/pot) for N144. Interestingly, for the N input 144 kg/ha, HC application increased the rice grain yield by 6.9–8.0%, which was equivalent to that of N240. NH4+-N leaching occurred mainly during the first 4 weeks of the rice season, and HC did not influence NH4+-N leaching for either of the N inputs 192 and 240 kg/ha. However, compared to N144, N144 + HC1.5% recorded a significantly higher NH4+-N leaching loss of 34.6%. This suggests that the application of a high amount of HC increases the NH4+-N leaching risk when the N input is low. HC application resulted in 10.2–45.3% more NO3−-N leaching loss when the three N inputs were applied, the effect of which was significant in regard to the applications involving a 20 and 40% N reduction, but this occurred only with the applied treatments involving 1.5% HC. Moreover, we found that organic N was the main form of leachate N (>80%). More specifically, N144 + HC recorded 7.8–8.3% lower organic N leaching than N192. Based on the effects of HC on the rice grain yield and N leaching, we recommend applications involving a 40% N reduction (N144) with a lower amount of HC (HC 0.5%) to ensure high crop production and to protect the water environment.
A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression
2025-06-04 · 2 citations
Article
Motivated by robust and quantile regression problems, we investigate the stochastic gradient descent (SGD) algorithm for minimizing an objective function f that is locally strongly convex with a sub-quadratic tail. This setting covers many widely used online statistical methods. We introduce a novel piecewise Lyapunov function that enables us to handle functions f with only first-order differentiability, which includes a wide range of popular loss functions such as Huber loss. Leveraging our proposed Lyapunov function, we derive finite-time moment bounds under general diminishing stepsizes, as well as constant stepsizes. We further establish the weak convergence, central limit theorem and bias characterization under constant stepsize, providing the first geometrical convergence result for sub-quadratic SGD. Our results have wide applications, especially in online statistical methods. In particular, we discuss two applications of our results. 1) Online robust regression: We consider a corrupted linear model with sub-exponential covariates and heavy-tailed noise. Our analysis provides convergence rates comparable to those for corrupted models with Gaussian covariates and noise. 2) Online quantile regression: Importantly, our results relax the common assumption in prior work that the conditional density is continuous and provide a more fine-grained analysis for the moment bounds.
Recent grants
NSF · $369k · 2017–2022
NSF · $209k · 2021–2022
CRII: CIF: Limits and Robustness of Nonconvex Low-Rank Estimation
NSF · $175k · 2017–2020
Frequent coauthors
- Zhensheng Tao · 28 shared
- Jiaming Xu · 26 shared
- Zongyuan Fu (Fudan University) · 24 shared
- Constantine Caramanis · 21 shared
- Bingbing Zhu · 21 shared
- Sainan Peng (Fudan University) · 20 shared
- Xiaoshuai Hang (Ministry of Ecology and Environment) · 20 shared
- Xiaodong Li · 20 shared
Labs
Convexified Modularity Maximization for Community Detection · PI
Community detection in graphs using semidefinite programming relaxation and doubly weighted k-median clustering.
Education
Ph.D., Electrical and Computer Engineering
University of Texas at Austin
M.S., Automation
Tsinghua University
B.S., Automation
Tsinghua University
Awards & honors
- NSF CAREER Award
- Vilas Associates Award
- INFORMS Paper Award from the Applied Probability Society
- Best Student Paper Award from ACM SIGMETRICS 2023
- INFORMS Applied Probability Society Best Student Paper Prize…