
Andy Sun
Iberdrola-Avangrid Professor in Electric Power Systems · Massachusetts Institute of Technology · Operations Research and Statistics
Active 2013–2026
About
Andy Sun is the Iberdrola-Avangrid Professor in Electric Power Systems and an Associate Professor in Operations Research at MIT Sloan. His research focuses on power systems, including the modeling and analysis of power flow equations, power grid stability, and renewable energy integration. He has contributed data-driven models and performance analyses for power systems, with publications in journals such as the INFORMS Journal on Optimization, Automatica, and IEEE Transactions on Power Systems. His work applies optimization, complex fixed point analysis, and holomorphic dynamics to electric power systems, with the aim of improving stability, efficiency, and sustainability in energy management.
Research topics
- Mathematics
- Mathematical analysis
- Pure mathematics
- Computer Science
- Artificial Intelligence
- Algorithm
- Geometry
- Applied mathematics
Selected publications
A synthetic data approach for FDR control in change-point detection
Statistics Innovation · 2026-01-01
article · Open access · 1st author · Corresponding
In multiple change-point analysis, the resulting detection sets are typically conservative, often identifying more change points than actually exist, due to the issues of 'unreliability of assumptions' and 'unreliability of algorithms'. Therefore, controlling the false discovery rate is of vital importance to multiple change-point detection. Data-splitting-based methods have gained widespread attention for false discovery rate control. However, relying solely on a part of the dataset during the validation stage typically suffers from power loss. Instead, the study introduces a novel synthetic data framework and proposes the Synthetic Data Filter to control the false discovery rate in multiple change-point detection. Here, the study demonstrates that the proposed method effectively controls the false discovery rate and achieves asymptotic power approaching one under mild conditions. Numerical comparisons with existing methods provide evidence for the superiority of the approach in terms of both false discovery rate control and statistical power. The proposed method is further applied to a bladder tumor microarray dataset, and potential loci are identified with structural changes.
On the long‐time limit of the mean curvature flow in closed manifolds
Journal of the London Mathematical Society · 2026-01-01
article · Senior author
In this article, we show that generally almost regular flows, introduced by Bamler and Kleiner, in closed 3‐manifolds will either go extinct in finite time or flow to a collection of smooth embedded minimal surfaces, possibly with multiplicity. Using a perturbative argument, we then construct piecewise almost regular flows that either go extinct in finite time or flow to a stable minimal surface, possibly with multiplicity. We apply these results to construct minimal surfaces in 3‐manifolds in a variety of circumstances, mainly novel from the point of view that the arguments are via parabolic methods.
Rigidity of spherical product Ricci solitons
Communications in Analysis and Geometry · 2025-01-01
article · 1st author · Corresponding
Regularity of cylindrical singular sets of mean curvature flow
ArXiv.org · 2025-09-01
preprint · Open access · 1st author · Corresponding
In this paper, we study the $k$-cylindrical singular set of mean curvature flow in $\mathbb R^{n+1}$ for each $1\leq k\leq n-1$. We prove that they are locally contained in a $k$-dimensional $C^{2,\alpha}$-submanifold after removing some lower-dimensional parts. Moreover, if the $k$-cylindrical singular set is a $k$-submanifold, then its curvature is determined by the asymptotic profile of the flow at these singularities. As a byproduct, we provide a detailed asymptotic profile and graphical radius estimate at these singularities. The proof is based on a new $L^2$-distance non-concentration property that we introduced in [SWX25], modified into a relative version that allows us to work modulo those low eigenmodes that are not decaying fast enough and do not contribute to the curvature of the singular set.
On Mean Curvature Flow Translators with Prescribed Ends
Archive for Rational Mechanics and Analysis · 2025-08-18
article · Open access · 1st author · Corresponding
Given a smooth closed embedded self-shrinker $S$ with index $I$ in $\mathbb{R}^{n}$, we construct an $I$-dimensional family of complete translators polynomially asymptotic to $S\times \mathbb{R}$ at infinity, which answers a long-standing question by Ilmanen. We further prove that $\mathbb{R}^{n+1}$ can be decomposed in many ways into a one-parameter family of closed sets $\coprod_{a\in \mathbb{R}} T_a$, and each closed set $T_a$ contains a complete translator asymptotic to $S\times \mathbb{R}$ at infinity. If the closed set $T_a$ fattens, namely it has nonempty interior, then there are at least two translators asymptotic to each other at an exponential rate, which can be viewed as a kind of nonuniqueness. We show that this fattening phenomenon is non-generic but indeed happens.
Generic dynamics of mean curvature flows with asymptotically conical singularities
Science China Mathematics · 2025-07-03 · 1 citation
article · 1st author · Corresponding
MiniCPM4: Ultra-Efficient LLMs on End Devices
ArXiv.org · 2025-06-09
preprint · Open access
This paper introduces MiniCPM4, a highly efficient large language model (LLM) designed explicitly for end-side devices. We achieve this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems. Specifically, in terms of model architecture, we propose InfLLM v2, a trainable sparse attention mechanism that accelerates both prefilling and decoding phases for long-context processing. Regarding training data, we propose UltraClean, an efficient and accurate pre-training data filtering and generation strategy, and UltraChat v2, a comprehensive supervised fine-tuning dataset. These datasets enable satisfactory model performance to be achieved using just 8 trillion training tokens. Regarding training algorithms, we propose ModelTunnel v2 for efficient pre-training strategy search, and improve existing post-training methods by introducing chunk-wise rollout for load-balanced reinforcement learning and a data-efficient ternary LLM, BitCPM. Regarding inference systems, we propose CPM.cu, which integrates sparse attention, model quantization, and speculative sampling to achieve efficient prefilling and decoding. To meet diverse on-device requirements, MiniCPM4 is available in two versions, with 0.5B and 8B parameters, respectively. Furthermore, we construct a hybrid reasoning model, MiniCPM4.1, which can be used in both deep reasoning mode and non-reasoning mode. Evaluation results demonstrate that MiniCPM4 and MiniCPM4.1 outperform similar-sized open-source models across benchmarks, with the 8B variants showing significant speed improvements on long sequence understanding and generation.
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
ArXiv.org · 2025-02-20
preprint · Open access
Speculative sampling has emerged as an important technique for accelerating the auto-regressive generation process of large language models (LLMs) by utilizing a draft-then-verify mechanism to produce multiple tokens per forward pass. While state-of-the-art speculative sampling methods use only a single layer and a language modeling (LM) head as the draft model to achieve impressive layer compression, their efficiency gains are substantially reduced for large-vocabulary LLMs, such as Llama-3-8B with a vocabulary of 128k tokens. To address this, we present FR-Spec, a frequency-ranked speculative sampling framework that optimizes draft candidate selection through vocabulary space compression. By constraining the draft search to a frequency-prioritized token subset, our method reduces LM Head computation overhead by 75% while ensuring the equivalence of the final output distribution. Experiments across multiple datasets demonstrate an average of 1.12$\times$ speedup over the state-of-the-art speculative sampling method EAGLE-2. Code available at https://github.com/thunlp/FR-Spec.
Mean curvature flow with multiplicity 2 convergence in manifolds
Mathematische Annalen · 2025-04-03
article · Open access · Senior author
We construct new examples of immortal mean curvature flow of smooth embedded connected hypersurfaces in manifolds, which converge to minimal hypersurfaces with multiplicity 2 as time approaches infinity.
ArXiv.org · 2025-11-05
preprint · Open access
Reinforcement learning (RL) post-training has become a trending paradigm for enhancing the capabilities of large language models (LLMs). Most existing RL systems for LLMs operate in a fully synchronous manner, where training must wait for the rollout of an entire batch to complete. This design leads to severe inefficiencies, as extremely long trajectories can stall the entire rollout process and leave many GPUs idle. To address this issue, we propose Concurrency-Controlled Partial Rollout with Importance Sampling (CoPRIS), which mitigates long-tail inefficiencies by maintaining a fixed number of concurrent rollouts, early-terminating once sufficient samples are collected, and reusing unfinished trajectories in subsequent rollouts. To mitigate the impact of off-policy trajectories, we introduce Cross-stage Importance Sampling Correction, which concatenates buffered log probabilities from the previous policy with those recomputed under the current policy for importance sampling correction. Experiments on challenging mathematical reasoning benchmarks show that CoPRIS achieves up to 1.94x faster training while maintaining comparable or superior performance to synchronous RL systems. The code of CoPRIS is available at https://github.com/777pomingzi/CoPRIS.
Frequent coauthors
- Zhengjiang Lin · 4 shared
- Zhichao Zhang · 4 shared
- Douglas Stryker · 3 shared
- Zhichao Wang · China Medical University · 3 shared
- Xin Zhou · Cornell University · 3 shared
- Julius Baldauf · Massachusetts Institute of Technology · 3 shared
- Jinxin Xue · 3 shared
- Jingwen Chen · 2 shared
Awards & honors
- Iberdrola-Avangrid Professor in Electric Power Systems