
Matey Neykov
· Assistant Professor of Statistics and Data Science; Director of Graduate Studies · Northwestern University · Statistics
Active 2014–2026
About
Matey Neykov is an assistant professor in the Department of Statistics and Data Science at Northwestern University. He previously held faculty positions in the Department of Statistics & Data Science at Carnegie Mellon University and a postdoctoral research role at Princeton University, where he worked with Professor Han Liu. He completed his PhD in biostatistics at Harvard University under the guidance of Professors Jun S. Liu and Tianxi Cai. His research focuses on statistical machine learning, high-dimensional inference, and dimension reduction. Born and raised in Sofia, Bulgaria, he attended the Sofia High School of Mathematics and earned his bachelor's degree in applied mathematics from Sofia University. Outside of his professional work, he enjoys swimming, traveling, hiking, and photography.
Research topics
- Algorithm
- Statistics
- Computer Science
- Data Mining
- Mathematics
- Discrete mathematics
- Mathematical optimization
- Econometrics
- Combinatorics
- Mathematical economics
- Mathematical analysis
Selected publications
Information theoretic limits of robust sub-Gaussian mean estimation under star-shaped constraints
The Annals of Statistics · 2026-02-01 · 1 citation
Preprint · Open access · Senior author
We obtain the minimax rate for a mean location model with a bounded star-shaped set $K \subseteq \mathbb{R}^n$ constraint on the mean, in an adversarially corrupted data setting with Gaussian noise. We assume an unknown fraction $\epsilon \leq 1/2 - \kappa$, for some fixed $\kappa \in (0, 1/2]$, of $N$ observations are arbitrarily corrupted. We obtain a minimax risk up to proportionality constants under the squared $\ell_2$ loss of $\max(\eta^{*2}, \sigma^2 \epsilon^2) \wedge d^2$ with $\eta^* = \sup\{\eta \geq 0 : N\eta^2/\sigma^2 \leq \log M_K^{\mathrm{loc}}(\eta, c)\}$, where $\log M_K^{\mathrm{loc}}(\eta, c)$ denotes the local entropy of the set $K$, $d$ is the diameter of $K$, $\sigma^2$ is the variance and $c$ is some sufficiently large absolute constant. A variant of our algorithm achieves the same rate for settings with known or symmetric sub-Gaussian noise, with a smaller breakdown point, still of constant order. We further study the case of unknown sub-Gaussian noise and show that the rate is slightly slower: $\max(\eta^{*2}, \sigma^2 \epsilon^2 \log(1/\epsilon)) \wedge d^2$. We generalize our results to the case when $K$ is star-shaped but unbounded.
Robust mean estimation under star-shaped constraints with heavy-tailed noise
arXiv (Cornell University) · 2026-04-06
Preprint · Open access · Senior author
We study the problem of robust mean estimation with adversarially contaminated data under star-shaped constraints in a heavy-tailed noise setting, where only a finite second moment $\sigma^2$ is assumed. For a contamination level $\varepsilon$ below some constant, we show that the minimax rate of the squared $\ell_2$ loss is $\max(\delta^{*2}, \varepsilon \sigma^2) \wedge d^2$ for a star-shaped set with diameter $d$ (set $d = \infty$ if the set is unbounded), with $\delta^*$ determined via the local entropy $\log M^{\mathrm{loc}}(\delta, c)$ as $\delta^* := \sup\{\delta \geq 0 : N\delta^2/\sigma^2 \leq \log M^{\mathrm{loc}}(\delta, c)\}$, where $c$ is a sufficiently large constant. Crucially, we require that the sample size satisfies $N \gtrsim \sup_{\delta \geq 0} \log M^{\mathrm{loc}}(\delta, c)$. We also show that the minimax rate is $\max(\delta^{*2}, \varepsilon^2 \sigma^2) \wedge d^2$ for known or sign-symmetric distributions, matching the rate achieved in the Gaussian case.
Efficient Robust Constrained Signal Detection via Kolmogorov Width Approximations
arXiv (Cornell University) · 2026-05-11
Preprint · Open access · Senior author
Robust statistical inference often faces a severe computational-statistical gap when dealing with complex parameter spaces. We investigate minimax signal detection in the Gaussian sequence model under strong $\varepsilon$-contamination, where the signal belongs to a general prior constraint $K$. Existing optimal tests require computing the exact Kolmogorov $k$-width of $K$, a computationally intractable task for general non-trivial sets. We bridge this gap by proposing a polynomial-time testing framework that universally applies to balanced, type-2, and exactly 2-convex constraints. By leveraging a semidefinite programming relaxation and a modified ellipsoid method equipped with an approximate subgradient oracle, we efficiently approximate the Kolmogorov widths. Remarkably, our unconditional efficient algorithm achieves a robust detection boundary that matches existing upper bounds up to a mere polylogarithmic factor. This establishes a computationally tractable testing solution for a broad class of structured signals without requiring prior knowledge of their exact geometric complexity.
Polynomial-Time Near-Optimal Estimation over Certain Type-2 Convex Bodies
Open MIND · 2025-12-27
Preprint · 1st author · Corresponding
We develop polynomial-time algorithms for near-optimal minimax mean estimation under $\ell_2$-squared loss in a Gaussian sequence model under convex constraints. The parameter space is an origin-symmetric, type-2 convex body $K \subset \mathbb{R}^n$, and we assume additional regularity conditions: specifically, we assume $K$ is well-balanced, i.e., there exist known radii $r, R > 0$ such that $r B_2 \subseteq K \subseteq R B_2$, as well as oracle access to the Minkowski gauge of $K$. Under these and some further assumptions on $K$, our procedures achieve the minimax rate up to small factors, depending poly-logarithmically on the dimension, while remaining computationally efficient. We further extend our methodology to the linear regression and robust heavy-tailed settings, establishing polynomial-time near-optimal estimators when the constraint set satisfies the regularity conditions above. To the best of our knowledge, these results provide the first general framework for attaining statistically near-optimal performance under such broad geometric constraints while preserving computational tractability.
Characterizing the minimax rate of nonparametric regression under bounded star-shaped constraints
Electronic Journal of Statistics · 2025-01-01
Article · Open access · Senior author
We quantify the minimax rate for a nonparametric regression model over a star-shaped function class $\mathcal{F}$ with bounded diameter. We obtain a minimax rate of $\varepsilon^{*2} \wedge \operatorname{diam}(\mathcal{F})^2$ with $\varepsilon^* = \sup\{\varepsilon \geq 0 : n\varepsilon^2 \leq \log M_{\mathcal{F}}^{\mathrm{loc}}(\varepsilon, c)\}$, where $\log M_{\mathcal{F}}^{\mathrm{loc}}(\cdot, c)$ is the local metric entropy of $\mathcal{F}$, $c$ is some absolute constant scaling down the entropy radius, and our loss function is the squared population $L_2$ distance over our input space $\mathcal{X}$. In contrast to classical works on the topic [cf. 24], our results do not require functions in $\mathcal{F}$ to be uniformly bounded in sup-norm. In fact, we propose a condition that simultaneously generalizes boundedness in sup-norm and the so-called $L$-sub-Gaussian assumption that appears in the prior literature. In addition, we prove that our estimator is adaptive to the true point in the convex-constrained case, and to the best of our knowledge this is the first such estimator in this general setting. This work builds on the Gaussian sequence framework of [10] using a similar algorithmic scheme to achieve the minimax rate. Our algorithmic rate also applies with sub-Gaussian noise. We illustrate the utility of this theory with examples including multivariate monotone functions, linear functionals over ellipsoids, and Lipschitz classes.
Polynomial-Time Near-Optimal Estimation over Certain Type-2 Convex Bodies
ArXiv.org · 2025-12-27
Article · Open access · 1st author · Corresponding
We develop polynomial-time algorithms for near-optimal minimax mean estimation under $\ell_2$-squared loss in a Gaussian sequence model under convex constraints. The parameter space is an origin-symmetric, type-2 convex body $K \subset \mathbb{R}^n$, and we assume additional regularity conditions: specifically, we assume $K$ is well-balanced, i.e., there exist known radii $r, R > 0$ such that $r B_2 \subseteq K \subseteq R B_2$, as well as oracle access to the Minkowski gauge of $K$. Under these and some further assumptions on $K$, our procedures achieve the minimax rate up to small factors, depending poly-logarithmically on the dimension, while remaining computationally efficient. We further extend our methodology to the linear regression and robust heavy-tailed settings, establishing polynomial-time near-optimal estimators when the constraint set satisfies the regularity conditions above. To the best of our knowledge, these results provide the first general framework for attaining statistically near-optimal performance under such broad geometric constraints while preserving computational tractability.
Robust density estimation over star-shaped density classes
ArXiv.org · 2025-01-17
Preprint · Open access · Senior author
We establish a novel criterion for comparing the performance of two densities, $g_1$ and $g_2$, within the context of corrupted data. Utilizing this criterion, we propose an algorithm to construct a density estimator within a star-shaped density class, $\mathcal{F}$, under conditions of data corruption. We proceed to derive the minimax upper and lower bounds for density estimation across this star-shaped density class, characterized by densities that are uniformly bounded above and below (in the sup norm), in the presence of adversarially corrupted data. Specifically, we assume that a fraction $\varepsilon \leq \frac{1}{3}$ of the $N$ observations are arbitrarily corrupted. We obtain the minimax upper bound $\max\{\tau_{\overline{J}}^2, \varepsilon\} \wedge d^2$. Under certain conditions, we obtain the minimax risk, up to proportionality constants, under the squared $L_2$ loss as $\max\{\tau^{*2} \wedge d^2, \varepsilon \wedge d^2\}$, where $\tau^* := \sup\{\tau : N\tau^2 \leq \log \mathcal{M}_{\mathcal{F}}^{\mathrm{loc}}(\tau, c)\}$ for a sufficiently large constant $c$. Here, $\mathcal{M}_{\mathcal{F}}^{\mathrm{loc}}(\tau, c)$ denotes the local entropy of the set $\mathcal{F}$, and $d$ is the $L_2$ diameter of $\mathcal{F}$.
Frequent coauthors
- Jun S. Liu · 24 shared
- Han Liu · 17 shared
- Yang Ning · 17 shared
- Han Liu · 13 shared
- Sivaraman Balakrishnan · 12 shared
- Junwei Lu · Harvard University · 11 shared
- Tianxi Cai · Harvard University · 7 shared
- Ilmun Kim · 7 shared
Education
- Ph.D., Biostatistics
Harvard University
- Bachelor's degree, Applied Mathematics
Sofia University