Alexandr Andoni
Associate Professor · Verified · Columbia University · Computer Science
Active 2003–2026
Research topics
- Computer Science
- Artificial Intelligence
- Mathematics
- Algorithm
- Combinatorics
- Geometry
- Discrete mathematics
Selected publications
Nearly Optimal Attention Coresets
arXiv (Cornell University) · 2026-05-07
preprint · Open access
We consider the problem of estimating the Attention mechanism in small space, and prove the existence of coresets for it of nearly optimal size. Specifically, we show that for any set of unit-norm keys and values $(K,V)$ in $\mathbb{R}^d$, there exists a subset $(K',V')$ of size at most $O(\sqrt{d}\, e^{\rho+o(\rho)}/\varepsilon)$ such that \[ \left\| \operatorname{Attn}(q,K,V)- \operatorname{Attn}(q,K',V') \right\| \le \varepsilon \] simultaneously for all queries whose norm is bounded by $\rho$. This outperforms the best known results for this problem. We also offer an improved lower bound showing that $\varepsilon$-coresets must have size $\Omega(\sqrt{d}\, e^{\rho}/\varepsilon)$.
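To make the guarantee in the abstract above concrete, here is a minimal sketch of the quantity being bounded. It assumes the standard softmax form of attention, which the abstract does not spell out, and the function names `attention` and `coreset_error` are hypothetical. It only measures the error of a given subset; it does not construct a coreset and is not the paper's method.

```python
import math

def attention(q, keys, values):
    """Softmax attention of one query q over (keys, values).

    Assumption: Attn(q, K, V) = sum_i softmax(<q, k_i>)_i * v_i.
    """
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]

def coreset_error(q, keys, values, subset):
    """Euclidean norm of Attn(q, K, V) - Attn(q, K', V') for an index subset."""
    full = attention(q, keys, values)
    sub = attention(q, [keys[i] for i in subset], [values[i] for i in subset])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(full, sub)))

if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]    # unit-norm keys
    values = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # unit-norm values
    q = [0.5, 0.5]                                  # query with small norm
    print(coreset_error(q, keys, values, [0, 1]))
```

A subset is an $\varepsilon$-coreset in the abstract's sense when this error stays at most $\varepsilon$ for every query of norm at most $\rho$, not just for one sampled query.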
Efficient Algorithms for Adversarially Robust Approximate Nearest Neighbor Search
arXiv (Cornell University) · 2026-01-01
preprint · Open access · 1st author · Corresponding
We study the Approximate Nearest Neighbor (ANN) problem under a powerful adaptive adversary that controls both the dataset and a sequence of $Q$ queries. Primarily, for the high-dimensional regime of $d = \omega(\sqrt{Q})$, we introduce a sequence of algorithms with progressively stronger guarantees. We first establish a novel connection between adaptive security and fairness, leveraging fair ANN search to hide internal randomness from the adversary with information-theoretic guarantees. To achieve data-independent performance, we then reduce the search problem to a robust decision primitive, solved using a differentially private mechanism on a Locality-Sensitive Hashing (LSH) data structure. This approach, however, faces an inherent $\sqrt{n}$ query time barrier. To break the barrier, we propose a novel concentric-annuli LSH construction that synthesizes these fairness and differential privacy techniques. The analysis introduces a new method for robustly releasing timing information from the underlying algorithm instances and, as a corollary, also improves existing results for fair ANN. In addition, for the low-dimensional regime $d = O(\sqrt{Q})$, we propose specialized algorithms that provide a strong "for-all" guarantee: correctness on every possible query with high probability. We introduce novel metric covering constructions that simplify and improve prior approaches for ANN in Hamming and $\ell_p$ spaces.
Embeddings into Similarity Measures for Nearest Neighbor Search
2025-12-14
article · 1st author · Corresponding
We introduce the notion of metric embeddings into a similarity measure over $\mathbb{R}_{+}^{m}$, such as the weighted Jaccard coefficient. We develop average embeddings into such similarity measures for a number of metric spaces, with (appropriately defined) distortion that is smaller than the best possible or known distortion of embedding into $\ell_{1}$ or $\ell_{2}$ spaces (biLipschitz or average). We complement our embeddings with a new algorithm for Approximate Nearest Neighbor Search (ANNS) that leverages such an embedding in a black box fashion. Combining these results, we obtain new efficient algorithms for ANNS under the following two classic metrics, achieving an exponential improvement to longstanding prior work:
- Edit distance over length-$k$ strings: $\operatorname{poly}(\log k)$ approximation;
- $\ell_{p}$ over $\mathbb{R}^{d}$, for $p > 2$: $O(\log p)$ approximation (known to be asymptotically optimal in relevant models of computation).
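As a small illustration of the similarity measure named in the abstract above, the weighted Jaccard coefficient of two nonnegative vectors is the sum of coordinate-wise minima divided by the sum of coordinate-wise maxima. The sketch below only evaluates that standard definition; the function name `weighted_jaccard` and the convention of returning 1.0 for two all-zero vectors are assumptions, and nothing here reproduces the paper's embeddings.

```python
def weighted_jaccard(x, y):
    """Weighted Jaccard coefficient of two nonnegative vectors of equal length.

    J(x, y) = sum_i min(x_i, y_i) / sum_i max(x_i, y_i).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("coordinates must be nonnegative")
    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    # Assumed convention: two all-zero vectors are treated as identical.
    return 1.0 if denominator == 0 else numerator / denominator

if __name__ == "__main__":
    print(weighted_jaccard([1, 2, 0], [2, 1, 0]))  # (1 + 1 + 0) / (2 + 2 + 0)
```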
A Framework for Building Data Structures from Communication Protocols
2025-06-15
preprint · Open access · 1st author · Corresponding
We present a general framework for designing efficient data structures for high-dimensional pattern-matching problems ($\exists \;? i\in[n], f(x_i,y)=1$) through communication models in which $f(x,y)$ admits sublinear communication protocols with exponentially-small error. Specifically, we reduce the data structure problem to the Unambiguous Arthur-Merlin (UAM) communication complexity of $f(x,y)$ under product distributions. We apply our framework to the Partial Match problem (a.k.a. matching with wildcards), whose underlying communication problem is sparse set-disjointness. When the database consists of $n$ points in dimension $d$, and the number of $\star$'s in the query is at most $w = c\log n \;(\ll d)$, the fastest known linear-space data structure (Cole, Gottlieb and Lewenstein, STOC'04) had query time $t \approx 2^w = n^c$, which is nontrivial only when $c<1$. By contrast, our framework produces a data structure with query time $n^{1-1/(c \log^2 c)}$ and space close to linear. To achieve this, we develop a one-sided $\varepsilon$-error communication protocol for Set-Disjointness under product distributions with $\tilde{\Theta}(\sqrt{d\log(1/\varepsilon)})$ complexity, improving on the classical result of Babai, Frankl and Simon (FOCS'86). Building on this protocol, we show that the Unambiguous AM communication complexity of $w$-Sparse Set-Disjointness with $\varepsilon$-error under product distributions is $\tilde{O}(\sqrt{w \log(1/\varepsilon)})$, independent of the ambient dimension $d$, which is crucial for the Partial Match result. Our framework sheds further light on the power of data-dependent data structures, which is instrumental for reducing to the (much easier) case of product distributions.
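For readers unfamiliar with the Partial Match problem from the abstract above, the following sketch states it as a naive linear scan: the query may contain wildcard symbols, and the task is to decide whether some database point agrees with it on every non-wildcard position. Representing points as strings, using `*` for the wildcard, and the names `matches` and `partial_match` are assumptions made for illustration. This is the trivial $O(nd)$ baseline, not the data structure described in the paper.

```python
WILDCARD = "*"  # assumed wildcard symbol (the abstract writes it as a star)

def matches(point, query):
    """True if point agrees with query on every non-wildcard position."""
    return len(point) == len(query) and all(
        q == WILDCARD or q == p for p, q in zip(point, query)
    )

def partial_match(database, query):
    """Return the index of the first matching point, or None if none exists."""
    for index, point in enumerate(database):
        if matches(point, query):
            return index
    return None

if __name__ == "__main__":
    database = ["0101", "1100", "0011"]
    print(partial_match(database, "1*0*"))  # index of a point matching 1?0?
    print(partial_match(database, "111*"))  # no point starts with 111
```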
Faster Algorithms for Average-Case Orthogonal Vectors and Closest Pair Problems
Society for Industrial and Applied Mathematics eBooks · 2025-01-01
book-chapter
We study the average-case version of the Orthogonal Vectors problem, in which one is given as input $n$ vectors from $\{0,1\}^d$ which are chosen randomly so that each coordinate is $1$ independently with probability $p$. Kane and Williams [ITCS 2019] showed how to solve this problem in time $O(n^{2 - \delta_p})$ for a constant $\delta_p > 0$ that depends only on $p$. However, it was previously unclear how to solve the problem faster in the hardest parameter regime where $p$ may depend on $d$.
Statistical-Computational Trade-offs for Density Estimation
arXiv (Cornell University) · 2024-10-30
preprint · Open access
We study the density estimation problem defined as follows: given $k$ distributions $p_1, \ldots, p_k$ over a discrete domain $[n]$, as well as a collection of samples chosen from a "query" distribution $q$ over $[n]$, output $p_i$ that is "close" to $q$. Recently, [aamand2023data] gave the first and only known result that achieves sublinear bounds in both the sampling complexity and the query time while preserving polynomial data structure space. However, their improvement over linear samples and time is only by subpolynomial factors. Our main result is a lower bound showing that, for a broad class of data structures, their bounds cannot be significantly improved. In particular, if an algorithm uses $O(n/\log^c k)$ samples for some constant $c>0$ and polynomial space, then the query time of the data structure must be at least $k^{1-O(1)/\log \log k}$, i.e., close to linear in the number of distributions $k$. This is a novel statistical-computational trade-off for density estimation, demonstrating that any data structure must use close to a linear number of samples or take close to linear query time. The lower bound holds even in the realizable case where $q=p_i$ for some $i$, and when the distributions are flat (specifically, all distributions are uniform over half of the domain $[n]$). We also give a simple data structure for our lower bound instance with asymptotically matching upper bounds. Experiments show that the data structure is quite efficient in practice.
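The problem statement in the abstract above can be illustrated with the obvious linear-time baseline: scan all $k$ distributions and return the one under which the observed samples are most likely. Using maximum likelihood as the notion of "close", and the name `closest_distribution`, are assumptions made for this sketch. It takes time linear in $k$ per query and is not the data structure from the paper.

```python
import math

def closest_distribution(distributions, samples):
    """Index of the distribution with the highest log-likelihood of the samples.

    distributions: list of k probability vectors over the domain {0, ..., n-1}.
    samples: list of domain elements drawn from the unknown query distribution.
    """
    best_index, best_score = None, -math.inf
    for index, p in enumerate(distributions):
        score = 0.0
        for s in samples:
            if p[s] == 0:
                score = -math.inf  # a sample outside the support rules p out
                break
            score += math.log(p[s])
        if best_index is None or score > best_score:
            best_index, best_score = index, score
    return best_index

if __name__ == "__main__":
    # Two flat distributions, each uniform over half of a domain of size 4.
    distributions = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
    print(closest_distribution(distributions, [2, 3, 2]))
```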
Faster Algorithms for Average-Case Orthogonal Vectors and Closest Pair Problems
arXiv (Cornell University) · 2024-10-29
preprint · Open access
We study the average-case version of the Orthogonal Vectors problem, in which one is given as input $n$ vectors from $\{0,1\}^d$ which are chosen randomly so that each coordinate is $1$ independently with probability $p$. Kane and Williams [ITCS 2019] showed how to solve this problem in time $O(n^{2 - \delta_p})$ for a constant $\delta_p > 0$ that depends only on $p$. However, it was previously unclear how to solve the problem faster in the hardest parameter regime where $p$ may depend on $d$. The best prior algorithm was the best worst-case algorithm by Abboud, Williams and Yu [SODA 2014], which in dimension $d = c \cdot \log n$, solves the problem in time $n^{2 - \Omega(1/\log c)}$. In this paper, we give a new algorithm which improves this to $n^{2 - \Omega(\log\log c /\log c)}$ in the average case for any parameter $p$. As in the prior work, our algorithm uses the polynomial method. We make use of a very simple polynomial over the reals, and use a new method to analyze its performance based on computing how its value degrades as the input vectors get farther from orthogonal. To demonstrate the generality of our approach, we also solve the average-case version of the closest pair problem in the same running time.
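The input model in the abstract above is easy to sketch: draw $n$ random 0/1 vectors with each coordinate set to 1 independently with probability $p$, then look for a pair with inner product zero. The code below does this with the quadratic-time brute force that the cited algorithms improve on. The names `random_instance` and `find_orthogonal_pair`, and searching for a pair within a single set, are assumptions made for illustration; the polynomial method from the paper is not implemented here.

```python
import random
from itertools import combinations

def random_instance(n, d, p, seed=0):
    """n vectors in {0,1}^d; each coordinate is 1 independently with probability p."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(d)] for _ in range(n)]

def find_orthogonal_pair(vectors):
    """Brute-force O(n^2 d) search for indices (i, j) with <v_i, v_j> = 0."""
    for (i, u), (j, v) in combinations(enumerate(vectors), 2):
        if all(a * b == 0 for a, b in zip(u, v)):
            return (i, j)
    return None

if __name__ == "__main__":
    vectors = random_instance(n=50, d=12, p=0.3)
    print(find_orthogonal_pair(vectors))
```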
Statistical-Computational Trade-offs for Density Estimation
2024-01-01
article
Recent grants
AF:Small: Nearest Neighbor Search in High Dimensional Spaces
NSF · $450k · 2016–2020
AF:Small: Data-Dependent Algorithms for High-Dimensional Data
NSF · $350k · 2020–2024
Frequent coauthors
- Robert Krauthgamer · Weizmann Institute of Science · 42 shared
- Piotr Indyk · Massachusetts Institute of Technology · 31 shared
- Ilya Razenshteyn · 31 shared
- Huy L. Nguyễn · 23 shared
- Krzysztof Onak · 17 shared
- Negev Shekel Nosatzki · Columbia University · 12 shared
- Aleksandar Nikolov · University of Toronto · 10 shared
- Peilin Zhong · Google (United States) · 10 shared