
About
Surya Ganguli is an Associate Professor of Applied Physics at Stanford University, with courtesy appointments in Neurobiology and Electrical Engineering. His research focuses on theoretical neuroscience and machine learning, aiming to understand how networks of neurons and synapses cooperate across multiple scales of space and time to mediate functions such as sensory perception, motor control, and memory. His lab employs and extends tools from disciplines including statistical mechanics, dynamical systems theory, information theory, control theory, and high-dimensional statistics, often collaborating with experimental neuroscience laboratories to analyze physiological data from various model organisms. His work encompasses topics such as how birds learn to sing, spatial memory in the rodent hippocampus, attention and motor control in macaques, properties of complex synapses, dynamics of plasticity in recurrent networks, signal propagation in neural circuits, the emergence of categorization in multi-layered networks, and the statistical mechanics of high-dimensional data analysis. Additionally, he employs techniques from statistical mechanics, like replica theory and random matrix theory, to analyze the complex dynamics of learning, signal propagation, and memory in neuronal networks, as well as to evaluate the performance of machine learning algorithms that could be implemented in neuronal architectures.
Research topics
- Artificial Intelligence
- Computer Science
- Machine Learning
- Biology
- Cognitive science
- Psychology
- Neuroscience
- Algorithm
- Mathematics
- Programming language
- Data science
Selected publications
From Kepler to Newton: Inductive Biases Guide Learned World Models in Transformers
Open MIND · 2026-02-06
preprint
Can general-purpose AI architectures go beyond prediction to discover the physical laws governing the universe? True intelligence relies on "world models" -- causal abstractions that allow an agent to not only predict future states but understand the underlying governing dynamics. While previous "AI Physicist" approaches have successfully recovered such laws, they typically rely on strong, domain-specific priors that effectively "bake in" the physics. Conversely, Vafa et al. recently showed that generic Transformers fail to acquire these world models, achieving high predictive accuracy without capturing the underlying physical laws. We bridge this gap by systematically introducing three minimal inductive biases. We show that ensuring spatial smoothness (by formulating prediction as continuous regression) and stability (by training with noisy contexts to mitigate error accumulation) enables generic Transformers to surpass prior failures and learn a coherent Keplerian world model, successfully fitting ellipses to planetary trajectories. However, true physical insight requires a third bias: temporal locality. By restricting the attention window to the immediate past -- imposing the simple assumption that future states depend only on the local state rather than a complex history -- we force the model to abandon curve-fitting and discover Newtonian force representations. Our results demonstrate that simple architectural choices determine whether an AI becomes a curve-fitter or a physicist, marking a critical step toward automated scientific discovery.
Deriving Neural Scaling Laws from the statistics of natural language
ArXiv.org · 2026-02-07
article · Open access
Despite the fact that experimental neural scaling laws have substantially guided empirical progress in large-scale machine learning, no existing theory can quantitatively predict the exponents of these important laws for any modern LLM trained on any natural language dataset. We provide the first such theory in the case of data-limited scaling laws. We isolate two key statistical properties of language that alone can predict neural scaling exponents: (i) the decay of pairwise token correlations with time separation between token pairs, and (ii) the decay of the next-token conditional entropy with the length of the conditioning context. We further derive a simple formula in terms of these statistics that predicts data-limited neural scaling exponents from first principles without any free parameters or synthetic data models. Our theory exhibits a remarkable match with experimentally measured neural scaling laws obtained from training GPT-2 and LLaMA style models from scratch on two qualitatively different benchmarks, TinyStories and WikiText.
Contrastive Concept-Tree Search for LLM-Assisted Algorithm Discovery
Open MIND · 2026-02-03
preprint · Senior author
Large language Model (LLM)-assisted algorithm discovery is an iterative, black-box optimization process over programs to approximatively solve a target task, where an LLM proposes candidate programs and an external evaluator provides task feedback. Despite intense recent research on the topic and promising results, how can the LLM internal representation of the space of possible programs be maximally exploited to improve performance is an open question. Here, we introduce Contrastive Concept-Tree Search (CCTS), which extracts a hierarchical concept representation from the generated programs and learns a contrastive concept model that guides parent selection. By reweighting parents using a likelihood-ratio score between high- and low-performing solutions, CCTS biases search toward useful concept combinations and away from misleading ones, providing guidance through an explicit concept hierarchy rather than the algorithm lineage constructed by the LLM. We show that CCTS improves search efficiency over fitness-based baselines and produces interpretable, task-specific concept trees across a benchmark of open Erdős-type combinatorics problems. Our analysis indicates that the gains are driven largely by learning which concepts to avoid. We further validate these findings in a controlled synthetic algorithm-discovery environment, which reproduces qualitatively the search dynamics observed with the LLMs.
Reshaping Global Loop Structure to Accelerate Local Optimization by Smoothing Rugged Landscapes
Open MIND · 2026-02-01
preprint · Senior author
Probabilistic graphical models with frustration exhibit rugged energy landscapes that trap iterative optimization dynamics. These landscapes are shaped not only by local interactions, but crucially also by the global loop structure of the graph. The famous Bethe approximation treats the graph as a tree, effectively ignoring global structure, thereby limiting its effectiveness for optimization. Loop expansions capture such global structure in principle, but are often impractical due to combinatorial explosion. The $M$-layer construction provides an alternative: make $M$ copies of the graph and reconnect edges between them uniformly at random. This provides a controlled sequence of approximations from the original graph at $M=1$, to the Bethe approximation as $M \rightarrow \infty$. Here we generalize this construction by replacing uniform random rewiring with a structured mixing kernel $Q$ that sets the probability that any two layers are interconnected. As a result, the global loop structure can be shaped without modifying local interactions. We show that, after this copy-and-reconnect transformation, there exists a regime in which layer-to-layer fluctuations decay, increasing the probability of reaching the global minimum of the energy function of the original graph. This yields a highly general and practical tool for optimization. Using this approach, the computational cost required to reach these optimal solutions is reduced across sparse and dense Ising benchmarks, including spin glasses and planted instances. When combined with replica-exchange Monte Carlo, the same construction increases the polynomial-time algorithmic threshold for the maximum independent set problem. A cavity analysis shows that structured inter-layer coupling significantly smooths rugged energy landscapes by collapsing configurational complexity and suppressing many suboptimal metastable states.
Solving adversarial examples requires solving exponential misalignment
arXiv (Cornell University) · 2026-03-03
article · Open access · Senior author
Adversarial attacks - input perturbations imperceptible to humans that fool neural networks - remain both a persistent failure mode in machine learning, and a phenomenon with mysterious origins. To shed light, we define and analyze a network's perceptual manifold (PM) for a class concept as the space of all inputs confidently assigned to that class by the network. We find, strikingly, that the dimensionalities of neural network PMs are orders of magnitude higher than those of natural human concepts. Since volume typically grows exponentially with dimension, this suggests exponential misalignment between machines and humans, with exponentially many inputs confidently assigned to concepts by machines but not humans. Furthermore, this provides a natural geometric hypothesis for the origin of adversarial examples: because a network's PM fills such a large region of input space, any input will be very close to any class concept's PM. Our hypothesis thus suggests that adversarial robustness cannot be attained without dimensional alignment of machine and human PMs, and therefore makes strong predictions: both robust accuracy and distance to any PM should be negatively correlated with the PM dimension. We confirmed these predictions across 18 different networks of varying robust accuracy. Crucially, we find even the most robust networks are still exponentially misaligned, and only the few PMs whose dimensionality approaches that of human concepts exhibit alignment to human perception. Our results connect the fields of alignment and adversarial examples, and suggest the curse of high dimensionality of machine PMs is a major impediment to adversarial robustness.
From Kepler to Newton: Inductive Biases Guide Learned World Models in Transformers
arXiv (Cornell University) · 2026-02-06
article · Open access
arXiv (Cornell University) · 2026-05-12
preprint · Open access
Understanding what individual neurons encode is a core question in neuroscience. In primary visual cortex (V1), mathematical models (e.g., Gabor functions) capture neural selectivity, but no comparable framework exists for higher areas. We show that natural language can fill this role: across macaque V1 and V4, the selectivity of most neurons is captured by concise, verifiable semantic descriptions. Using digital twins of V1 and V4, we develop a closed-loop framework that translates each neuron's high- and low-activating images into dense captions, generates a semantic hypothesis and synthesized images, and verifies the hypothesis in silico. Descriptions range from oriented edges and spatial frequency in V1 to conjunctions of form, color, and texture in V4. In V4, images generated from activating and suppressing hypotheses drove 96.1% of neurons above the 95th and 97.6% below the 5th percentile of natural-image responses, respectively (vs. ~10% for random images); V1 activation results matched V4, while V1 suppression was less describable in language. Representational similarity analysis reveals partial alignment between neural activity, vision embeddings, and language embeddings, with vision most aligned to neural activity; alignment lost in the text bottleneck is recovered when hypotheses are rendered back into images, showing that linguistic compression is lossy yet semantically faithful. Together, these results show that combining generative models with neural digital twins enables interpretable, testable descriptions of neural function at scale, toward agentic scientific discovery.
Gaussian Process Inference Reveals Non-separability of Position and Velocity Tuning in Grid Cells
bioRxiv (Cold Spring Harbor Laboratory) · 2026-02-04
article · Open access
Grid cells in medial entorhinal cortex (MEC) support spatial navigation by responding to multiple variables, including position, speed, and head direction. While tuning curves for each of these variables have been examined individually at the level of single-cells, less is known about the conjunctive coding of grid cells for these properties. To investigate this, we analyzed neural recordings of freely foraging rats and constructed four-dimensional (4D) tuning curves across 2D position and 2D velocity. In order to combat the sparse sampling of such a large behavioral space, we applied Gaussian Process (GP) methods to estimate firing rates at un-sampled points. Comparing GP model-derived tuning curves to those predicted by a fully separable model revealed that some cells exhibited significant non-separability of position and velocity tuning, and suggested a data coverage threshold necessary to observe this non-separability. In summary, our use of GPs allowed us to distinguish interactions in position-velocity tuning across a 4D behavioral space that have not been apparent in 2D analyses.
Deriving Neural Scaling Laws from the statistics of natural language
Open MIND · 2026-02-07
preprint
Recent grants
NSF · $500k · 2019–2025
Computational and Circuit Mechanisms for information transmission in the brain
NIH · $3.7M · 2015–2019
NIH · $198k · 2017–2020
NIH · $238k · 2017–2019
Frequent coauthors
- Krishna V. Shenoy · Howard Hughes Medical Institute · 49 shared
- Niru Maheswaranathan · Stanford University · 29 shared
- Aran Nayebi · Massachusetts Institute of Technology · 28 shared
- Mark J. Schnitzer · Stanford University · 26 shared
- Daniel Yamins · 26 shared
- Stephen I. Ryu · Korea Research Institute of Bioscience and Biotechnology · 26 shared
- Alex H. Williams · New York University · 22 shared
- Jascha Sohl‐Dickstein · 20 shared
Education
- 2001 · Ph.D., Physics · Stanford University
- 1996 · B.S., Physics · Harvard University