
William Bialek
John Archibald Wheeler/Battelle Professor in Theoretical Physics · Princeton University · Physics
Active 1938–2025
About
William Bialek is a Professor of Physics and co-Director of the Center for the Physics of Biological Function (CPBF), an NSF Physics Frontier Center. He is also a faculty member of the Lewis-Sigler Institute at Princeton University and a Visiting Professor of Physics at The Graduate Center, CUNY. His research focuses on the physics of biological function: using concepts from physics to identify the principles underlying biological processes and to analyze complex biological phenomena.
Research topics
- Computer science
- Statistical physics
- Physics
- Artificial intelligence
- Mathematics
Selected publications
Princeton University Press eBooks · 2025-12-04
book-chapter · 1st author · Corresponding
Neural subspaces, minimax entropy, and mean-field theory for networks of neurons
ArXiv.org · 2025-08-04
preprint · Open access · Senior author
Recent advances in experimental techniques enable the simultaneous recording of activity from thousands of neurons in the brain, presenting both an opportunity and a challenge: to build meaningful, scalable models of large neural populations. Correlations in the brain are typically weak but widespread, suggesting that a mean-field approach might be effective in describing real neural populations, and we explore a hierarchy of maximum entropy models guided by this idea. We begin with models that match only the mean and variance of the total population activity, and extend to models that match the experimentally observed mean and variance of activity along multiple projections of the neural state. Confronted by data from several different brain regions, these models are driven toward a first-order phase transition, characterized by the presence of two nearly degenerate minima in the energy landscape, and this leads to predictions in qualitative disagreement with other features of the data. To resolve this problem we introduce a novel class of models that constrain the full probability distribution of activity along selected projections. We develop the mean-field theory for this class of models and apply it to recordings from 1000+ neurons in the mouse hippocampus. This 'distributional mean-field' model provides an accurate and consistent description of the data, offering a scalable and principled approach to modeling complex neural population dynamics.
Context dependent adaptation in a neural computation
ArXiv.org · 2025-09-01
preprint · Open access
Brains adapt to the statistical structure of their input. In the visual system, local light intensities change rapidly, the variance of the intensity changes more slowly, and the dynamic range of contrast itself changes more slowly still. We use a motion-sensitive neuron in the fly visual system to probe this hierarchy of adaptation phenomena, delivering naturalistic stimuli that have been simplified to have a clear separation of time scales. We show that the neural response to visual motion depends on contrast, and this dependence itself varies with context. Using the spike-triggered average velocity trajectory as a response measure, we find that context dependence is confined to a low-dimensional space, with a single dominant dimension. Across a wide range of conditions this adaptation serves to match the integration time to the mean interval between spikes, reducing redundancy.
Maximum entropy models for patterns of gene expression
Physical review. E · 2025-06-24 · 2 citations
article · Senior author
New experimental methods make it possible to measure the expression levels of many genes, simultaneously, in snapshots from thousands or even millions of individual cells. Current approaches to analyze these experiments involve clustering or low-dimensional projections, and often start with the assumption that distinct cell types exist. Here we use the principle of maximum entropy to obtain a probabilistic description that captures the observed presence or absence of mRNAs from hundreds of genes in cells from the mammalian brain. We construct the Ising model compatible with experimental means and pairwise correlations, and validate it by showing that it gives good predictions for higher-order statistics. We find that the probability distribution of cell states has many local maxima. Grouping cells according to these maxima (or energy minima) gives a classification in good agreement with currently assigned cell types. We show that when assignments disagree our model is dividing cell types into subtypes with clearly distinguishable expression patterns. These results make concrete the intuition that types or classes of cells are emergent behaviors.
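For readers unfamiliar with the construction named in this abstract, the generic form of a pairwise maximum entropy (Ising) model can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $\sigma_i \in \{0,1\}$ marks the absence or presence of mRNA from gene $i$, and the parameters $h_i$ and $J_{ij}$ are tuned so that the model reproduces the measured means $\langle \sigma_i \rangle$ and pairwise correlations $\langle \sigma_i \sigma_j \rangle$.

```latex
P(\boldsymbol{\sigma}) = \frac{1}{Z} \exp\left[ -E(\boldsymbol{\sigma}) \right],
\qquad
E(\boldsymbol{\sigma}) = -\sum_{i} h_i \sigma_i - \sum_{i<j} J_{ij} \sigma_i \sigma_j ,
\qquad
Z = \sum_{\boldsymbol{\sigma}} \exp\left[ -E(\boldsymbol{\sigma}) \right]
```

In this form, local maxima of $P(\boldsymbol{\sigma})$ are local minima of the energy $E(\boldsymbol{\sigma})$, which is the correspondence the abstract uses when grouping cells.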
Optimization and variability can coexist.
PubMed · 2025-05-29
preprint · Open access
that we should observe widely varying parameters, and we make this precise: the entropy in parameter space can be extensive even if performance on average is very close to optimal. This removes a major objection to optimization as a general principle, and rationalizes the observed variability.
When many noisy genes optimize information flow
ArXiv.org · 2025-12-16
preprint · Open access · Senior author
It often is emphasized that gene expression is noisy. A seemingly contradictory view is that control mechanisms have been optimized to squeeze as much information as possible out of a limited number of molecules. Here we revisit these issues in a simple model where a single transcription factor (TF) controls a large number of target genes. We include only the physically required noise sources: random arrival of TFs at their targets and counting noise in the synthesis and degradation of mRNA. If the cell has a limited total number of mRNA molecules, then the capacity to transmit information about TF concentration is maximized when these resources are distributed across the largest possible number of target genes. To realize this capacity the distribution of TF concentrations must be biased toward smaller values. Thus, in some limits, information transmission is optimized when individual expression levels are noisy. In addition, the dependence of information transmission on the parameters of this multi-gene system has a "sloppy" spectrum, so that optimal performance can co-exist with substantial variability.
Princeton University Press eBooks · 2025-10-28
book-chapter · 1st author · Corresponding
Exactly solvable statistical physics models for large neuronal populations
Physical Review Research · 2025-05-19 · 6 citations
preprint · Open access
Maximum-entropy methods provide a principled path connecting measurements of neural activity directly to statistical physics models, and this approach has been successful for populations of $N\sim 100$ neurons. As $N$ increases in new experiments, we enter an undersampled regime where we have to choose which observables should be constrained in the maximum-entropy construction. The best choice is the one that provides the greatest reduction in entropy, defining a “minimax entropy” principle. This principle becomes tractable if we restrict attention to correlations among pairs of neurons that link together into a tree; we can find the best tree efficiently, and the underlying statistical physics models are exactly solved. We use this approach to analyze experiments on $N\sim 1500$ neurons in the mouse hippocampus, and we find that the resulting model captures key features of collective activity in the network.
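As an illustration of how a "best tree" of pairwise correlations can be found efficiently, here is a minimal Python sketch. It relies on the standard Chow-Liu result that, for a tree-structured model matching pairwise marginals, the entropy reduction relative to an independent model is the sum of the mutual informations along the tree's edges, so the best tree is a maximum spanning tree over pairwise mutual information. The function names and the toy binary data are hypothetical and chosen for this example; this is not the authors' code.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (bits) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in joint.items()
    )


def best_tree(data):
    """Maximum spanning tree over pairwise mutual information (Kruskal).

    data[i] is the binarized activity of unit i across samples.
    Returns a list of (i, j, mutual_information) edges.
    """
    n_units = len(data)
    # Candidate edges, strongest mutual information first.
    edges = sorted(
        (
            (mutual_information(data[i], data[j]), i, j)
            for i, j in combinations(range(n_units), 2)
        ),
        reverse=True,
    )
    parent = list(range(n_units))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for mi, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j, mi))
    return tree


# Toy data (hypothetical): four units, eight samples each.
data = [
    [0, 0, 1, 1, 0, 1, 0, 1],  # unit 0
    [0, 0, 1, 1, 0, 1, 0, 1],  # unit 1, identical to unit 0
    [1, 1, 0, 0, 1, 0, 1, 1],  # unit 2, mostly anti-correlated with unit 0
    [0, 1, 0, 1, 0, 1, 0, 1],  # unit 3
]
tree = best_tree(data)
entropy_reduction = sum(mi for _, _, mi in tree)  # in bits
print(tree, entropy_reduction)
```

Computing all pairwise mutual informations scales as the square of the number of units, which is the part that stays manageable at the population sizes the abstract mentions; real recordings would also need care with finite-sample bias in the mutual information estimates, which this sketch ignores.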
Deriving a genetic regulatory network from an optimization principle
Proceedings of the National Academy of Sciences · 2025-01-03 · 18 citations
article · Open access · Corresponding
Many biological systems operate near the physical limits to their performance, suggesting that aspects of their behavior and underlying mechanisms could be derived from optimization principles. However, such principles have often been applied only in simplified models. Here, we explore a detailed mechanistic model of the gap gene network in the Drosophila embryo, optimizing its 50+ parameters to maximize the information that gene expression levels provide about nuclear positions. This optimization is conducted under realistic constraints, such as limits on the number of available molecules. Remarkably, the optimal networks we derive closely match the architecture and spatial gene expression profiles observed in the real organism. Our framework quantifies the tradeoffs involved in maximizing functional performance and allows for the exploration of alternative network configurations, addressing the question of which features are necessary and which are contingent. Our results suggest that multiple solutions to the optimization problem might exist across closely related organisms, offering insights into the evolution of gene regulatory networks.
Large language models and the entropy of English
arXiv (Cornell University) · 2025-12-31
preprint · Open access · Senior author
We use large language models (LLMs) to uncover long-ranged structure in English texts from a variety of sources. The conditional entropy or code length in many cases continues to decrease with context length at least to $N\sim 10^4$ characters, implying that there are direct dependencies or interactions across these distances. A corollary is that there are small but significant correlations between characters at these separations, as we show from the data independent of models. The distribution of code lengths reveals an emergent certainty about an increasing fraction of characters at large $N$. Over the course of model training, we observe different dynamics at long and short context lengths, suggesting that long-ranged structure is learned only gradually. Our results constrain efforts to build statistical physics models of LLMs or language itself.
Recent grants
Mechanisms of neural circuit dynamics in working memory
NIH · $3.1M · 2014–2018
NIH · $1.2M · 2012
A new paradigm for quantifying animal behavior in a model genetic system
NIH · $1.6M · 2011–2016
Coarse-graining approaches to networks, learning, and behavior
NIH · $707k · 2018–2022
Coarse-graining approaches to networks, learning, and behavior
NIH · $387k · 2018–2021
Frequent coauthors
- 72 shared
Thomas Gregor
Centre National de la Recherche Scientifique
- 43 shared
Eric Wieschaus
Princeton University
- 38 shared
Gašper Tkačik
Institute of Science and Technology Austria
- 30 shared
Christopher W. Lynn
The Graduate Center, CUNY
- 28 shared
Rob R. de Ruyter van Steveninck
Indiana University Bloomington
- 26 shared
David W. Tank
Princeton University
- 23 shared
Mariela D. Petkova
Harvard University
- 23 shared
Olivier Marre
Sorbonne Université
Labs
Center for the Physics of Biological Function · PI
Education
- 1986
Postdoctoral, Theoretical Physics
University of California, Santa Barbara
- 1984
Postdoctoral, Physics
Rijksuniversiteit Groningen
- 1983
PhD, Biophysics
University of California, Berkeley
- 1979
AB, Biophysics
University of California, Berkeley