
Fang Han
Professor · Verified · University of Washington · Economics
Active 2011–2025
Research topics
- Computer Science
- Artificial Intelligence
- Biology
- Applied mathematics
- Neuroscience
- Psychology
- Statistics
- Mathematics
- Developmental psychology
- Combinatorics
- Computational biology
- Materials science
- Econometrics
- Genetics
- Cognitive science
- Nanotechnology
- Chemistry
- Mathematical analysis
Selected publications
Bias correction for Chatterjee's graph-based correlation coefficient
ArXiv.org · 2025-08-12
preprint · Open access · Senior author
Azadkia and Chatterjee (2021) recently introduced a simple nearest neighbor (NN) graph-based correlation coefficient that consistently detects both independence and functional dependence. Specifically, it approximates a measure of dependence that equals 0 if and only if the variables are independent, and 1 if and only if they are functionally dependent. However, this NN estimator includes a bias term that may vanish at a rate slower than root-$n$, preventing root-$n$ consistency in general. In this article, we (i) analyze this bias term closely and show that it could become asymptotically negligible when the dimension is smaller than four; and (ii) propose a bias-correction procedure for more general settings. In both regimes, we obtain estimators (either the original or the bias-corrected version) that are root-$n$ consistent and asymptotically normal.
Biometrics · 2025-06-30
article · 1st author · Corresponding
A sliced Wasserstein and diffusion approach to random coefficient models
ArXiv.org · 2025-02-07
preprint · Open access · Senior author
We propose a new minimum-distance estimator for linear random coefficient models. This estimator integrates the recently advanced sliced Wasserstein distance with the nearest neighbor methods, both of which enhance computational efficiency. We demonstrate that the proposed method is consistent in approximating the true distribution. Moreover, our formulation naturally leads to a diffusion process-based algorithm and is closely connected to treatment effect distribution estimation, both of which are of independent interest and hold promise for broader applications.
On regression-adjusted imputation estimators of average treatment effects
Journal of Econometrics · 2025-08-26
article · Senior author · Corresponding
Journal of Assisted Reproduction and Genetics · 2025-02-12
article · Open access
AIM: Assisted reproductive technology (ART) is an invaluable strategy for preventing the inheritance of genetic disorders and promoting the birth of healthy children. Nevertheless, the general public's limited understanding of genetics and low awareness of available services obstruct effective utilization of genetic counseling. Our analysis of a family affected by mitochondrial genetic disease aims to improve public understanding of genetic knowledge and the importance of genetic counseling. METHODS: We gathered comprehensive data on a family with mitochondrial disease and scrutinized the genetic sequencing and diagnostic procedures used to identify mitochondrial disease within the family. RESULTS: In a case involving a family with two daughters, both began to exhibit symptoms such as abnormal gait, myodystonia, and excessive fatigue at the age of 4. These symptoms were incorrectly assumed to be paternally inherited, as the mother believed the father had a mild intellectual disability. As a result, the family opted for ART, specifically in vitro fertilization (IVF) with donor sperm, without thorough genetic counseling or a conclusive diagnosis for the children. Despite these precautions, the son born from IVF presented with symptoms mirroring his sisters' at the age of 6, including typical MRI abnormal signals in the bilateral basal ganglia. Furthermore, the eldest daughter's naturally conceived child also started to show identical symptoms by the age of 3. Subsequent genetic testing revealed a homoplasmic pathogenic mutation in the MT-ND6 gene (m.14459G>A), confirming that the dystonia was maternally inherited, with the mother exhibiting an 89.2% heteroplasmic variation in the same gene. CONCLUSIONS: This case study demonstrates the significant consequences of a lack of genetic knowledge and prevailing misconceptions when applying ART. It underscores the urgent need to bolster genetic literacy and emphasizes the vital importance of informed decision-making within genetic healthcare services.
On a rank-based Azadkia-Chatterjee correlation coefficient
arXiv (Cornell University) · 2024-12-03
preprint · Open access · Senior author
Azadkia and Chatterjee (2021) recently introduced a graph-based correlation coefficient that has garnered significant attention. The method relies on a nearest neighbor graph (NNG) constructed from the data. While appealing in many respects, NNGs typically lack the desirable property of scale invariance; that is, changing the scales of certain covariates can alter the structure of the graph. This paper addresses this limitation by employing a rank-based NNG proposed by Rosenbaum (2005) and gives necessary theoretical guarantees for the corresponding rank-based Azadkia-Chatterjee correlation coefficient.
On Rosenbaum’s rank-based matching estimator
Biometrika · 2024-11-12 · 1 citation
article · Open access
In two influential contributions, Rosenbaum (2005, 2020a) advocated for using the distances between componentwise ranks, instead of the original data values, to measure covariate similarity when constructing matching estimators of average treatment effects. While the intuitive benefits of using covariate ranks for matching estimation are apparent, there is no theoretical understanding of such procedures in the literature. We fill this gap by demonstrating that Rosenbaum’s rank-based matching estimator, when coupled with a regression adjustment, enjoys the properties of double robustness and semiparametric efficiency without the need to enforce restrictive covariate moment assumptions. Our theoretical findings further emphasize the statistical virtues of employing ranks for estimation and inference, more broadly aligning with the insights put forth by Peter Bickel in his 2004 Rietz lecture.
Smoothed NPMLEs in nonparametric Poisson mixtures and beyond
arXiv (Cornell University) · 2024-06-13
preprint · Open access · Senior author
We discuss nonparametric mixing distribution estimation under the Gaussian-smoothed optimal transport (GOT) distance. It is shown that a recently formulated conjecture (that the Poisson nonparametric maximum likelihood estimator can achieve root-$n$ rate of convergence under the GOT distance) holds up to some logarithmic terms. We also establish the same conclusion for other minimum-distance estimators, and discuss mixture models beyond the Poisson.
SSRN Electronic Journal · 2024-01-01
preprint · Open access · 1st author · Corresponding
Bernoulli · 2024-10-30 · 4 citations
article · Senior author
Due to the lack of a canonical ordering in $\mathbb{R}^d$ for $d>1$, defining multivariate generalizations of the classical univariate ranks has been a long-standing open problem in statistics. Optimal transport has been shown to offer a solution in which multivariate ranks are obtained by transporting data points to a grid that approximates a uniform reference measure (Ann. Statist. 45 (2017) 223–256; Hallin (2017); Ann. Statist. 49 (2021) 1139–1165), thereby inducing ranks, signs, and a data-driven ordering of $\mathbb{R}^d$. We take up this new perspective to define and study multivariate analogues of the sign covariance/quadrant statistic, Spearman’s rho, Kendall’s tau, and van der Waerden covariances. The resulting tests of multivariate independence are fully distribution-free, hence uniformly valid irrespective of the actual (absolutely continuous) distribution of the observations. Our results provide the asymptotic distribution theory for these new test statistics, with asymptotic approximations to critical values to be used for testing independence between random vectors, as well as a power analysis of the resulting tests in an extension of the so-called (bivariate) Konijn model. This power analysis includes a multivariate Chernoff–Savage property guaranteeing that, under elliptical generalized Konijn models, the asymptotic relative efficiency of our van der Waerden tests with respect to Wilks’ classical (pseudo-)Gaussian procedure is strictly larger than or equal to one, where equality is achieved under Gaussian distributions only. We similarly provide a lower bound for the asymptotic relative efficiency of our Spearman procedure with respect to Wilks’ test, thus extending the classical result by Hodges and Lehmann on the asymptotic relative efficiency, in univariate location models, of Wilcoxon tests with respect to the Student ones.
Recent grants
Statistical Methods for Analyzing Complex Structured and Count Data
NSF · $200k · 2022–2026
Rank-based Inference for Complex and Noisy High-dimensional Data
NSF · $290k · 2020–2024
An Integrated Toolkit for High-Dimensional Complex and Time Series Data Analysis
NSF · $160k · 2017–2020
Frequent coauthors
- Han Liu · 53 shared
- Wei Sun · 49 shared
- John Lafferty · Yale University · 40 shared
- Ming Yuan · Peking University Shenzhen Hospital · 40 shared
- Larry Wasserman · Carnegie Mellon University · 39 shared
- Zhen Miao · Microsoft (United States) · 15 shared
- Mathias Drton · 15 shared
- Hongjian Shi · Technical University of Munich · 14 shared
Education
- 2008 · Ph.D., Economics · University of Washington
- 2003 · M.A., Economics · University of California, Los Angeles
- 2001 · B.A., Economics · University of California, Los Angeles