
Ricardo Masini
Professor of Mathematics · University of California, Davis · Biomedical Engineering
Active 1999–2026
About
Ricardo Masini is an Assistant Professor at the University of California, Davis, Department of Statistics. His research encompasses a broad range of topics in statistics and econometrics, including counterfactual analysis, high-dimensional data analysis, and the development of new estimation techniques. His work often involves the application of advanced statistical methods such as machine learning, copulas, and factor models to address complex problems in economic and statistical inference. Masini has contributed to the theoretical and methodological foundations of statistical analysis, with publications in leading journals such as the Journal of Econometrics, Annals of Statistics, and the Journal of the American Statistical Association. His research includes the development of bounds for U-statistics, the integration of random forest methods with linear models, and the exploration of distributional counterfactual analysis in high-dimensional setups. He has also worked on the refinement of asymptotic approximations, the use of artificial controls for high-dimensional panel data, and the creation of statistical software tools, including an R package for artificial counterfactual estimation. His work aims to balance flexibility and interpretability in statistical modeling, advancing the understanding and application of modern statistical techniques in econometrics and related fields.
Research topics
- Computer Science
- Artificial Intelligence
- Statistics
- Econometrics
- Mathematics
- Economics
- Psychology
- Social psychology
Selected publications
Sharp anti-concentration inequalities for extremum statistics via copulas
Bernoulli · 2026-04-29
preprint · Open access
We derive sharp upper and lower bounds for the pointwise concentration function of the maximum statistic of $d$ identically distributed real-valued random variables. Our first main result places no restrictions either on the common marginal law of the samples or on the copula describing their joint distribution. We show that, in general, strictly sublinear dependence of the concentration function on the dimension $d$ is not possible. We then introduce a new class of copulas, namely those with a convex diagonal section, and demonstrate that restricting to this class yields a sharper upper bound on the concentration function. This allows us to establish several new dimension-independent and poly-logarithmic-in-$d$ anti-concentration inequalities for a variety of marginal distributions under mild dependence assumptions. Our theory improves upon the best known results in certain special cases. Applications to high-dimensional statistical inference are presented, including a specific example pertaining to Gaussian mixture approximations for factor models, for which our main results lead to superior distributional guarantees.
Balancing Flexibility and Interpretability: A Conditional Linear Model Estimation via Random Forest
ArXiv.org · 2025-02-19
preprint · Open access · 1st author · Corresponding
Traditional parametric econometric models often rely on rigid functional forms, while nonparametric techniques, despite their flexibility, frequently lack interpretability. This paper proposes a parsimonious alternative by modeling the outcome $Y$ as a linear function of a vector of variables of interest $\boldsymbol{X}$, conditional on additional covariates $\boldsymbol{Z}$. Specifically, the conditional expectation is expressed as $\mathbb{E}[Y|\boldsymbol{X},\boldsymbol{Z}]=\boldsymbol{X}^{T}\boldsymbol{\beta}(\boldsymbol{Z})$, where $\boldsymbol{\beta}(\cdot)$ is an unknown Lipschitz-continuous function. We introduce an adaptation of the Random Forest (RF) algorithm to estimate this model, balancing the flexibility of machine learning methods with the interpretability of traditional linear models. This approach addresses a key challenge in applied econometrics by accommodating heterogeneity in the relationship between covariates and outcomes. Furthermore, the heterogeneous partial effects of $\boldsymbol{X}$ on $Y$ are represented by $\boldsymbol{\beta}(\cdot)$ and can be directly estimated using our proposed method. Our framework effectively unifies established parametric and nonparametric models, including varying-coefficient, switching regression, and additive models. We provide theoretical guarantees, such as pointwise and $L^p$-norm rates of convergence for the estimator, and establish a pointwise central limit theorem through subsampling, aiding inference on the function $\boldsymbol{\beta}(\cdot)$. We present Monte Carlo simulation results to assess the finite-sample performance of the method.
Yurinskii’s coupling for martingales
The Annals of Statistics · 2025-10-01 · 1 citation
article
Yurinskii’s coupling is a popular theoretical tool for nonasymptotic distributional analysis in mathematical statistics and applied probability, offering a Gaussian strong approximation with an explicit error bound under easily verifiable conditions. Originally stated in ℓ2-norm for sums of independent random vectors, it has recently been extended both to the ℓp-norm, for 1≤p≤∞, and to vector-valued martingales in ℓ2-norm, under some strong conditions. We present as our main result a Yurinskii coupling for approximate martingales in ℓp-norm, under substantially weaker conditions than those previously imposed. Our formulation further allows for the coupling variable to follow a more general Gaussian mixture distribution, and we provide a novel third-order coupling method, which gives tighter approximations in certain settings. We specialize our main result to mixingales, martingales, and independent data, and derive uniform Gaussian mixture strong approximations for martingale empirical processes. Applications to nonparametric partitioning-based and local polynomial regression procedures are provided, alongside central limit theorems for high-dimensional martingale vectors.
Constrained Polynomial Likelihood
Journal of Business and Economic Statistics · 2024-09-03 · 2 citations
article
We develop a nonnegative polynomial minimum-norm likelihood ratio (PLR) of two distributions of which only moments are known. The sample PLR converges to the unknown population PLR under mild conditions. The methodology allows for additional shape restrictions, as we illustrate with two empirical applications. The first develops a PLR for the unknown transition density of a jump-diffusion process, while the second extracts a positive density directly from option prices. In both cases, we show the importance of implementing the non-negativity restriction.
Distributional counterfactual analysis in high-dimensional setup
Journal of Econometrics · 2024-02-01 · 1 citation
article · 1st author · Corresponding
Journal of Econometrics · 2024-09-21 · 2 citations
article · Senior author
Bridging factor and sparse models
The Annals of Statistics · 2023-08-01 · 44 citations
article
Factor and sparse models are widely used to impose a low-dimensional structure in high-dimensions. However, they are seemingly mutually exclusive. We propose a lifting method that combines the merits of these two models in a supervised learning methodology that allows for efficiently exploring all the information in high-dimensional datasets. The method is based on a flexible model for high-dimensional panel data with observable and/or latent common factors and idiosyncratic components. The model is called the factor-augmented regression model. It includes principal components and sparse regression as specific models, significantly weakens the cross-sectional dependence, and facilitates model selection and interpretability. The method consists of several steps and a novel test for (partial) covariance structure in high dimensions to infer the remaining cross-section dependence at each step. We develop the theory for the model and demonstrate the validity of the multiplier bootstrap for testing a high-dimensional (partial) covariance structure. A simulation study and applications support the theory.
Yurinskii's Coupling for Martingales
arXiv (Cornell University) · 2022-10-01 · 3 citations
preprint · Open access
Yurinskii's coupling is a popular theoretical tool for non-asymptotic distributional analysis in mathematical statistics and applied probability, offering a Gaussian strong approximation with an explicit error bound under easily verifiable conditions. Originally stated in $\ell_2$-norm for sums of independent random vectors, it has recently been extended both to the $\ell_p$-norm, for $1 \leq p \leq \infty$, and to vector-valued martingales in $\ell_2$-norm, under some strong conditions. We present as our main result a Yurinskii coupling for approximate martingales in $\ell_p$-norm, under substantially weaker conditions than those previously imposed. Our formulation further allows for the coupling variable to follow a more general Gaussian mixture distribution, and we provide a novel third-order coupling method which gives tighter approximations in certain settings. We specialize our main result to mixingales, martingales, and independent data, and derive uniform Gaussian mixture strong approximations for martingale empirical processes. Applications to nonparametric partitioning-based and local polynomial regression procedures are provided, alongside central limit theorems for high-dimensional martingale vectors.
arXiv (Cornell University) · 2022-12-31
preprint · Open access · Senior author
The density weighted average derivative (DWAD) of a regression function is a canonical parameter of interest in economics. Classical first-order large sample distribution theory for kernel-based DWAD estimators relies on tuning parameter restrictions and model assumptions that imply an asymptotic linear representation of the point estimator. These conditions can be restrictive, and the resulting distributional approximation may not be representative of the actual sampling distribution of the statistic of interest. In particular, the approximation is not robust to bandwidth choice. Small bandwidth asymptotics offers an alternative, more general distributional approximation for kernel-based DWAD estimators that allows for, but does not require, asymptotic linearity. The resulting inference procedures based on small bandwidth asymptotics were found to exhibit superior finite sample performance in simulations, but no formal theory justifying that empirical success is available in the literature. Employing Edgeworth expansions, this paper shows that small bandwidth asymptotic approximations lead to inference procedures with higher-order distributional properties that are demonstrably superior to those of procedures based on asymptotic linear approximations.
Distributional Counterfactual Analysis in High-Dimensional Setup
arXiv (Cornell University) · 2022-02-23 · 1 citation
preprint · Open access · 1st author · Corresponding
In the context of treatment effect estimation, this paper proposes a new methodology to recover the counterfactual distribution when there is a single (or a few) treated unit and possibly a high-dimensional number of potential controls observed in a panel structure. The methodology accommodates, albeit does not require, the number of units to be larger than the number of time periods (high-dimensional setup). As opposed to modeling only the conditional mean, we propose to model the entire conditional quantile function (CQF) without intervention and estimate it using the pre-intervention period by a l1-penalized regression. We derive non-asymptotic bounds for the estimated CQF valid uniformly over the quantiles. The bounds are explicit in terms of the number of time periods, the number of control units, the weak dependence coefficient (beta-mixing), and the tail decay of the random variables. The results allow practitioners to re-construct the entire counterfactual distribution. Moreover, we bound the probability coverage of this estimated CQF, which can be used to construct valid confidence intervals for the (possibly random) treatment effect for every post-intervention period. We also propose a new hypothesis test for the sharp null of no-effect based on the Lp norm of deviation of the estimated CQF to the population one. Interestingly, the null distribution is quasi-pivotal in the sense that it only depends on the estimated CQF, Lp norm, and the number of post-intervention periods, but not on the size of the post-intervention period. For that reason, critical values can then be easily simulated. We illustrate the methodology by revisiting the empirical study in Acemoglu, Johnson, Kermani, Kwak and Mitton (2016).
Frequent coauthors
- Marcelo C. Medeiros · University of Illinois Urbana-Champaign · 42 shared
- Carlos Carvalho · Pontifical Catholic University of Rio de Janeiro · 13 shared
- Jianqing Fan · 12 shared
- Eduardo Mendes · Institut polytechnique de Grenoble · 8 shared
- Victor Orestes · 3 shared
- Matias D. Cattaneo · 3 shared
- Gabriel Vasconcelos · Brazilian Development Bank · 2 shared
- Yuri Fonseca · 2 shared