
Yufeng Liu
Professor · University of North Carolina at Chapel Hill · Statistics
Active 1996–2026
About
Yufeng Liu is a Professor at the University of North Carolina at Chapel Hill, specializing in statistical machine learning, data mining, and bioinformatics. He received a B.S. from Nankai University in 1999, followed by an M.S. in 2001 and a Ph.D. in 2004, both from The Ohio State University. His research centers on statistical machine learning and data science, with particular interests in high-dimensional data analysis, nonparametric statistics and functional estimation, statistical genetics and genomics, and neuroimaging data analysis. His work also covers the design and analysis of experiments.
Research topics
- Medicine
- Computer Science
- Biology
- Engineering
- Artificial Intelligence
- Medical emergency
- Pathology
- Genetics
- Embedded system
- Evolutionary biology
- Materials science
- Systems engineering
- Virology
- Computational biology
- Computer network
- Cancer research
- Nursing
- Immunology
- Mathematics
- Internal medicine
- Emergency medicine
- Statistics
Selected publications
Consistency of Lloyd’s algorithm under perturbations
Electronic Journal of Statistics · 2026-01-01
article · Open access · Senior author
Materials Characterization · 2026-04-20
article
PeerJ · 2026-03-10
article · Open access · 1st author · Corresponding
Background: IL20RB, interleukin 20 receptor subunit beta, functions as a cytokine receptor subunit coding gene and has been discovered to serve an essential function in human malignancies. However, the link between IL20RB expression, clinical outcomes, and tumor-infiltrating lymphocytes in clear cell renal cell carcinoma (ccRCC) remains unclear. Methods: The Cancer Genome Atlas (TCGA) was utilized to compile data on the IL20RB expression in both normal and ccRCC tissues. The link between IL20RB expression and clinicopathologic characteristics was examined utilizing the TCGA database. Kaplan-Meier survival curves were employed for performing the survival analysis. Furthermore, a protein network involving IL20RB was established using data from the GeneMANIA database. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were undertaken, and the relationship between IL20RB and tumor immune infiltration was examined via single-sample GSEA (ssGSEA). Additional examination of the link between tumor-infiltrating immune cells (TIIC) and IL20RB was executed utilizing the Tumor Immune Estimation Resource (TIMER) and TISIDB databases. IL20RB expression in tumor specimens was detected through immunohistochemistry (IHC). IL20RB expression levels in tumor cells were confirmed via Western blot analysis. Cell counting kit-8 (CCK-8) and colony formation assays evaluated IL20RB's impact on ccRCC cell viability. Wound Healing and Transwell assays assessed IL20RB's influence on ccRCC cell migration. Results: < 0.01). Subsequently, a significant link was observed between IL20RB overexpression and immunomodulators, chemokines, and a heightened presence of infiltrating Treg, NK CD56 cells, Th1 cells, cytotoxic cells, and T helper cells in ccRCC. IHC showed that the IL20RB level in the adjacent normal tissues was notably diminished relative to that in ccRCC samples. IL20RB suppression through small interfering RNA (siRNA) markedly diminished ccRCC cell proliferation and migration. Conclusion: Heightened IL20RB expression is linked to a dismal prognosis and infiltration of immune cells in ccRCC, indicating its potential importance in the development of immunotherapeutic strategies.
Replication Data for: Subject-Specific Scalar-on-Image Regression
Harvard Dataverse · 2026-05-02
dataset · Open access
This is the data for "Subject-Specific Scalar-on-Image Regression", published in the Journal of Data Science.
On Probability Estimation for Unbalanced Classification
Journal of Statistical Theory and Practice · 2026-04-20
article · Open access · Senior author
Unbalanced classification poses a significant challenge in real-world applications such as medical diagnosis and fraud detection, where class label distributions are highly skewed. Standard classification methods often yield suboptimal results by sacrificing minority class accuracy to maximize overall performance. Common techniques to address unbalanced classification typically involve weighting adjustments of points for different classes or resampling strategies, including oversampling of minority classes and undersampling of majority classes. While these balancing techniques are effective in improving classification accuracy, they often introduce biases that compromise posterior class probability estimation and model calibration. This paper highlights the trade-offs associated with balancing techniques when applied without appropriate adjustments. We systematically investigate the distortions in probability estimation caused by these unbalanced classification techniques and propose a robust framework to correct these biases through probability adjustment. We further investigate high-dimensional unbalanced data, examining both the distortion induced by balancing techniques and the shrinkage effect of regularization on probability estimation. Finally, our method is evaluated through simulations and real data analysis, covering a range of balancing strategies including class weighting, random resampling, and generative models. Results demonstrate that our proposed framework significantly mitigates probability estimation bias while preserving classification power when applying traditional balancing techniques, ensuring reliable posterior estimates.
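To make the idea of probability adjustment concrete, here is a minimal sketch of the standard prior-shift correction for a binary classifier trained on rebalanced data. It is not the framework proposed in the paper, whose details are not given in this abstract. The function name `adjust_probability`, its parameters, and the 10% / 50% class proportions are assumptions chosen for illustration only.

```python
def adjust_probability(p_resampled, train_prior, true_prior):
    """Map a minority-class probability learned on rebalanced data
    back to the original class prior (standard prior-shift correction)."""
    if not 0.0 < train_prior < 1.0 or not 0.0 < true_prior < 1.0:
        raise ValueError("priors must lie strictly between 0 and 1")
    # Reweight each class by the ratio of its true prior to its training prior.
    pos = p_resampled * true_prior / train_prior
    neg = (1.0 - p_resampled) * (1.0 - true_prior) / (1.0 - train_prior)
    return pos / (pos + neg)


# Hypothetical setting: the minority class is 10% of the population,
# but the classifier was trained on a 50/50 resampled set.
scores = [0.5, 0.9]
adjusted = [adjust_probability(p, train_prior=0.5, true_prior=0.1) for p in scores]
```

Under these assumed proportions, a score of 0.5 from the balanced model maps back to the 10% base rate, which shows how far an uncorrected score can sit from a calibrated posterior.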
Effective Permutation Tests for Differences Across Multiple High-Dimensional Correlation Matrices
Journal of Computational and Graphical Statistics · 2025-08-29
article · Open access · Senior author
Hub Detection in Gaussian Graphical Models
Journal of the American Statistical Association · 2025-01-21
article · Open access · Senior author · Corresponding
Graphical models are popular tools for exploring relationships among a set of variables. The Gaussian graphical model (GGM) is an important class of graphical models, where the conditional dependence among variables is represented by nodes and edges in a graph. In many real applications, we are interested in detecting hubs in graphical models, which refer to nodes with a significant higher degree of connectivity compared to non-hub nodes. A typical strategy for hub detection consists of estimating the graphical model, and then using the estimated graph to identify hubs. Despite its simplicity, the success of this strategy relies on the accuracy of the estimated graph. In this paper, we directly target on the estimation of hubs, without the need of estimating the graph. We establish a novel connection between the presence of hubs in a graphical model, and the spectral decomposition of the underlying covariance matrix. Based on this connection, we propose the method of inverse principal components for hub detection (IPC-HD). Both consistency and convergence rates are established for IPC-HD. Our simulation study demonstrates the superior performance and fast computation of the proposed method compared to existing methods in the literature in terms of hub detection. Our application to a prostate cancer gene expression dataset detects several hub genes with close connections to tumor development.
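The "typical strategy" mentioned in this abstract, reading hubs off an already estimated graph, can be sketched as follows. This illustrates only the second step of that baseline and is not IPC-HD. The function name `hub_nodes`, the `min_degree` and `tol` parameters, and the small star-shaped precision matrix are all hypothetical.

```python
import math


def hub_nodes(precision, min_degree, tol=1e-8):
    """Count edges implied by a precision matrix and return (degrees, hubs)."""
    p = len(precision)
    degrees = [0] * p
    for i in range(p):
        for j in range(i + 1, p):
            # Partial correlation between variables i and j given all others.
            partial = -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
            if abs(partial) > tol:  # nonzero entry means an edge in the GGM
                degrees[i] += 1
                degrees[j] += 1
    hubs = [i for i, d in enumerate(degrees) if d >= min_degree]
    return degrees, hubs


# Made-up precision matrix for a star graph: node 0 is linked to every other node.
omega = [
    [2.0, 0.5, 0.5, 0.5],
    [0.5, 1.0, 0.0, 0.0],
    [0.5, 0.0, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0],
]
degrees, hubs = hub_nodes(omega, min_degree=2)
```

In practice the precision matrix would itself be estimated from data, and errors in that estimate carry straight into the degree counts, which is the weakness the abstract points to.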
World Journal of Gastrointestinal Surgery · 2025-02-24
article · Open access
This study explores the significance of using two-dimensional shear wave elastography (2D-SWE) to assess liver stiffness (LS) and spleen area (SPA) for predicting post-hepatectomy liver failure (PHLF). By providing a non-invasive method to measure LS, which correlates with the degree of liver fibrosis, and SPA, an indicator of portal hypertension, 2D-SWE offers a comprehensive evaluation of a patient's hepatic status. These advancements are particularly crucial in hepatic surgery, where accurate preoperative assessments are essential for optimizing surgical outcomes and minimizing complications. This letter highlights the practical implications of integrating 2D-SWE into clinical practice, emphasizing its potential to improve patient safety and surgical precision by enhancing the ability to predict PHLF and tailor surgical approaches accordingly.
In-situ investigation of oxygen-induced deformation mechanism transition in FGH96 superalloy
Materials Science and Engineering A · 2025-10-25 · 2 citations
article · Corresponding
Synthetic and Systems Biotechnology · 2025-09-13
article · Open access
Base editors (BEs) enable precise genome editing, but their use in microbes remains limited by restricted mutagenesis capabilities and narrow editing windows. Here, we reported MicroDFBEST, a novel dual-function base editor (DFBE) for microbes, by fusing the high-activity deaminases evoCDA1 and TadA9 with nuclease-deficient Cas12b from Bacillus hisashii (dBhCas12b). This engineered system enables simultaneous C-to-T and A-to-G editing within a 26–33 nt window, the broadest range reported for microbial DFBEs. The editing characteristics of MicroDFBEST can be easily adjusted by changing fusion protein expression and editing generations to create diverse mutant libraries. We show that the MicroDFBEST system enables both flexible gene expression modulation via random promoter (P_ylbP) diversification and targeted protein evolution through mutational hotspot scanning in native genomic contexts. This study offers a versatile platform enabling in situ gene regulation (e.g., biosynthetic gene clusters activation) and protein evolution (e.g., chassis optimization), with broad synthetic biology utility.
Recent grants
Graph-based Learning and Inference for Sparse Regularized Techniques
NSF · $120k · 2014–2018
Flexible statistical machine learning techniques for cancer-related data
NIH · $1.5M · 2010–2016
NSF · $300k · 2016–2020
Unlocking complex co-expression network using graphical models
NIH · $1.6M · 2017–2024
CAREER: Flexible Statistical Learning for Complex Data
NSF · $400k · 2008–2014
Frequent coauthors
- Yang Zhou · 522 shared
- Kai Zhang · East China Normal University · 150 shared
- Hui Shen · Ohio Northern University · 56 shared
- D. Neil Hayes · 39 shared
- Yichao Wu · 35 shared
- J. S. Marron · University of North Carolina at Chapel Hill · 33 shared
- Yingyong Hou · Fudan University · 32 shared
- Lianxin Liu · 31 shared