Statistical Methods for Ultrahigh-dimensional Biomedical Data

Fan, Jianqing

Abstract

This proposal develops novel statistical methods for selecting a small group of molecules from high-throughput data, such as microarray, proteomic, and next-generation sequencing data, arising in biomedical research, especially studies of autism and brain tumors. It focuses on developing efficient methods and valid statistical tools for controlling the false discovery rate and testing treatment effects on a group of molecules; for feature selection and model building in the presence of errors-in-variables, endogeneity, and heavy-tailed error distributions; and for predicting clinical outcomes and understanding molecular mechanisms. It develops semiparametric and nonparametric models to reduce modeling biases and to augment features. It furthers the development of methods for estimating large covariance matrices, in support of understanding genetic networks, statistical model building, and inference. It introduces multivariate independence screening and conditional independence screening techniques to reduce false negatives and false positives in variable screening, and develops computable and optimal penalized likelihood methods for an array of statistical models. The strengths and weaknesses of each proposed method will be critically analyzed via theoretical investigations and simulation studies. Related software will be developed. Data sets from ongoing autism research, brain tumor studies, and other biomedical studies will be analyzed using the newly developed methods, and the results will be further confirmed and investigated biologically. The research findings will have a strong impact on the statistical analysis of high-throughput data for biomedical research and on understanding the molecular mechanisms of autism, brain tumors, and other diseases.
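The abstract names false discovery rate control as a goal but does not say which procedure is used. As a point of reference only, the sketch below implements the standard Benjamini-Hochberg step-up procedure, which is an assumption here and not necessarily the method developed under this grant. The function name `benjamini_hochberg` and the p-values are hypothetical and purely illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up procedure.

    Assumption: this is the textbook Benjamini-Hochberg rule, used as a
    baseline illustration; it is not taken from the proposal itself.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Order hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])


# Hypothetical p-values for six molecules (illustrative, not real data).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
rejected = benjamini_hochberg(p_vals, alpha=0.05)
print(rejected)
```

With these made-up inputs, only the two smallest p-values fall under their rank-scaled thresholds, so only the first two molecules are flagged. The procedure is "step-up" because it rejects every hypothesis up to the largest qualifying rank, even if some intermediate p-values exceed their own thresholds.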

Public Health Relevance

This proposal develops novel statistical and bioinformatic tools for finding genes, proteins, and SNPs that are associated with clinical outcomes. Data sets from ongoing autism research, brain tumor studies, and other biomedical studies will be critically analyzed using the newly developed statistical and bioinformatic methods, and the results will be further confirmed and investigated biologically. The research findings will have a strong impact on understanding the molecular mechanisms of autism, brain tumors, and other diseases, and on developing therapeutic targets.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM072611-11
Application #: 8998956
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Marcus, Stephen

Project Start: 2004-12-01
Project End: 2018-01-31
Budget Start: 2016-02-01
Budget End: 2017-01-31
Support Year: 11
Fiscal Year: 2016

Institution

Name: Princeton University
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #: 002484665

City: Princeton
State: NJ
Country: United States

Publications

Fan, Jianqing; Liu, Han; Wang, Weichen (2018) Large covariance estimation through elliptical factor models. Ann Stat 46:1383-1414

Chen, Zhao; Fan, Jianqing; Li, Runze (2018) Error Variance Estimation in Ultrahigh-Dimensional Additive Models. J Am Stat Assoc 113:315-327

Li, Quefeng; Cheng, Guang; Fan, Jianqing et al. (2018) Embracing the Blessing of Dimensionality in Factor Models. J Am Stat Assoc 113:380-389

Fan, Jianqing; Shao, Qi-Man; Zhou, Wen-Xin (2018) Are discoveries spurious? Distributions of maximum spurious correlations and their applications. Ann Stat 46:989-1017

Battey, Heather; Fan, Jianqing; Liu, Han et al. (2018) Distributed testing and estimation under sparse high dimensional models. Ann Stat 46:1352-1382

Zhou, Wen-Xin; Bose, Koushiki; Fan, Jianqing et al. (2018) A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing. Ann Stat 46:1904-1931

Fan, Jianqing; Liu, Han; Sun, Qiang et al. (2018) I-LAMM for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann Stat 46:814-841

Avella-Medina, Marco; Battey, Heather S; Fan, Jianqing et al. (2018) Robust estimation of high-dimensional covariance and precision matrices. Biometrika 105:271-284

Aït-Sahalia, Yacine; Fan, Jianqing; Laeven, Roger J A et al. (2017) Estimation of the Continuous and Discontinuous Leverage Effects. J Am Stat Assoc 112:1744-1758

Wang, Weichen; Fan, Jianqing (2017) Asymptotics of empirical eigenstructure for high dimensional spiked covariance. Ann Stat 45:1342-1374

Showing the most recent 10 out of 77 publications
