This proposal develops novel statistical methods for selecting a small group of molecules from high-throughput data, such as microarray, proteomic, and next-generation sequencing data, arising in biomedical research, especially studies of autism and brain tumors. It focuses on developing efficient methods and valid statistical tools for controlling the false discovery rate and testing treatment effects on a group of molecules; for feature selection and model building in the presence of errors-in-variables, endogeneity, and heavy-tailed error distributions; and for predicting clinical outcomes and understanding molecular mechanisms. It develops semiparametric and nonparametric models to reduce modeling biases and to augment features. It furthers the development of methods for estimating large covariance matrices, which support the understanding of genetic networks, statistical model building, and inference. It introduces multivariate independence screening and conditional independence screening techniques to reduce false negatives and false positives in variable screening, and develops computable and optimal penalized likelihood methods for an array of statistical models. The strengths and weaknesses of each proposed method will be critically analyzed through theoretical investigations and simulation studies. Related software will be developed. Data sets from ongoing autism research, brain tumor studies, and other biomedical studies will be analyzed using the newly developed methods, and the results will be further confirmed and investigated biologically. The research findings will have a strong impact on the statistical analysis of high-throughput data for biomedical research and on the understanding of the molecular mechanisms of autism, brain tumors, and other diseases.
This proposal develops novel statistical and bioinformatic tools for finding genes, proteins, and SNPs that are associated with clinical outcomes. Data sets from ongoing autism research, brain tumor studies, and other biomedical studies will be critically analyzed using the newly developed statistical and bioinformatic methods, and the results will be further confirmed and investigated biologically. The research findings will have a strong impact on understanding the molecular mechanisms of autism, brain tumors, and other diseases, and on developing therapeutic targets.
Fan, Jianqing; Feng, Yang; Jiang, Jiancheng et al. (2016) Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification. J Am Stat Assoc 111:275-287
Fan, Jianqing; Han, Fang; Liu, Han et al. (2016) Robust Inference of Risks of Large Portfolios. J Econom 194:298-308
Rolfe, Alyssa J; Bosco, Dale B; Wang, Jingying et al. (2016) Bioinformatic analysis reveals the expression of unique transcriptomic signatures in Zika virus infected human neural stem cells. Cell Biosci 6:42
Jones, Zachary B; Ren, Yi (2016) Sphingolipids in spinal cord injury. Int J Physiol Pathophysiol Pharmacol 8:52-69
Dobriban, Edgar; Fan, Jianqing (2016) Regularity Properties for Sparse Regression. Commun Math Stat 4:1-19
Guo, Lei; Rolfe, Alyssa J; Wang, Xi et al. (2016) Rescuing macrophage normal function in spinal cord injury with embryonic stem cell conditioned media. Mol Brain 9:48
Fan, Jianqing; Liao, Yuan; Wang, Weichen (2016) Projected Principal Component Analysis in Factor Models. Ann Stat 44:219-254
Fan, Jianqing; Liao, Yuan; Shi, Xiaofeng (2015) Risks of Large Portfolios. J Econom 186:367-387
Fan, Jianqing; Rigollet, Philippe; Wang, Weichen (2015) Estimation of Functionals of Sparse Covariance Matrices. Ann Stat 43:2706-2737
Fan, Jianqing; Ke, Zheng Tracy; Liu, Han et al. (2015) QUADRO: A Supervised Dimension Reduction Method via Rayleigh Quotient Optimization. Ann Stat 43:1498-1534
Showing the most recent 10 out of 54 publications