The broad objective of this research is to develop regularization methods for high-dimensional data that commonly arise in biomedical studies, particularly studies in genomics, epigenetics, and brain imaging.
The specific aims in this proposal are motivated by problems arising in studies of neurodegenerative disorders in an aging population, including epigenetic and PiB PET studies of patients with Alzheimer's disease, PET studies of dementia in patients with Parkinson's disease, and the genome-wide association study of age-related macular degeneration. They include:
1. Developing effective variable selection methods in high-dimensional regression models with grouped structures. We will consider regularization with a novel convex penalty in multiple linear regression with high-dimensional grouped covariates, generalized linear models with high-dimensional grouped covariates, and multivariate linear regression with both high-dimensional grouped response variables and high-dimensional grouped covariates. Fast algorithms will be developed, theoretical properties such as oracle inequalities will be examined, and finite sample performance will be evaluated through extensive simulations.
2. Developing and evaluating regularization methods for disease diagnosis with functional data. We consider functional linear regression models and functional principal component analysis for imaging data to achieve sparse group effects. Both theoretical and numerical performance of the proposed methods will be examined.
3. Developing new multiple testing methodologies for dependent tests. We propose to use hidden Markov models, either non-homogeneous or group homogeneous, to characterize the dependence structure among the multiple tests, and to develop procedures with enhanced power and correctly controlled false discovery rate. We will examine the asymptotic optimality and numerical performance of the proposed methods.
4. Developing user-friendly computing programs systematically for the proposed statistical methodologies and disseminating them to health sciences researchers.
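As a rough illustration of how a convex penalty can select whole groups of covariates (Aim 1), the sketch below implements the proximal operator of the standard group lasso penalty. The proposal does not specify its novel penalty, so the group lasso is used here only as a familiar stand-in; the function name `group_soft_threshold`, the square-root group weights, and the toy numbers are all assumptions made for this example.

```python
import math


def group_soft_threshold(beta, groups, threshold):
    """Proximal operator of threshold * sum_g sqrt(p_g) * ||beta_g||_2.

    beta      : list of coefficients
    groups    : list of index lists, one per (non-overlapping) group
    threshold : non-negative penalty level (step size times lambda)

    Assumption: standard group lasso with sqrt(group size) weights,
    not the penalty proposed in the application.
    """
    result = [0.0] * len(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        cutoff = threshold * math.sqrt(len(idx))
        # A group whose Euclidean norm falls below the cutoff is set to
        # zero as a whole; otherwise every member is shrunk by one factor.
        if norm > cutoff:
            scale = 1.0 - cutoff / norm
            for j in idx:
                result[j] = scale * beta[j]
    return result


# Toy example: one strong group and one weak group (hypothetical values).
beta = [3.0, 4.0, 0.1, -0.1]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, threshold=1.0)
```

In this toy example the weak second group is removed entirely while the first group is kept and shrunk toward zero. Applying this operator after each gradient step on the loss gives a basic proximal gradient algorithm for grouped variable selection.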
The application is intended to solve emerging statistical issues in high-dimensional data analysis, which arise commonly in biomedical and public health studies. The proposed methods will be particularly useful in epigenetic, genomic, and PET imaging studies of neurodegenerative disorders, including Alzheimer's disease, Parkinson's disease, and age-related macular degeneration.
|Hu, Tianle; Lin, Xihong; Nan, Bin (2014) Cross-ratio estimation for bivariate failure times with left truncation. Lifetime Data Anal 20:23-37|
|Kong, Shengchun; Nan, Bin (2014) Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso. Stat Sin 24:25-42|
|Foster, Jared C; Taylor, Jeremy M G; Nan, Bin (2013) Variable selection in monotone single-index models via the adaptive LASSO. Stat Med 32:3944-54|