With rapid advances in computing power and other modern technology, high-throughput data of unprecedented size and complexity are becoming commonplace in diverse fields. Examples include data from genetics, microarrays, proteomics, fMRI and cancer clinical trials, as well as high-frequency financial data. Such high-dimensional data characterize many important contemporary problems in statistics, and feature selection plays a pivotal role in these problems. This research project aims to develop cutting-edge statistical theory and methods for high-dimensional variable selection. In particular, the PI proposes the following interrelated research topics for investigation: (1) grouped-variable screening with sparse linear models; (2) nonparametric component screening with sparse additive models; (3) parametric component screening with sparse semiparametric models; and (4) further extensions of these. The proposed methods will be studied theoretically for their sure screening behavior and compared empirically with some of the existing methods in terms of computational expediency, statistical accuracy and algorithmic stability.
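As a rough illustration of what variable screening in a sparse linear model can look like, the sketch below ranks predictors by their absolute marginal correlation with the response and keeps the top few. This is a generic textbook-style screening rule chosen for illustration, not the method proposed in this project; the function name `marginal_screen`, the simulated data and the cutoff `keep=20` are all assumptions, and the columns of `X` are assumed to be non-constant.

```python
import numpy as np

def marginal_screen(X, y, keep):
    """Return indices of the `keep` columns of X with the largest
    absolute marginal correlation with y (illustrative helper only)."""
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    # Absolute sample correlation of each column with the response.
    # Assumes no column of X is constant (otherwise the norm is zero).
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = np.abs(Xc.T @ yc) / denom
    # Sort by correlation, strongest first, and keep the top `keep`.
    return np.argsort(corr)[::-1][:keep]

# Simulated high-dimensional setting (assumed for illustration):
# n = 100 observations, p = 1000 predictors, only 3 truly active.
rng = np.random.default_rng(0)
n, p = 100, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.5, 2.0]
y = X @ beta + rng.standard_normal(n)

selected = marginal_screen(X, y, keep=20)
print(sorted(selected.tolist()))
```

A screening rule has the sure screening property when the retained set contains all truly active predictors with probability tending to one; in this simulation one would check whether indices 0, 1 and 2 appear in `selected`.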

The outlined research project on variable selection in high dimensions addresses fundamental problems in statistical learning and will stimulate interest among a large group of scientists and researchers in diverse fields of science, engineering and the humanities, ranging from genomics and health sciences to economics and finance. Another key aspect of this project is the integration of research and education, which will be achieved by developing two new courses, one on statistical learning and one on nonparametric and semiparametric inference, and by proposing specific projects for students in these classes. The project will enable participation by people from various disciplines, including students from underrepresented groups.

Agency
National Science Foundation (NSF)
Institute
Division of Mathematical Sciences (DMS)
Type
Standard Grant (Standard)
Application #
1007698
Program Officer
Gabor Szekely
Project Start
Project End
Budget Start
2010-09-01
Budget End
2012-11-30
Support Year
Fiscal Year
2010
Total Cost
$100,000
Indirect Cost
Name
Colorado State University-Fort Collins
Department
Type
DUNS #
City
Fort Collins
State
CO
Country
United States
Zip Code
80523