The research project will develop practical methods, efficient algorithms, and solid theory for selecting important features, estimating unknown parameters, and predicting responses with high-dimensional data, especially when the number of features is much larger than the number of samples. It will further develop recently proposed methodologies and algorithms for feature selection in linear regression, extend them to more general high-dimensional statistical models, and investigate their consistency and optimality properties in selection and estimation. The methodologies developed in the project will be directly relevant to many applications, and the project will specifically investigate applications in two important areas. The first is signal processing, including efficient sampling, representation, transmission, and recovery of data objects. The second is communications networks, including detection and estimation of significant patterns in volume and changes in data streams.

High-dimensional data is an area of intense current interest in statistical research and practice due to the rapid development of information technologies and their applications to modern scientific experiments. Important fields with an abundance of high-dimensional data include bioinformatics, signal processing, neural imaging, communications networks, and more. In many such scientific and engineering applications, the size of the problem is measured by the number of features: genetic components in bioinformatics, brain regions or voxels in neural imaging, or computers and routers in the Internet. A main challenge in high-dimensional data is that the size of the problem is often much larger than the size of the data to be used. The project is motivated by, and will be directly applicable to, signal processing and the monitoring of communications networks. Due to mathematical and statistical commonalities of problems involving high-dimensional data, the project will also be directly applicable to bioinformatics, neural imaging, and many more disciplines where modern information technologies prosper. Furthermore, the project will have significant educational impact.

Agency
National Science Foundation (NSF)
Institute
Division of Mathematical Sciences (DMS)
Type
Standard Grant (Standard)
Application #
0906420
Program Officer
Gabor J. Szekely
Project Start
Project End
Budget Start
2009-09-01
Budget End
2013-08-31
Support Year
Fiscal Year
2009
Total Cost
$221,627
Indirect Cost
Name
Rutgers University
Department
Type
DUNS #
City
New Brunswick
State
NJ
Country
United States
Zip Code
08901