Increasing amounts of high-dimensional data are being collected and analyzed in a diverse range of research areas. In practice, data scientists face significant analytic challenges when exploring and understanding complex, high-dimensional data. Statistical inference for high-dimensional data is essential in theoretical and applied research in statistics, biostatistics, econometrics, geoscience, machine learning, signal processing, and many other fields. Big Data has rapidly reshaped statistical modeling and revolutionized statistical analysis. Many challenges and open problems remain, and their solutions require innovative ideas and techniques. This project will address new challenges arising in high-dimensional hypothesis testing.
Testing high-dimensional structural parameters plays a vital role in estimating and quantifying uncertainty, making informed choices, and discovering knowledge from Big Data. In this project, novel statistical methods and theory are developed to study three important topics in high-dimensional hypothesis testing: (1) power enhancement tests for high-dimensional covariance matrices, (2) power enhancement tests for high-dimensional mean vectors, and (3) nonlinear statistical dependence of high-dimensional data. The research outcomes will provide powerful analytic tools for solving open problems in these three research topics. The methods and theory are general, and they can be directly extended to other important hypothesis testing problems for high-dimensional data, such as testing in high-dimensional spiked models. Software packages will be developed to make the research outcomes readily available to other researchers and practitioners.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.