In a wide range of applications involving covariance, the quantities of interest are particular aspects, i.e., functionals, of the covariance structure. Despite recent methodological progress on covariance matrix estimation, there have been remarkably few fundamental theoretical and methodological studies on optimal estimation of functionals of high dimensional covariance matrices. The goals of this proposal are to develop a coherent theory that reveals the precision to which covariance matrix functionals can be estimated, to develop general methodologies for optimal estimation of these functionals, to establish the asymptotic equivalence between covariance matrix estimation and Gaussian sequence models, and to address applications that arise in finance, bioinformatics, genomics, and meteorology. The research presented in this proposal will significantly advance the theoretical understanding of estimating functionals of large scale covariance matrices. In particular, the asymptotic equivalence theory to be developed will help build a close and inspiring connection between matrix estimation and classical Gaussian sequence models, which have been well studied over the past thirty years, and will help carry over results and methodologies developed for Gaussian sequence models to matrix estimation. The proposed optimal estimation procedures for fundamentally important statistical methodologies, such as principal component analysis, graphical models, and linear and quadratic discriminant analysis, will provide more accurate estimation and prediction rules in a wide range of applications.
With the emergence of high dimensional data from modern technologies, estimating large scale covariance matrices and their functionals is becoming a crucial problem in many fields, including climate studies, genomics and proteomics, functional magnetic resonance imaging, portfolio allocation, and risk management. The traditional sample covariance matrix estimator is frequently used in practice to analyze high dimensional data, where it may perform poorly and lead to invalid conclusions. To overcome the difficulties associated with high dimensionality, regularized methods have been developed in recent years. The central role of the covariance matrix in statistical analysis and its wide range of important statistical applications ensure that the progress the investigator and his colleagues make towards their proposed objectives will have a great impact on the broad scientific community, including astronomy, bioinformatics, finance, genomics, meteorology, and clinical research. Research results from this proposal will be disseminated to researchers in other disciplines through research articles, workshops, and seminar series. The project will integrate research and education by teaching monograph courses and organizing workshops and seminars to help graduate students, postdocs, and young researchers, particularly minority, women, and domestic students, work on this topic.