A research effort is proposed to create new tools for large-scale multiple comparisons. Work in this field has concentrated on idealized models such as the standard Gaussian model; what has not been addressed is the potential of many other models that offer more realism and impact. In this proposal, the investigator studies problems in three areas: (a) Formulation of massive data: develop models that have scientific realism and impact, as well as enough mathematical simplicity that careful study is possible. (b) Development of new tools: by exploring a wide variety of models, expose new phenomena and develop tools that are easy to implement and theoretically sound. (c) Delicate asymptotic study: lay out a framework for asymptotic study, carefully compare the existing and newly proposed inference tools, and study the optimality of such tools.
The motivation for this project is that massive datasets produced in scientific areas such as genomics, astronomy, and image processing have led to a new field in statistics: large-scale simultaneous hypothesis testing, or multiple comparisons. The vision is that advances in this new field will enable scientists from various fields to quickly extract the information they need from massive datasets, and it is of immediate interest to the statistics community to develop easy-to-implement tools. This project pushes the boundary of the field by developing new tools and novel theories, as well as exposing new phenomena. The project produces tools that are theoretically sound and practically feasible for solving problems in areas such as genomics, astronomy, and image processing.