Statistical analysis of multiple testing problems revolves around the distribution of the collection of p-values arising from simultaneous tests. Data from fMRI, proteomics, microarray and other biomedical experiments exhibit dependence among p-values. Statistical inference can yield biologically irrelevant conclusions if such dependence is not taken into consideration when estimating error control measures such as the false discovery rate. This proposal delineates a model-oriented approach to multiple hypothesis testing based on flexible and accurate modeling of the joint distribution of the p-values in dependent situations using mixtures. As an additional theoretical goal, the investigators study properties of skew-mixture models. By incorporating dependence in the model for the p-values, the proposed research provides valid control of false discoveries, especially in complex biomedical applications. The proposed methodologies provide a foundation for statistical analysis in large dependent multiple testing situations and will spawn new research in the area of false discovery control.
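As a rough illustration of what a mixture model for p-values looks like, the sketch below uses the textbook two-group model: a proportion pi0 of p-values comes from true nulls (uniform on [0, 1]) and the rest from alternatives. It is not the investigators' model. The Beta(a, 1) alternative, the parameter values, and the function names are all assumptions made for the example, and the calculation is marginal, so it ignores exactly the dependence among p-values that the proposal sets out to model.

```python
def mixture_cdf(t, pi0, a):
    """CDF of the assumed two-group mixture at threshold t.

    Null p-values are Uniform(0, 1), so their CDF is t.
    Alternative p-values are assumed Beta(a, 1) with 0 < a < 1,
    whose CDF is t ** a (mass concentrated near zero).
    """
    return pi0 * t + (1.0 - pi0) * t ** a


def fdr_at_threshold(t, pi0, a):
    """Marginal false discovery rate when rejecting all p-values <= t.

    Under the mixture, the chance that a rejected hypothesis is a true
    null is (null mass below t) / (total mass below t).
    """
    return pi0 * t / mixture_cdf(t, pi0, a)


if __name__ == "__main__":
    # Hypothetical setting: 90% true nulls, alternatives ~ Beta(0.25, 1).
    for t in (0.001, 0.01, 0.05):
        print(f"threshold={t:<6} FDR={fdr_at_threshold(t, 0.9, 0.25):.3f}")
```

In this toy setting the false discovery rate grows as the rejection threshold is relaxed, which is the quantity a model-based approach tries to estimate accurately once dependence is accounted for.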

Multiple hypothesis testing is one of the primary statistical tools available to scientists for efficiently analyzing large-scale complex biomedical data such as gene-expression data, proteomics data or brain imaging data. Disease association studies in such biomedical applications require testing the significance of association for several thousand genes, proteins or brain regions simultaneously. Identification of a gene or a protein as being potentially associated with a given disease is called a discovery. However, in large-scale biomedical studies there is a risk of accumulating error by making too many false discoveries. The proposed research substantially influences the practice of statistics in biomedical applications by providing accurate estimates of error rates in large-scale disease association studies. The investigators specifically develop error control mechanisms for brain imaging applications in MRI studies of autistic patients. The project impacts human resource development in the form of graduate student education and training.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0803531
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2010-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $81,091
Indirect Cost:
Name: University of Maryland Baltimore County
Department:
Type:
DUNS #:
City: Baltimore
State: MD
Country: United States
Zip Code: 21250