DNA microarrays are an increasingly important technology for a diverse spectrum of biological and biomedical research disciplines. Concomitant with the rise in microarrays' popularity has been a surge of proposed methods for the analysis of microarray data. However, assessing the effectiveness of these various methods has been difficult because the "correct" answers typically are not known; that is, due to the vast numbers of genes interrogated in a microarray experiment, only a relatively small fraction of gene expression differences tends to be validated in any given study. While some attempts have been made to address this problem and to compare different analysis methods, they have tended to use small numbers of known control genes, making adequate statistical analysis difficult, and have included large background RNA samples of unknown composition, preventing an accurate assessment of false-positive rates and nonspecific hybridization to the array. We recently reported a new control dataset for evaluating microarray analysis methods as applied to Affymetrix GeneChips. This dataset has three key features: it contains over 1300 RNAs that differ by known relative concentrations; it includes low fold changes, beginning at a 1.2-fold concentration difference, so that the sensitivity of the microarrays can be fully assessed; and it contains a defined background sample of over 2500 RNAs. Using this control dataset, we were able to assess a number of popular analysis methods for Affymetrix arrays and to devise a new algorithm that reduced false-negative rates. Here, we propose the construction of a similar but improved control dataset to be applied to spotted glass slide microarrays, which differ in their analysis requirements from Affymetrix arrays in several areas, and also an improved control dataset to extend the Affymetrix studies.
DNA microarrays are not only increasingly being used as a fundamental tool in biomedical research, particularly in the areas of cancer and neurodegenerative diseases, but they also have important potential for use in clinical testing for heterogeneous diseases such as cancer. In order for these research and diagnostic potentials to be realized, it is vital that we know how to analyze microarray data accurately. This proposal addresses this important issue.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Small Research Grants (R03)
Project #: 5R03LM008941-02
Application #: 7363677
Study Section: Special Emphasis Panel (ZLM1-ZH-S (O1))
Program Officer: Ye, Jane
Project Start: 2007-03-01
Project End: 2010-08-28
Budget Start: 2008-03-01
Budget End: 2010-08-28
Support Year: 2
Fiscal Year: 2008
Total Cost: $79,250
Indirect Cost:

Name: State University of New York at Buffalo
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 038633251
City: Buffalo
State: NY
Country: United States
Zip Code: 14260

Zhu, Qianqian; Miecznikowski, Jeffrey C; Halfon, Marc S (2011) A wholly defined Agilent microarray spike-in dataset. Bioinformatics 27:1284-9
Zhu, Qianqian; Miecznikowski, Jeffrey C; Halfon, Marc S (2010) Preferred analysis methods for Affymetrix GeneChips. II. An expanded, balanced, wholly-defined spike-in dataset. BMC Bioinformatics 11:285