Identification of transcripts that are differentially regulated in response to the experimental conditions under study is one of the critical steps in the analysis of DNA microarray data. Currently employed statistical approaches are particularly ineffective for experiments with a small number of biological replicates, which are prevalent in differential expression studies. We propose to develop and validate a novel numerical framework for the identification of differentially expressed transcripts, with emphasis on experiments with a small number of replicates and on genes with moderate levels of expression. The proposed approach is based on a novel, non-parametric method for assessing noise distributions in microarray data, in which the distributions are derived directly from the analyzed data set. Three distinct methods, univariate and multivariate, for the identification of differentially expressed genes will be implemented, and their results will be compared with those of leading advanced statistical methods. In the Phase I feasibility study we will analyze differential gene expression between at least nine normal tissues with varying levels of similarity, in rat and mouse. Publicly available data from the SymAtlas database (Genomics Institute of the Novartis Research Foundation), obtained with Affymetrix microarrays, will be used. The utility of the newly developed numerical methods will be established through biological and/or experimental validation of the identified genomic biomarkers, using functional analysis (if functional annotation is available) and/or quantitative polymerase chain reaction analysis.
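To illustrate what deriving a noise distribution directly from the analyzed data set could look like, the following is a minimal Python sketch. It is not the proposed framework, whose details are not given here. The function name `empirical_noise_pvalues`, the use of replicate-to-replicate differences as the noise sample, the equal-count intensity bins, and the add-one smoothing are all assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right


def empirical_noise_pvalues(rep_a1, rep_a2, cond_b, n_bins=10):
    """Return one empirical two-sided p-value per gene.

    Hypothetical helper. All inputs are equal-length sequences of log2
    intensities: two replicates of condition A and one array of condition B.
    """
    # Noise sample: replicate-vs-replicate differences within condition A,
    # paired with the mean intensity at which each difference was observed.
    noise = sorted(((a1 + a2) / 2.0, abs(a1 - a2))
                   for a1, a2 in zip(rep_a1, rep_a2))
    n_bins = max(1, min(n_bins, len(noise)))
    size = len(noise) // n_bins

    # Equal-count intensity bins; the last bin absorbs any remainder.
    starts = [i * size for i in range(n_bins)]
    boundaries = [noise[s][0] for s in starts[1:]]
    bins = []
    for i, s in enumerate(starts):
        end = starts[i + 1] if i + 1 < n_bins else len(noise)
        bins.append(sorted(d for _, d in noise[s:end]))

    pvalues = []
    for a1, b in zip(rep_a1, cond_b):
        level = (a1 + b) / 2.0
        # Noise differences observed at a comparable intensity level.
        ref = bins[bisect_right(boundaries, level)]
        # Number of noise differences at least as large as the observed one.
        n_ge = len(ref) - bisect_left(ref, abs(b - a1))
        # Add-one smoothing keeps p-values strictly positive.
        pvalues.append((n_ge + 1.0) / (len(ref) + 1.0))
    return pvalues
```

In this sketch, each between-condition difference is compared only with noise differences measured at a similar intensity, so no parametric form (for example, normality) is assumed for the noise. With roughly 200 noise values per bin, the smallest attainable p-value is about 1/201, which shows why the resolution of any such empirical approach depends on how much within-condition data is available.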
DNA microarray technology enables simultaneous profiling of thousands of transcripts expressed in a particular organism, cell type, or tissue. Its current applications include gene profiling, gene regulation studies, disease biomarker discovery, toxicogenomics, pharmacogenomics, and clinical diagnostics and prognosis. Despite impressive recent technological advances, major bottlenecks still prevent the full potential of microarray technology from being realized; they include incomplete functional gene annotation and the lack of effective computational tools for data analysis. The analysis methods developed in this project will improve the ability to reliably identify differentially expressed genes in experiments with a small number of biological replicates, which will improve the overall effectiveness of this technology and reduce the cost of microarray gene expression studies.