We propose to develop methods and software tools for the statistical analysis of a new generation of gene expression microarrays. These new arrays are probe-rich in the sense that they provide multiple probes to measure the relative abundance of every known or predicted exon in the genome. Because of the large number and uniform placement of probes along each transcript, we can now obtain a genome-wide survey of gene expression at the exon level. This will enable us to investigate biological questions, such as alternative splicing, that could not be addressed with previous generations of arrays. The results of this research will have substantial biological and medical significance, because a large percentage of human genes are alternatively spliced and many diseases are linked to aberrations in splicing. We will use the Affymetrix Exon 1.0 ST arrays and next-generation Affymetrix Exon arrays as the motivating examples for our research. We will develop methods for background noise correction and for modeling cross-hybridization effects. Combining these with a probe-selection strategy, we will design a gene-level expression index for the quantitative assessment of gene expression. We will also design methods for identifying new and/or alternatively spliced exons and for detecting differential splicing. Data and software generated in this research will be made freely available for public use.

Public Health Relevance

This project will lead to new methods and software for the analysis of probe-rich microarrays. This will allow researchers to obtain better quantitative gene expression measurements and to study new biological questions such as alternative splicing. It is expected that many basic and translational biomedical studies critical for the improvement of public health will benefit from the results of this project.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Good, Peter J
Stanford University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Hiller, David; Wong, Wing Hung (2013) Simultaneous isoform discovery and quantification from RNA-seq. Stat Biosci 5:100-118
Jia, Yichang; Mu, John C; Ackerman, Susan L (2012) Mutation of a U2 snRNA gene causes global disruption of alternative splicing and neurodegeneration. Cell 148:296-308
Mu, John C; Jiang, Hui; Kiani, Amirhossein et al. (2012) Fast and accurate read alignment for resequencing. Bioinformatics 28:2366-73
Seok, Junhee; Xu, Weihong; Gao, Hong et al. (2012) JETTA: junction and exon toolkits for transcriptome analysis. Bioinformatics 28:1274-5
Ma, Li; Wong, Wing Hung; Owen, Art B (2012) A sparse transmission disequilibrium test for haplotypes based on Bradley-Terry graphs. Hum Hered 73:52-61
Liu, Song; Lin, Lan; Jiang, Peng et al. (2011) A comparison of RNA-Seq and high-density exon array for detecting differential gene expression between closely related species. Nucleic Acids Res 39:578-88
Salzman, Julia; Jiang, Hui; Wong, Wing Hung (2011) Statistical Modeling of RNA-Seq Data. Stat Sci 26:
Ma, Wenxiu; Wong, Wing Hung (2011) The analysis of ChIP-Seq data. Methods Enzymol 497:51-73
Yang, Hong; Chen, Xi; Wong, Wing Hung (2011) Completely phased genome sequencing through chromosome sorting. Proc Natl Acad Sci U S A 108:12-7
Ji, Hongkai; Jiang, Hui; Ma, Wenxiu et al. (2011) Using CisGenome to analyze ChIP-chip and ChIP-seq data. Curr Protoc Bioinformatics Chapter 2:Unit2.13
