Gene expression data produced from expression microarrays have not only greatly improved our understanding of cell biology, but also provided invaluable resources to guide the diagnosis and treatment of human diseases. However, the pace of incorporating gene expression signatures into medical practice has been relatively slow. This is mainly due to the limitations of gene expression microarrays and the natural variation of gene expression across tissues and developmental stages. This research project aims to overcome these limitations through the joint study of germline DNA polymorphisms and allele-specific expression (ASE) obtained from RNA-seq data. Because germline DNA polymorphisms are stable across tissues and developmental stages, including DNA information will help us establish more reliable biomarkers for patients' clinical care. More specifically, we will study the genetic basis of ASE in both normal and tumor tissues, dissect genetic and parent-of-origin effects on ASE in human cell lines, and identify genes that escape X inactivation in both mouse reciprocal crosses and human cell lines.

Public Health Relevance

We propose to develop statistical methods and software for RNA-seq data analysis, with the specific aims of dissecting the genetic basis of allele-specific expression (ASE), quantitatively assessing autosomal imprinting in humans, and obtaining genetically controlled measurements of escape from X-inactivation in mouse and human. The deliverables of this project will help biomedical researchers harvest the large body of knowledge accumulated in DNA variation and RNA-seq data and translate it into strategies for personalized disease prevention and treatment.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Krasnewich, Donna M
University of North Carolina Chapel Hill
Biostatistics & Other Math Sci
Schools of Public Health
Chapel Hill
United States
Zhou, Hua; Blangero, John; Dyer, Thomas D et al. (2016) Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data. Genet Epidemiol :
Zhou, Jin J; Hu, Tao; Qiao, Dandi et al. (2016) Boosting Gene Mapping Power and Efficiency with Efficient Exact Variance Component Tests of SNP Sets. Genetics :
Richardson, Sylvia; Tseng, George C; Sun, Wei (2016) Statistical Methods in Integrative Genomics. Annu Rev Stat Appl 3:181-209
Ha, Min Jin; Sun, Wei; Xie, Jichun (2016) PenPC: A two-step approach to estimate the skeletons of high-dimensional directed acyclic graphs. Biometrics 72:146-55
Hu, Yi-Juan; Liao, Peizhou; Johnston, H Richard et al. (2016) Testing Rare-Variant Association without Calling Genotypes Allows for Systematic Differences in Sequencing between Cases and Controls. PLoS Genet 12:e1006040
Wang, Xuefeng; Chen, Mengjie; Yu, Xiaoqing et al. (2016) Global copy number profiling of cancer genomes. Bioinformatics 32:926-8
Sun, Wei; Liu, Yufeng; Crowley, James J et al. (2015) IsoDOT Detects Differential RNA-isoform Expression/Usage with respect to a Categorical or Continuous Covariate with High Sensitivity and Specificity. J Am Stat Assoc 110:975-986
Hu, Yi-Juan; Sun, Wei; Tzeng, Jung-Ying et al. (2015) Proper Use of Allele-Specific Expression Improves Statistical Power for cis-eQTL Mapping with RNA-Seq Data. J Am Stat Assoc 110:962-974
Hu, Yi-Juan; Li, Yun; Auer, Paul L et al. (2015) Integrative analysis of sequencing and array genotype data for discovering disease associations with rare mutations. Proc Natl Acad Sci U S A 112:1019-24
Zhou, Hua; Lange, Kenneth (2015) Path Following in the Exact Penalty Method of Convex Programming. Comput Optim Appl 61:609-634