Understanding which genes influence a human quantitative trait, and how they do so, is a long-standing research topic in genetics and is of great importance for public health. In past years, researchers have relied on genome-wide association studies (GWAS) to address this question, with a focus on common genetic variants (minor allele frequency, MAF > 5%). Although more than 2000 disease-associated common variants have been identified so far, the genetic contribution to complex traits remains largely unexplained. Recent developments in next-generation sequencing technology allow researchers to assess low-frequency and rare variants comprehensively; these variants are expected to contribute significantly to disease risk and may help explain the as yet unexplained (or missing) heritability. Studies of expression quantitative trait loci (eQTLs) aim to discover the genetic variants that may explain variation in gene expression levels. eQTL studies are crucial for understanding how genetic variants function at the molecular level. Most genetic variants discovered in GWAS are non-coding, which means they likely exert their effects by regulating gene expression. Identifying eQTLs is therefore essential for interpreting the loci discovered in GWAS and sequencing studies. In this application we focus on developing quantile analysis tools for these two important genetic problems. The analysis tools currently employed in genetic research are mostly mean-based, yet genetic effects often vary across quantile levels. Such heterogeneity is an important aspect of genetic function, but it has been overlooked by existing analyses. Quantile-based analysis will allow us to explore how genetic variants are associated with the entire distribution of quantitative traits and gene expression levels, and it represents a new direction of research in genetics.
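To make the contrast between mean-based and quantile-based analysis concrete, the following minimal sketch simulates a variant that changes the spread of an expression trait without shifting its mean, then compares genotype groups at several quantile levels. Everything in it is an illustrative assumption rather than the project's method: the simulated data, the function names (`empirical_quantile`, `simulate_expression`, `genotype_contrast`), and the use of simple group-wise quantile differences in place of a full quantile regression.

```python
import random
import statistics


def empirical_quantile(values, tau):
    """Return the tau-th empirical quantile (0 <= tau <= 1) by linear interpolation."""
    ordered = sorted(values)
    pos = tau * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def simulate_expression(n_per_group=2000, seed=0):
    """Simulate expression values for genotypes 0, 1, 2 (hypothetical data).

    The minor allele scales the spread of expression but leaves the mean at 0,
    so a mean-based comparison should see little or no effect.
    """
    rng = random.Random(seed)
    data = {}
    for genotype in (0, 1, 2):
        scale = 1.0 + 0.5 * genotype
        data[genotype] = [scale * rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    return data


def genotype_contrast(data, taus=(0.1, 0.5, 0.9)):
    """Compare genotype 2 with genotype 0 in the mean and at each quantile level."""
    mean_diff = statistics.mean(data[2]) - statistics.mean(data[0])
    quantile_diffs = {
        tau: empirical_quantile(data[2], tau) - empirical_quantile(data[0], tau)
        for tau in taus
    }
    return mean_diff, quantile_diffs


data = simulate_expression()
mean_diff, quantile_diffs = genotype_contrast(data)
print(f"mean difference: {mean_diff:+.3f}")
for tau, diff in quantile_diffs.items():
    print(f"difference at quantile {tau}: {diff:+.3f}")
```

In this simulated setting the mean difference stays close to zero while the differences at the lower and upper quantiles have opposite signs, which is the kind of heterogeneity a mean-based test would miss. A real analysis would instead fit a quantile regression of the trait on genotype dosage with covariate adjustment.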

Public Health Relevance

The project will develop quantile analysis tools to identify expression quantitative trait loci (eQTLs) in single and multiple tissues, and to identify associations between low-frequency or rare variants and human complex traits using next-generation sequencing data. Once complete, the developed methods and their applications have great potential to deepen and expand existing knowledge in genetics, and to contribute significantly to the field of statistics as well.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Struewing, Jeffery P
Columbia University (N.Y.)
Biostatistics & Other Math Sci
Schools of Public Health
New York
United States
Li, Gen; Jima, Dereje; Wright, Fred A et al. (2018) HT-eQTL: integrative expression quantitative trait loci analysis in a large number of human tissues. BMC Bioinformatics 19:95
Song, Xiaoyu; Li, Gen; Zhou, Zhenwei et al. (2017) QRank: a novel quantile regression tool for eQTL discovery. Bioinformatics 33:2123-2130
Wei, Ying; Song, Xiaoyu; Liu, Mengling et al. (2016) Quantile Regression in the Secondary Analysis of Case-Control Data. J Am Stat Assoc 111:344-354