Understanding the genetic basis of human quantitative traits is of great importance for public health. The main strategy has been to identify genetic variants that influence the mean of a quantitative trait of interest. However, high-risk groups are often defined as the subjects with either high or low values of a quantitative trait. It is therefore more meaningful to investigate genetic associations with the upper or lower quantiles of complex traits. Moreover, recent studies indicate that genetic variants can influence the entire distribution of a complex trait, and that their impact can differ across quantiles. Hence, we propose to apply quantile regression methods to secondary complex traits in GWAS and sequencing studies. Because most GWAS and sequencing studies use case-control sampling schemes, their samples are not representative of the general population, and naively estimated regression quantiles can be substantially biased relative to the true population association. Statistical methods that recover population associations from case-control samples are known as "secondary analysis". Most of these methods are likelihood based and estimate only the genetic effect on the mean of the trait; they cannot be applied directly to obtain quantile estimates. To obtain consistent and efficient estimates of conditional quantiles, we propose a novel family of estimating equations and develop the necessary statistical tools for inference, variable selection and ranking. We will apply the developed methods to GWAS and sequencing studies to investigate genetic associations with human quantitative traits. The proposed work has great potential to deepen and expand existing knowledge of the genetic basis of quantitative traits.
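As a rough illustration of why quantile-based targets can differ from mean-based ones, the sketch below minimizes the standard check (pinball) loss that underlies quantile regression. It is not the estimating-equation approach proposed here and makes no correction for case-control sampling. The function names and the toy trait values are hypothetical, chosen only for illustration.

```python
def check_loss(residuals, tau):
    """Check (pinball) loss at quantile level tau, summed over residuals."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return sum(r * (tau - (1.0 if r < 0 else 0.0)) for r in residuals)


def best_constant_quantile(values, tau):
    """Return the observed value that minimizes the check loss at level tau."""
    # A minimizer of the check loss over constants is always attained
    # at one of the observed values, so a search over them suffices.
    return min(values, key=lambda c: check_loss([v - c for v in values], tau))


# Toy trait values (illustrative only, not study data).
trait = [1.0, 2.0, 3.0, 4.0, 100.0]

mean_fit = sum(trait) / len(trait)
median_fit = best_constant_quantile(trait, 0.5)
upper_fit = best_constant_quantile(trait, 0.9)
```

On these toy values the mean, the median and the 0.9 quantile are three quite different summaries, which is the sense in which an effect on the upper tail of a trait need not show up as an effect on its mean.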

Public Health Relevance

This project aims to develop new statistical methods for consistent and efficient estimation of the conditional quantiles of secondary traits in case-control samples. The developed methods will be applied to GWAS and sequencing studies to investigate the genetic basis of secondary complex traits. Once complete, the methods and their applications have great potential to deepen and expand existing knowledge in genetics, and to contribute significantly to the field of statistics as well.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Small Research Grants (R03)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Brooks, Lisa
Columbia University (N.Y.)
Biostatistics & Other Math Sci
Schools of Public Health
New York
United States
Song, Xiaoyu; Li, Gen; Zhou, Zhenwei et al. (2017) QRank: a novel quantile regression tool for eQTL discovery. Bioinformatics 33:2123-2130
Wei, Ying; Song, Xiaoyu; Liu, Mengling et al. (2016) Quantile Regression in the Secondary Analysis of Case-Control Data. J Am Stat Assoc 111:344-354
Song, Xiaoyu; Ionita-Laza, Iuliana; Liu, Mengling et al. (2016) A General and Robust Framework for Secondary Traits Analysis. Genetics 202:1329-1343