Despite the successes of sequence-based genetic association and functional genomic studies, the X chromosome, which is enriched for disease-relevant genes, is frequently understudied. Here, we propose innovative approaches to identify regulatory variants and enhance association analysis for the X chromosome. Functional genomics and disease association studies for the X chromosome are challenging, in part due to the complexities of X-chromosome inactivation (XCI), the dosage compensation process that epigenetically inactivates one X in females. Due to XCI mosaicism, the assignment of active X (Xa) and inactive X (Xi) varies between cells, which makes it difficult to infer XCI states and estimate Xi expression levels. Furthermore, while the dosage of most X-linked genes is equalized between the sexes by XCI, upward of 20% of genes escape XCI and are expressed from both Xs. Importantly, XCI escape differs between individuals. This biological complexity increases gene expression heterogeneity in females and makes X-linked associations difficult to analyze properly. As a result, the genomic architecture of XCI escape remains poorly understood. Association analysis on the X chromosome is underpowered, and its results are difficult to interpret. To improve X chromosome analyses, we propose to quantify Xa/Xi expression from RNA-seq datasets, study Xi expression as a heritable trait, identify genetic variants that influence Xi expression levels, and incorporate the inferred XCI states into association analysis. Specifically, we will quantify Xi expression levels from population-scale bulk RNA-seq data (Aim 1). These methods will maximize the utility of broadly available RNA-seq datasets in diverse tissue types from normal and disease samples. They will greatly complement single-cell RNA-seq data, which are typically available for only a very small number of samples and are hence inadequate for assessing subtle inter-individual differences in human disease studies.
Next, to understand genetic influences on XCI escape, we propose a Gaussian hierarchical model that simultaneously detects associations with Xa and Xi expression levels (Xa-/Xi-QTL) and estimates the heritability of Xi expression. We further propose to model inferred XCI states and their spatial clustering patterns in eQTL mapping, which greatly improves power compared with naive approaches that ignore XCI states (Aim 2). Finally, we will develop more powerful methods that integrate inferred XCI states into genotype-phenotype association analysis of X-linked genes (Aim 3). In our preliminary analysis, we demonstrated for the first time that XCI escape has significant heritability. These methods will allow comprehensive assessment of the impact of XCI on human complex traits. We will apply our methods to some of the largest datasets for a variety of complex traits, including lupus, diabetes and addiction. Together, we expect the proposed research to significantly improve functional genomics and disease association analysis of the X chromosome.
X-linked genes have been clearly implicated in many complex traits with significant public health relevance. We propose to develop novel computational approaches to understand the mechanisms of X chromosome inactivation (XCI) maintenance and escape, and to incorporate knowledge of XCI to enable more powerful discovery of complex trait genes. Results from this study have the potential to provide new knowledge of XCI biology, implicate novel drug targets and guide the design of personalized therapeutics.