While common variants are known to contribute additively to variation in gene expression, there has been limited statistical evidence of gene-by-environment (GxE) interactions in humans. This is because even the largest expression quantitative trait loci (eQTL) studies have little statistical power to detect GxE interactions, given the large number of segregating loci and the extensive variability in environmental exposure. We hypothesize that integrating multiplexed perturbations with single-cell RNA-sequencing is an efficient strategy for mapping GxE interactions in large population cohorts. However, current approaches do not scale to sequencing 10^7 cells across 10^4 samples (i.e., 10^3 donors by 10 conditions), the scale needed for sufficiently powered perturbation screens in human cohorts. In this proposal, we will first develop a cost-effective single-cell RNA-sequencing approach called DIT-seq that reduces the cost of sequencing to $0.06/cell (Aim 1). We will then develop strategies for encoding environmental perturbations using sample multiplexing to map and characterize GxE interactions in the human immune response (Aim 2). Finally, we will develop a new statistical model and a computational pipeline for efficient hypothesis testing using tens of millions of cells (Aim 3). The experimental and computational technologies proposed have the potential to create fundamentally new ways to study genotype-phenotype relationships. The biological insights gained could shed light on the genetic architecture of gene expression and facilitate the interpretation of disease associations from large-scale genome and exome sequencing studies.
We propose to develop scalable experimental and computational approaches to efficiently perturb, profile, and analyze tens of millions of single cells. We will apply these approaches to study the interaction between common genetic variants and in vitro stimulation mimicking environmental exposure in the human immune response. The proposed technologies will enable perturbation screens in large population cohorts, and the results will shed new light on how sequence variants interact with the environment to determine variation in human traits.