In stark contrast to Mendelian disorders, the majority of common variants associated with complex traits map to non-protein coding regions. Because the rules for interpreting the much larger non-protein coding portion of the genome are far less developed than the genetic code for coding sequence, identifying the gene(s) and causal alleles underlying non-Mendelian/complex traits presents a challenge. Given the rapidity with which genome-wide association studies (GWAS) are discovering regions associated with complex traits, gene and causal allele identification have become severe bottlenecks. The overall goal of this proposal is to outline a rigorous strategy to discover functionally causal variants and genes underlying complex traits. While the proposal focuses on breast cancer, the strategies are generic and can be applied to any non-protein coding locus. The central hypothesis is that cancer risk loci are regulatory elements, which control the level of expression of genes. Recent data convincingly demonstrate that GWAS loci are enriched for regulatory elements. This proposal seeks to utilize three powerful tools to identify causal genes and alleles: expression quantitative trait loci (eQTL), circular chromosome conformation capture (4C), and genome editing.
Aims 1 and 2 are logically and structurally similar. Each will identify enhancer-target gene interactions (using eQTL in Aim 1 and 4C/TALE-LSD1 in Aim 2), identify candidate causal variants using case-control fine mapping data intersected with epigenetic profiling, and perform genome editing on the candidate causal variants. The variant that affects the predetermined readout, namely changes in gene expression (Aim 1) or allele-specific expression (Aim 2), will be deemed a causal functional polymorphism. In parallel, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) evaluation of the candidate causal variants will be performed. Information from these assays will be integrated with genetic and epigenetic data to define the functionally causal variant.
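As an illustration only, the step of intersecting fine-mapped variants with epigenetic profiling could be sketched as below. This is a minimal sketch, not the proposal's actual pipeline: the `Variant` and `Peak` types, the function `variants_in_peaks`, and the example identifiers and coordinates are all hypothetical, and the sketch assumes 1-based variant positions and BED-style (0-based, half-open) peak intervals.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """A fine-mapped candidate variant (hypothetical record layout)."""
    rsid: str
    chrom: str
    pos: int  # 1-based genomic position


class Peak(NamedTuple):
    """An epigenetic peak, e.g. a putative regulatory element (hypothetical)."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def variants_in_peaks(variants: List[Variant], peaks: List[Peak]) -> List[Variant]:
    """Return the variants that fall inside at least one peak.

    A 1-based position `pos` lies in a half-open 0-based interval
    [start, end) exactly when start < pos <= end.
    """
    # Group peak intervals by chromosome so each variant is only
    # compared against peaks on its own chromosome.
    by_chrom = {}
    for peak in peaks:
        by_chrom.setdefault(peak.chrom, []).append((peak.start, peak.end))

    return [
        v for v in variants
        if any(start < v.pos <= end for start, end in by_chrom.get(v.chrom, []))
    ]


if __name__ == "__main__":
    # Made-up identifiers and coordinates, for illustration only.
    candidates = [
        Variant("rs_example_1", "chr1", 150),
        Variant("rs_example_2", "chr1", 500),
    ]
    regulatory = [Peak("chr1", 100, 200)]
    for v in variants_in_peaks(candidates, regulatory):
        print(v.rsid, v.chrom, v.pos)
```

Variants that survive such a filter would only be candidates. In the design described above, the functional call still rests on the genome editing readouts.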
Aim 3 will test the target genes in cell-based models to understand their influence on cancer-related phenotypes, such as proliferation and invasion. At the completion of this project, we fully anticipate that we will have begun to unravel the genes and pathways that initiate human breast cancer. Discovering the mechanisms underlying breast cancer will not only inform the biology of this disease, but may also reveal opportunities to intervene more rationally in treatment and prevention.

Public Health Relevance

To date, most genetic risk factors for complex traits are located outside of known genes. This proposal focuses on developing strategies for identifying the actual causal variants underlying breast cancer. Employing these strategies will lead to a more profound understanding of the genetic mechanisms that drive breast cancer.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Cancer Genetics Study Section (CG)
Program Officer: Nelson, Stefanie A
Dana-Farber Cancer Institute
United States