In stark contrast to Mendelian disorders, the majority of common variants associated with complex traits map to non-protein coding regions. Because the rules governing the much larger non-protein coding portion of the genome are far less well understood than the protein-coding genetic code, identifying the gene(s) and causal alleles underlying non-Mendelian (complex) traits presents a challenge. Given the rapidity with which genome-wide association studies (GWAS) are discovering regions associated with complex traits, gene and causal allele identification have become severe bottlenecks. The overall goal of this proposal is to outline a coherent set of strategies to discover causal genes and alleles underlying complex traits. While the proposal focuses on prostate cancer, the strategies are generic and can be applied to any non-protein coding locus. The central hypothesis is that prostate cancer risk loci are regulatory elements. Recent data convincingly demonstrate that GWAS loci are enriched for regulatory elements, which can control the expression level of genes. This can be tested by examining the correlation between the number of copies of an allele an individual carries (0, 1, or 2) and transcript levels. Variants that control RNA levels are often referred to as expression quantitative trait loci (eQTLs). The existence of an eQTL-target gene relationship provides a strong foundation upon which to pursue gene and causal allele identification.
Aim 1 will discover eQTL/transcript pairs for all known prostate cancer risk alleles in prostate tissue from 500 men. The highly quantitative NanoString platform will be used to measure transcript levels.
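The per-variant eQTL test described above can be illustrated with a minimal sketch, assuming a simple additive model in which transcript level is regressed on allele dosage (0, 1, or 2). The function name `eqtl_association` and the toy values are hypothetical and are not taken from the proposal; a real analysis would also adjust for covariates and correct for multiple testing.

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Regress transcript level on allele dosage under an additive model.

    dosages: copies of the risk allele per individual (0, 1, or 2).
    expression: transcript level per individual, in the same order.
    Returns (slope, intercept, pearson_r).
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have equal length")
    n = len(dosages)
    if n < 3:
        raise ValueError("need at least three individuals")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))

    if sxx == 0 or syy == 0:
        raise ValueError("dosage and expression must both vary")

    slope = sxy / sxx                  # change in expression per allele copy
    intercept = mean_e - slope * mean_g
    pearson_r = sxy / sqrt(sxx * syy)  # strength of the linear association
    return slope, intercept, pearson_r


# Hypothetical toy data: six individuals, two per genotype class.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, intercept, r = eqtl_association(toy_dosages, toy_expression)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A nonzero slope indicates that each additional copy of the allele shifts the transcript level, which is the relationship an eQTL/transcript pair captures.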
Aim 2 will employ functional assays to ensure that the genes discovered in Aim 1 are relevant to prostate cancer biology. The functional assays will be performed using nuclease technology, a novel strategy to upregulate and downregulate genes. Transcription activator-like effector nuclease (TALEN) technology has the ability to create DNA sequence modifications in a directed manner and with exquisite precision directly in a genomic location of choice. This technology radically differs from more traditional methods in that it creates stable and heritable changes in a location targeted by the investigator.
Aim 3 will focus on causal allele identification for loci demonstrating an eQTL/transcript association. An integrative strategy using genetic and epigenetic approaches will be used to identify a candidate set of causal alleles. These candidates will then be functionally tested using TALENs to engineer specific genetic modifications in appropriate cell lines. Modifications at the causal allele site are expected to influence transcription. At the completion of this project, we fully anticipate that we will have begun to unravel the genes and pathways that initiate human prostate cancer. Discovering the mechanisms underlying prostate cancer will not only inform the biology of this disease, but may also reveal opportunities for more rational intervention in treatment and prevention.

Public Health Relevance

Most genetic risk factors for complex traits identified to date are located outside of known genes. Our study focuses on developing strategies for identifying the causal genes and alleles underlying non-protein coding loci associated with complex traits. Employing these strategies will lead to a more profound understanding of the genetic mechanisms and alleles that drive human biology.

National Institutes of Health (NIH)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Krasnewich, Donna M
Dana-Farber Cancer Institute
United States