Colorectal cancer (CRC) is the second leading cause of cancer death in the US. Linkage studies and genome-wide association studies (GWAS) have successfully identified high-penetrance mutations, such as those in APC or the DNA mismatch-repair genes, as well as low-penetrance variants such as those at 8q24 and in SMAD7. However, these variants explain only a fraction of the heritability of CRC. This is not surprising, as contributions from large classes of genetic variation, specifically less frequent and rare single nucleotide variants (SNVs) with allele frequencies of 0.1-5%, insertions/deletions (indels), and copy number variants (CNVs), have not been systematically investigated across the genome. These genetic variants are predicted to have stronger effect sizes than common low-penetrance variants and are postulated to explain a substantial proportion of the heritability of CRC. To comprehensively identify these variants across the genome, we propose to use next-generation technology to sequence the whole genome at 12x coverage in 2,123 high-risk CRC cases and 2,123 controls (Aim 1.1). These cases and controls will be selected from our existing Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO; U01CA137088, PI: Peters) of 15 well-characterized prospective cohorts and case-control studies. We demonstrate that combining whole genome sequence data with imputation using existing GWAS data in large sets of case-control studies allows a powerful and efficient screen for CRC susceptibility loci. This method is particularly well suited to identifying less frequent and rare SNVs, indels, and CNVs. Accordingly, in Aim 1.2 we will use the sequencing data from Aim 1.1 to impute ~20M variants in an additional 8,958 CRC cases and 10,212 controls with existing GWAS data. We will test the associations between CRC risk and variants (sequenced and imputed) in a total of 11,081 cases and 12,335 controls.
In Aim 1.3, we will replicate the most promising loci by genotyping 3,000 variants in 8,827 independent CRC cases and 8,595 controls.
In Aim 2, we will investigate gene-environment interactions for directly sequenced and imputed variants, utilizing GECCO studies, which have detailed clinical and epidemiologic data that have already been harmonized across studies. To improve the power for Aims 1 and 2, we will apply novel statistical methods. This project brings together a highly qualified, multidisciplinary team of investigators with expertise in CRC research, biostatistics, population and statistical genetics, epidemiology, and next-generation sequencing. We expect to identify several novel CRC susceptibility variants with effect sizes larger than previous GWAS findings. These results will improve our understanding of which genes influence CRC risk. Such knowledge about the underlying biology could have a long-term impact on screening, treatment, and disease prevention.
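The sample sizes quoted for Aims 1.1 through 1.3 can be tallied with a short sketch. The stage labels and the helper name `combine` are hypothetical and used only for illustration; the counts are the ones stated in the abstract.

```python
# Sample sizes as stated in the abstract: (cases, controls) per stage.
# The dictionary keys are illustrative labels, not names used by the project.
STAGES = {
    "aim_1_1_sequenced": (2123, 2123),
    "aim_1_2_imputed": (8958, 10212),
    "aim_1_3_replication": (8827, 8595),
}


def combine(*names):
    """Sum cases and controls over the named stages."""
    cases = sum(STAGES[name][0] for name in names)
    controls = sum(STAGES[name][1] for name in names)
    return cases, controls


# Discovery set: sequenced (Aim 1.1) plus imputed (Aim 1.2) samples.
discovery = combine("aim_1_1_sequenced", "aim_1_2_imputed")
# Independent replication set (Aim 1.3).
replication = combine("aim_1_3_replication")

print("discovery cases/controls:", discovery)      # (11081, 12335)
print("replication cases/controls:", replication)  # (8827, 8595)
```

The discovery totals match the 11,081 cases and 12,335 controls given for the association tests.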

Public Health Relevance

This multidisciplinary effort will investigate whether different types of genetic variation, including rare variants and structural variation, influence colorectal cancer risk in humans. Specifically, we will examine genetic variants across the entire genomes of colorectal cancer cases and controls to identify new genetic risk factors for colorectal cancer. Findings from this study will improve our knowledge of the full spectrum of genes that affect the risk of this severe disease.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZRG1-PSE-Q (02))
Program Officer
Mechanic, Leah E
Fred Hutchinson Cancer Research Center
United States
Rashkin, Sara; Jun, Goo; Chen, Sai et al. (2017) Optimal sequencing strategies for identifying disease-associated singletons. PLoS Genet 13:e1006811
Lindström, Sara; Finucane, Hilary; Bulik-Sullivan, Brendan et al. (2017) Quantifying the Genetic Correlation between Multiple Cancer Types. Cancer Epidemiol Biomarkers Prev 26:1427-1435
Zhao, Wei; Chen, Ying Qing; Hsu, Li (2017) On estimation of time-dependent attributable fraction from population-based case-control studies. Biometrics 73:866-875
Bien, Stephanie A; Auer, Paul L; Harrison, Tabitha A et al. (2017) Enrichment of colorectal cancer associations in functional regions: Insight for using epigenomics data in the analysis of whole genome sequence-imputed GWAS data. PLoS One 12:e0186518
Dimitrakopoulou, Vasiliki I; Tsilidis, Konstantinos K; Haycock, Philip C et al. (2017) Circulating vitamin D concentration and risk of seven cancers: Mendelian randomisation study. BMJ 359:j4761
Yang, Baiyu; Thrift, Aaron P; Figueiredo, Jane C et al. (2016) Common variants in the obesity-associated genes FTO and MC4R are not associated with risk of colorectal cancer. Cancer Epidemiol 44:1-4
Phipps, Amanda I; Passarelli, Michael N; Chan, Andrew T et al. (2016) Common genetic variation and survival after colorectal cancer diagnosis: a genome-wide analysis. Carcinogenesis 37:87-95
McCarthy, Shane; Das, Sayantan; Kretzschmar, Warren et al. (2016) A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet 48:1279-83
Kocarnik, Jonathan M; Chan, Andrew T; Slattery, Martha L et al. (2016) Relationship of prediagnostic body mass index with survival after colorectal cancer: Stage-specific associations. Int J Cancer 139:1065-72
Gong, Jian; Hutter, Carolyn M; Newcomb, Polly A et al. (2016) Genome-Wide Interaction Analyses between Genetic Variants and Alcohol Consumption and Smoking for Risk of Colorectal Cancer. PLoS Genet 12:e1006296
