Breast cancer is the most commonly diagnosed malignancy in the United States. The age-adjusted mortality rate of this cancer is more than 40% higher in African Americans (AAs) than in whites, for reasons that are poorly understood. Since 2007, genome-wide association studies (GWAS) conducted in populations of Asian and European ancestry have identified nearly 100 susceptibility loci for this cancer. However, only a few of the initially identified risk variants can be directly replicated in AAs, owing to small sample sizes in previous studies and racial differences in genetic architectures and genetic/environmental modifiers. GWAS are often not equipped to study structural variants and are inefficient for capturing low-frequency variants. These variants, although virtually uninvestigated to date, are believed to contribute substantially to the heritability of breast cancer and other complex traits, particularly in African-ancestry populations. Furthermore, compared with Asian- and European-ancestry populations, the African-ancestry genome is much more heterogeneous and thus more informative, particularly as we expand the scope of genetic studies from common to less-common variants using next-generation sequencing technology. Herein, we propose a large consortium study in AAs to systematically search the whole genome to discover novel genetic susceptibility factors for breast cancer and further evaluate the influence of germline risk variants on breast cancer biology. Nearly 20,000 AA breast cancer patients and an equal number of controls will be included in this study. In Stage 1, we propose to sequence the whole genome for 1,200 breast cancer cases and 600 controls for association analyses. We will then use these sequencing data, along with data from other sources, to build a novel, comprehensive reference panel for imputation and meta-analysis of approximately 6,300 cases and 6,300 controls genotyped in four previous GWAS conducted in African-ancestry populations. 
We will use publicly available genetic data, including functional genomic data, to enhance the ability of the two aforementioned analyses to identify promising breast cancer susceptibility genes and variants for replication. In Stage 2, we will replicate approximately 60,000 promising variants in 5,500 cases and 5,500 controls. Genes/variants that show a promising association in Stage 2 will be evaluated further in Stage 3, which comprises two sub-stages (3A and 3B), in approximately 7,500 cases and 7,500 controls. Finally, we will use gene expression signatures to evaluate how germline risk variants identified in this study and previous studies affect the major signaling pathways of breast cancer. This proposed study will generate critically needed data in AAs to improve the understanding of the genetics, biology, and etiology of breast cancer.

Public Health Relevance

Mortality rates for breast cancer are more than 40% higher in African Americans than in whites, for reasons that are poorly understood. Genetic factors play a major role in the etiology of this common malignancy, but few studies have been conducted in African Americans. In this application, we propose a large consortium to search the whole genome for genetic risk factors for breast cancer using novel technologies and study methods, and to further evaluate how genetic factors affect cancer biology and influence racial disparities in breast cancer risk.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-PSE-U (90)S)
Program Officer
Martin, Damali
Vanderbilt University Medical Center
United States