More than 180 susceptibility loci for breast cancer have been identified by genome-wide association studies (GWAS), mainly in Caucasian populations. However, many of these risk variants cannot be directly replicated in women of African ancestry, suggesting that the causal variants are yet to be identified. Polygenic risk scores (PRS), which aggregate common genetic variants identified by GWAS, have been developed to predict genetic risk of breast cancer for Caucasian women, but there is no validated PRS for African American women. Linkage disequilibrium in African ancestry populations is much less extensive than in Caucasian and Asian populations, which makes African ancestry populations ideal for finding causative variants once a breast cancer susceptibility locus has been localized. Therefore, we propose a comprehensive analytical study that leverages several types of existing genetic datasets for breast cancer, both those available to us and those in the public domain, to address three specific aims. First, we aim to conduct cross-ethnic fine-mapping analysis to narrow the lists of candidate causal variants at the 180+ breast cancer loci. We will compile and harmonize genetic data from studies of breast cancer in women of African ancestry, including 7,525 cases and 6,207 controls, and leverage the association results from Caucasians (>122,000 cases and >105,000 controls), East Asians (>14,000 cases and >13,000 controls), and Latinos (4,400 cases and 7,500 controls). We will use a Bayesian statistical method to directly incorporate multiple functional annotations for the top variants in each locus. Second, we aim to develop breast cancer polygenic risk score models in African Americans by leveraging functional annotations, linkage disequilibrium, and gene expression data. Several PRSs will be developed for overall breast cancer risk and by estrogen receptor status, cross-validated internally, and validated in external studies. Third, we aim to develop a breast cancer risk prediction model that combines genetic and non-genetic factors. The proposed study will make efficient use of several types of existing data through innovative integrative approaches and has the potential to advance the field by narrowing down the genetic regions containing causal variants. More importantly, the risk prediction model has good potential to translate knowledge from GWAS into the practice of breast cancer screening.
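
As an illustration of what "aggregating common genetic variants" means in a polygenic risk score, the following minimal sketch computes a PRS as a weighted sum of risk-allele dosages, with each weight taken as the log of a per-allele odds ratio. This is a generic textbook form, not the specific models proposed in this project; the function name, the three odds ratios, and the genotype are hypothetical values chosen only for illustration.

```python
import math


def polygenic_risk_score(dosages, log_odds_ratios):
    """Return the weighted sum of risk-allele dosages.

    dosages: number of risk alleles carried at each variant (0, 1, or 2).
    log_odds_ratios: per-allele weights, one per variant.
    """
    if len(dosages) != len(log_odds_ratios):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, log_odds_ratios))


# Hypothetical per-allele odds ratios for three variants (illustrative only).
odds_ratios = [1.10, 1.25, 0.95]
weights = [math.log(o) for o in odds_ratios]

# Hypothetical genotype for one woman: risk-allele counts at the same variants.
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, weights)

# Exponentiating the score gives the odds relative to a woman carrying
# no risk alleles at these variants, under a multiplicative model.
relative_odds = math.exp(score)
print(f"PRS = {score:.4f}, relative odds = {relative_odds:.4f}")
```

In this simple form the weights come straight from GWAS effect estimates; approaches that also use functional annotations or linkage disequilibrium, as described in the second aim, would change how the weights are derived rather than how the score is summed.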

Public Health Relevance

The primary objective of this study is to discover genetic alterations in genes that contribute to increased breast cancer risk. Using multiple types of genomic data, we aim to find the causal risk factors that are shared across populations, in order to gain insight into the mechanisms underlying breast cancer development. We will also develop risk assessment tools that include both genetic and non-genetic factors, so that this knowledge can be translated into better breast cancer screening for all women.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Cancer, Heart, and Sleep Epidemiology B Study Section (CHSB)
Program Officer: Martin, Damali
University of Chicago
Public Health & Prev Medicine
Schools of Medicine
United States