Genome-wide association studies typically compare cases and controls at single autosomal SNPs, under the implicit assumption that heritable effects are attributable to inherited autosomal genetic variants. Four nonstandard genetic mechanisms could also be involved: sex-linked traits; maternally-mediated effects, in which the mother influences the development of her fetus during gestation and thereby its later risk; mutations in mitochondrial DNA; and parent-of-origin effects. Each of these nonstandard mechanisms can cause asymmetry in family history data, which can be studied even in the absence of any genotype data. In one project, we are estimating the extent of asymmetry that such mechanisms would produce in family history data. We applied this strategy to family history data from our large study of women, each of whom had a sister diagnosed with breast cancer (the NIEHS Sister Study), and found evidence that the maternal grandmothers of young-onset (under age 50) breast cancer cases were more likely to have had breast cancer than were their paternal grandmothers. This suggests that there may be maternally-mediated genetic risk factors for breast cancer, that imprinted genes may be related to risk, or that mitochondrial variants play a role. Epigenetics could also be important for breast cancer. A particularly important design we are now considering involves a "tetrad" structure: one affected and one unaffected offspring, together with their two parents. This design has been implemented in the Two Sister Study (funded in part by Susan G. Komen for the Cure), which is assessing the joint role of genetic and environmental risk factors in young-onset (under age 50) breast cancer. The discordant sib pair allows estimation of the effects of exposures, while the embedded case-parent triad allows detection of haplotypes that confer either protection or risk.
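As an illustration of how asymmetry in family history can be examined without genotype data, the comparison of maternal and paternal grandmothers can be framed as a paired test on discordant families. The sketch below is a generic exact McNemar-type (binomial sign) test, not necessarily the analysis used in the Sister Study; the function name and the counts are hypothetical.

```python
from math import comb


def exact_mcnemar_p(only_maternal: int, only_paternal: int) -> float:
    """Two-sided exact p-value for asymmetry among discordant families.

    only_maternal: families in which only the maternal grandmother was affected
    only_paternal: families in which only the paternal grandmother was affected
    Under no asymmetry, each discordant family is equally likely to fall
    in either group, so the split follows Binomial(n, 0.5).
    """
    n = only_maternal + only_paternal
    if n == 0:
        return 1.0
    k = min(only_maternal, only_paternal)
    # Probability of a split at least this lopsided in one direction
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for a two-sided test, capped at 1
    return min(1.0, 2 * one_tail)


# Hypothetical counts, for illustration only (not study data)
p_value = exact_mcnemar_p(only_maternal=10, only_paternal=0)
```

Families in which both or neither grandmother was affected carry no information about asymmetry in this test, which is why only the discordant counts enter the calculation.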
Analyzed together, the tetrad should provide a powerful design for assessing gene-by-environment interaction. We have been developing and evaluating methods for use with the tetrad design. The Two Sister Study completed enrollment of nuclear families in which one daughter developed breast cancer before age 50 and the other daughter is unaffected. This is described under a separate project. Inherited genotypes, together with tumor characteristics, will need to be explored to investigate factors that predict the clinical course following treatment, and improved statistical methods will also need to be developed in that context. We are undertaking a genome-wide association study based on these data through a contract with the Center for Inherited Disease Research at Johns Hopkins, and will be able to explore gene-by-environment effects on risk of young-onset breast cancer as well as maternally-mediated effects and possible parent-of-origin effects on risk. The genotype data are expected at the end of September. The Illumina platform to be used is the human OmniExpress plus Exome array, and the exome typing will require the development of further methods appropriate for rare alleles. We are also participating in the GAME-ON consortium, which will provide additional SNPs from the newly developed oncochip. Together with a graduate student from UNC Biostatistics, Alison Wise, we are working on a machine-learning approach to finding complex epistatic and gene-by-environment interactions based on case-parent triads. We downloaded case-parent triad data on oral clefts from dbGaP, sanitized them of real effects, and are using those genomes to generate simulated case-parent triad data with known GxGxGxG interactions. We are working to develop an algorithm that can search through the enormous space of 3-way choices of SNPs from the GWAS data and identify the right multi-SNP model, even when the attributable risk is very small.
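To give a sense of why the space of multi-SNP models is described as enormous, the number of unordered SNP subsets of a given size can be counted directly. This is a minimal sketch; the function name and the SNP count of 700,000 are illustrative assumptions, not figures taken from the study.

```python
from math import comb


def n_snp_subsets(n_snps: int, order: int) -> int:
    """Number of unordered sets of `order` distinct SNPs chosen from `n_snps`."""
    return comb(n_snps, order)


# Hypothetical array size, for illustration only
n_three_way = n_snp_subsets(700_000, 3)
```

With an array of that assumed size, the count of 3-way subsets is on the order of 10^16, so exhaustive evaluation of every candidate model is impractical and a guided search is needed.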
We are also assessing the performance of our new method for identifying risk-related variants on the X chromosome. Our method, the PI-XLRT, makes robust use of parental information in addition to the transmission distortion, and thus uses the data more efficiently than existing methods do. A paper on identifying risk-related variants on the X chromosome is in preparation.
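For background on the transmission distortion mentioned above, the classic autosomal transmission disequilibrium test (TDT) compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. The sketch below shows only that standard statistic; it is not the PI-XLRT, whose details are not given here, and the function names and counts are hypothetical.

```python
from math import erfc, sqrt


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """Classic TDT chi-square statistic (1 df) from heterozygous-parent counts."""
    n = transmitted + untransmitted
    if n == 0:
        return 0.0
    return (transmitted - untransmitted) ** 2 / n


def chi2_1df_p(stat: float) -> float:
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    # If Z is standard normal, P(Z^2 > x) = erfc(sqrt(x / 2))
    return erfc(sqrt(stat / 2))


# Hypothetical transmission counts, for illustration only
stat = tdt_statistic(transmitted=60, untransmitted=40)
p_value = chi2_1df_p(stat)
```

Because the comparison is made within families, this kind of test is robust to population stratification, but it uses only transmissions from heterozygous parents and ignores other parental information.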

Shi, Min; Umbach, David M; Weinberg, Clarice R (2014) Disentangling pooled triad genotypes for association studies. Ann Hum Genet 78:345-56
Weinberg, Clarice R; Shi, Min; DeRoo, Lisa A et al. (2014) Asymmetry in family history implicates nonstandard genetic mechanisms: application to the genetics of breast cancer. PLoS Genet 10:e1004174
Kistner, Emily O; Shi, Min; Weinberg, Clarice R (2009) Using cases and parents to study multiplicative gene-by-environment interaction. Am J Epidemiol 170:393-400