When case-parent triads are used to study the association of single nucleotide polymorphisms (SNPs) with disease, the phenotype of the child is used but the phenotypes of the parents are ignored, even though all three family members are genotyped. When parental phenotypes for the disease under study are available, including them in the analysis brings more information to bear on disease-gene associations. We have developed and evaluated new approaches that use parental phenotypes together with the usual data from case-parent studies to increase power for detecting associations. Our approach uses parental phenotypes to assess association independently of the usual test based on offspring genotypes. Moreover, our procedure for using parental phenotypes is robust to bias from hidden genetic population structure because our statistical model employs strata defined by the pair of parental genotypes. Our simulations support this claim of robustness and show that incorporating information about parental phenotypes can enhance power compared to using offspring phenotypes alone.

We are developing methods for using exposures measured in pooled specimens from several individuals, together with genotypes measured separately on each individual, to study gene-environment interactions. Suppose one has a case-control study and has genotyped each individual at a panel of SNPs. Suppose that one also has biological specimens (e.g., serum or urine) from the same individuals but lacks the budget to assay each individual specimen for an exposure of interest. Pooling specimens and assaying the resulting pools not only saves assay costs but also preserves specimen volume for future uses. In the past, we developed methods for analyzing case-control studies with exposures measured in pooled specimens. Those methods assume, reasonably, that the measured value on the pooled specimen is the average of the values for the individual specimens. With those methods, testing gene-environment interactions at a single SNP required creating specimen pools within strata of individuals who all had the same genotype for that SNP. To study gene-environment interactions for a panel of SNPs, our previous methods would therefore require creating new pooled specimens for each SNP studied, and the potential savings in assay costs would disappear. The approach that we are developing regards the individual measurements as missing data and uses the pooled specimens in a principled way to impute those missing data. With a given set of imputed data in hand, we can use standard statistical methods for case-control data to estimate gene-environment interactions. In practice, we use a multiple-imputation approach: creating multiple sets of imputed data, doing a case-control analysis for each set, and combining the results from the multiple analyses. This approach has shown some promise, but some problems remain to be resolved, and work is ongoing.

Identification of causative SNPs in a genome-wide study can be challenging when individual SNPs have small marginal effects, because testing thresholds must reflect the large number of SNPs under study. For complex diseases, particular combinations of SNPs may dramatically increase risk, a kind of epistasis or gene-gene interaction. We are currently investigating the use of a machine learning technique to discover sets of SNPs that together cause disease (causative SNPs) in case-parent data. First, we devised a way to use actual case-parent triad genotypes to create simulated genome-wide data sets that reflect realistic linkage disequilibrium structure and are seeded with known sets of causative SNPs. Second, we implemented an existing stochastic search algorithm (called GA-KNN), which is based on an evolutionary algorithm, to find multiple sets of k SNPs that are predictive of disease (here k is a small number, say 2 or 4). By cataloguing the SNPs that appear most frequently among the sets that are predictive of disease, we hope to uncover the sets of causative SNPs. In preliminary trials on simulated data seeded with two interacting sets of four SNPs each, our approach shows promise. In ongoing work, we are attempting to speed up the algorithm and to see whether the promising performance is maintained in more complex situations. (See also Z01 ES040007; PI Clare Weinberg. Min Shi is also a within-lab collaborator on this project; her time is allocated in Weinberg's project but not in this one.)
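The multiple-imputation step described above (analyze each imputed data set, then combine the results) is often carried out with Rubin's rules. The abstract does not say which combining rule the project uses, so the following is only a minimal sketch under that assumption. The function name `pool_rubin` and the numbers in the usage lines are hypothetical; each input pair is taken to be a log-odds-ratio interaction estimate and its squared standard error from one imputed data set.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules (assumed rule).

    estimates: one point estimate per imputed data set
    variances: the matching squared standard errors
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled_estimate = mean(estimates)   # average of the m estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical results from three imputed data sets.
estimate, std_err = pool_rubin([0.4, 0.5, 0.6], [0.04, 0.04, 0.04])
print(f"pooled estimate = {estimate:.3f}, pooled SE = {std_err:.3f}")
```

The between-imputation term inflates the standard error to reflect uncertainty about the unobserved individual-level exposures, which a single imputed data set would understate.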

Support Year: 20
Fiscal Year: 2015
Name: U.S. National Inst of Environ Hlth Scis
Shi, Min; Umbach, David M; Weinberg, Clarice R (2015) Using parental phenotypes in case-parent studies. Front Genet 6:221
Shi, Min; Umbach, David M; Weinberg, Clarice R (2014) Disentangling pooled triad genotypes for association studies. Ann Hum Genet 78:345-56
Weinberg, Clarice R; Shi, Min; DeRoo, Lisa A et al. (2014) Asymmetry in family history implicates nonstandard genetic mechanisms: application to the genetics of breast cancer. PLoS Genet 10:e1004174
Deroo, Lisa A; Bolick, Sophia C E; Xu, Zongli et al. (2014) Global DNA methylation and one-carbon metabolism gene polymorphisms and the risk of breast cancer in the Sister Study. Carcinogenesis 35:333-8
Shi, Min; Umbach, David M; Weinberg, Clarice R (2013) Case-sibling studies that acknowledge unstudied parents and permit the inclusion of unmatched individuals. Int J Epidemiol 42:298-307
Weinberg, Clarice R; Shi, Min; Umbach, David M (2011) A sibling-augmented case-only approach for assessing multiplicative gene-environment interactions. Am J Epidemiol 174:1183-9
Weinberg, Clarice R; Shi, Min; Umbach, David M (2011) Re.: "Genetic association and gene-environment interaction: a new method for overcoming the lack of exposure information in controls". Am J Epidemiol 173:1346-7; author reply 1347-8
Shi, Min; Umbach, David M; Weinberg, Clarice R (2011) Family-based gene-by-environment interaction studies: revelations and remedies. Epidemiology 22:400-7
Shi, Min; Umbach, David M; Weinberg, Clarice R (2010) Testing haplotype-environment interactions using case-parent triads. Hum Hered 70:23-33
Vermeulen, Sita H; Shi, Min; Weinberg, Clarice R et al. (2009) A hybrid design: case-parent triads supplemented by control-mother dyads. Genet Epidemiol 33:136-44
