Risks of complex diseases, such as cancers, hypertension, diabetes, and schizophrenia, are determined by both genetic and environmental factors. Advances in human genome research have thus led to epidemiologic investigations not only of the effects of genes alone, but also of their effects in combination with environmental exposures. The case-control study design, which has been widely used in classical questionnaire-based epidemiologic studies, is now commonly employed to study the role of genes and gene-environment interactions in the etiology of complex diseases. Recently, a broad class of semiparametric retrospective-likelihood methods has been developed for the analysis of case-control genetic data in the presence of environmental factors. These methods exploit knowledge about the distribution of genetic variants in order to build estimators that are much more statistically efficient than other approaches, and are also statistically valid in the presence of incomplete genetic data, such as missing marker alleles and unknown haplotypes. Because this kind of methodology is not available in any commercial software, researchers have resorted to standard approaches, which lack statistical efficiency and sometimes validity. As a result, important gene-environment interactions are obscured, as are important main effects. The goal of this project is to develop Stata software to implement the semiparametric retrospective-likelihood and related methods. The software will accommodate missing genotypes, phase ambiguity, untyped markers, flexible disease-risk models with gene-gene and gene-environment interactions, genomewide association studies, population stratification, and models both with and without Hardy-Weinberg equilibrium. This tool will be highly useful to epidemiologists and geneticists in their search for genetic and environmental determinants of complex diseases.

Public Health Relevance

Risks of complex diseases, such as cancers, hypertension, diabetes, and schizophrenia, are determined by both genetic and environmental factors. Advances in human genome research have thus led to epidemiologic investigations not only of the effects of genes alone, but also of their effects in combination with environmental exposures. This project will implement novel and efficient statistical methods for the analysis of case-control genetic data in the presence of environmental factors, and thus bring into the mainstream better ways of detecting genetic effects and gene-environment interactions.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Project #
2R44HG004536-02
Application #
7668314
Study Section
Special Emphasis Panel (ZRG1-GGG-J (10))
Program Officer
Ramos, Erin
Project Start
2007-07-01
Project End
2011-04-30
Budget Start
2009-05-15
Budget End
2010-04-30
Support Year
2
Fiscal Year
2009
Total Cost
$374,955
Indirect Cost
Name
StataCorp LP
Department
Type
DUNS #
147662068
City
College Station
State
TX
Country
United States
Zip Code
77845