Nearly a decade ago, Risch and Merikangas proposed conducting genome-wide association scans. Although the cost was prohibitive at the time, they predicted that the technological barriers would eventually be overcome. With the advent of 500K chip-based or bead technologies, at a cost of about 0.2 cents per genotype, that prediction has become a reality. Nevertheless, such studies are still expensive to conduct, and numerous methodological challenges to their efficient and valid design remain. To address these issues, we convened a panel of 165 investigators from around the world at USC in April 2005. These discussions highlighted a number of study design and statistical analysis problems that we propose to continue working on as part of this Cooperative Agreement. Our team is also involved in planning and conducting several such studies of conditions including breast, colon, and prostate cancer and age-related macular degeneration. We anticipate that this research will both inform the conduct of these studies and be motivated by their needs (as well as those of the many similar projects at other institutions). In particular, we propose to focus on the following methodological issues: (1) tag SNP selection and haplotype-based methods incorporating both case-control association and case-case sharing comparisons; (2) multiple testing procedures for multistage sampling designs, including hierarchical models for prioritizing SNPs for further consideration using external genomic data; (3) family- vs. population-based studies and allowance for population stratification and admixture; and (4) gene-gene and gene-environment interactions. To investigate these problems, we will apply the methods to real data from our own studies (the Multiethnic Cohort and the Los Angeles Latino Eye Study of age-related macular degeneration), as well as to data available in public databases such as the HapMap Project. 
Since most genome-wide datasets are limited to relatively small samples and are not linked to any phenotype information, we will develop ways of using these real data to generate large populations with realistic degrees of genetic diversity, resembling that seen in the small samples. We will then sample from these populations to simulate replicate case-control data sets under known phenotype models, in order to investigate the statistical performance of alternative study designs and analysis methods.
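
The simulation strategy described above can be illustrated with a minimal sketch. It is not the project's actual method: it assumes that a large population can be approximated by pairing haplotypes resampled with replacement from a small reference panel (ignoring recombination and mutation), and that disease status follows a logistic model with a single causal SNP. All function names, parameter names, and the toy haplotypes are hypothetical.

```python
import math
import random


def expand_population(haplotypes, n_individuals, rng):
    """Create diploid genotypes (0/1/2 allele counts per SNP) by pairing
    haplotypes drawn with replacement from a small reference panel.
    Simplifying assumption: no recombination or mutation is modeled."""
    population = []
    for _ in range(n_individuals):
        h1 = rng.choice(haplotypes)
        h2 = rng.choice(haplotypes)
        population.append([a + b for a, b in zip(h1, h2)])
    return population


def assign_phenotypes(population, causal_index, baseline_log_odds,
                      log_odds_ratio, rng):
    """Assign disease status (1 = case, 0 = control) under an assumed
    logistic model that is log-additive in the causal allele count."""
    statuses = []
    for genotype in population:
        logit = baseline_log_odds + log_odds_ratio * genotype[causal_index]
        risk = 1.0 / (1.0 + math.exp(-logit))
        statuses.append(1 if rng.random() < risk else 0)
    return statuses


def sample_case_control(population, statuses, n_cases, n_controls, rng):
    """Draw one replicate case-control data set without replacement.
    Raises ValueError if the population holds too few cases or controls."""
    case_ids = [i for i, s in enumerate(statuses) if s == 1]
    control_ids = [i for i, s in enumerate(statuses) if s == 0]
    cases = [population[i] for i in rng.sample(case_ids, n_cases)]
    controls = [population[i] for i in rng.sample(control_ids, n_controls)]
    return cases, controls


if __name__ == "__main__":
    rng = random.Random(2007)
    # Hypothetical reference panel: four haplotypes over three SNPs.
    panel = [[0, 0, 1], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
    population = expand_population(panel, 5000, rng)
    statuses = assign_phenotypes(population, 0, -2.0, 0.5, rng)
    cases, controls = sample_case_control(population, statuses, 200, 200, rng)
    print(len(cases), len(controls))
```

Repeating the last sampling step many times against the same simulated population yields replicate data sets generated under a known phenotype model, against which alternative designs and analysis methods could be compared.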

Agency
National Institutes of Health (NIH)
Institute
National Institute of Environmental Health Sciences (NIEHS)
Type
Research Project--Cooperative Agreements (U01)
Project #
5U01ES015090-02
Application #
7280472
Study Section
Special Emphasis Panel (ZHG1-HGR-P (J1))
Program Officer
McAllister, Kimberly A
Project Start
2006-09-01
Project End
2009-07-31
Budget Start
2007-08-01
Budget End
2008-07-31
Support Year
2
Fiscal Year
2007
Total Cost
$309,460
Indirect Cost
Name
University of Southern California
Department
Public Health & Prev Medicine
Type
Schools of Medicine
DUNS #
072933393
City
Los Angeles
State
CA
Country
United States
Zip Code
90089
Franklin, Meredith; Vora, Hita; Avol, Edward et al. (2012) Predictors of intra-community variation in air quality. J Expo Sci Environ Epidemiol 22:135-47
Thomas, Duncan C (2012) Some surprising twists on the road to discovering the contribution of rare variants to complex diseases. Hum Hered 74:113-7
Thomas, Duncan C (2012) Genetic epidemiology with a capital E: where will we be in another 10 years? Genet Epidemiol 36:179-82
Quintana, Melanie A; Bernstein, Jonine L; Thomas, Duncan C et al. (2011) Incorporating model uncertainty in detecting rare variants: the Bayesian risk index. Genet Epidemiol 35:638-49
Li, Dalin; Lewinger, Juan Pablo; Gauderman, William J et al. (2011) Using extreme phenotype sampling to identify the rare causal variants of quantitative traits in association studies. Genet Epidemiol 35:790-9
Yang, Fan; Thomas, Duncan C (2011) Two-stage design of sequencing studies for testing association with rare variants. Hum Hered 71:209-20
Murcray, Cassandra E; Lewinger, Juan Pablo; Conti, David V et al. (2011) Sample size requirements to detect gene-environment interactions in genome-wide association studies. Genet Epidemiol 35:201-10
Chen, Gary K; Wei, Peng et al. (2011) Incorporating biological information into association studies of sequencing data. Genet Epidemiol 35 Suppl 1:S29-34
Li, Dalin; London, Stephanie J; Liu, Jinghua et al. (2011) Association of the calcyon neuron-specific vesicular protein gene (CALY) with adolescent smoking initiation in China and California. Am J Epidemiol 173:1039-48
Wilson, Melanie A; Baurley, James W; Thomas, Duncan C et al. (2010) Complex system approaches to genetic analysis: Bayesian approaches. Adv Genet 72:47-71

Showing the most recent 10 out of 20 publications