This project covers DCEG whole genome scan research and other activities that were originally part of the Genes, Environment and Health Initiative (GEI). Current research activities are summarized below.
This work was originally based on a GEI grant to conduct a whole genome scan of lung cancer and the smoking phenotype in more than 5,500 subjects from the Environmental And Genetic Lung Cancer Etiology (EAGLE) study and the lung cancer etiology arm of the Prostate, Lung, Colorectal and Ovarian Cancer (PLCO) Screening Trial. Additional GWAS data were generated in the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study. All of these data were eventually incorporated into a series of publications that describe the genetic factors associated with lung cancer and its subgroups (see below).
These data have also supported studies of statistical methods, specifically studies of the properties of procedures for case-control genome-wide association studies (CCGWASs) that select the SNPs whose chi-square trend statistics are largest (equivalently, whose corresponding p-values are smallest). We showed that, for rare diseases, association tests for SNPs are independent if the SNP genotypes are independent in the source population. This result allowed us to develop analytic and simulation techniques to study CCGWASs. These analyses showed that large samples are needed to achieve a high detection probability (the chance that a true disease SNP appears in the top ranks of chi-square values).
A large genome-wide association analysis of lung cancer was conducted that included one population-based study (EAGLE) and three cohort studies (PLCO, ATBC and CPS-II). Three major genomic loci, on chromosomes 15q25, 5p15 and 6p21, have been confirmed in association with lung cancer risk. A locus on chromosome 5p15 was distinctly associated with risk of adenocarcinoma. A large meta-analysis including over 33,000 subjects from several American and European studies has been conducted to confirm these results.
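The notion of detection probability used in the CCGWAS methods work above can be illustrated with a small simulation. The sketch below is not the project's actual procedure; it is a minimal example under stated assumptions: SNPs are mutually independent, control genotypes follow Hardy-Weinberg proportions, case genotypes at a single disease SNP are reweighted by a per-allele odds ratio (a rare-disease approximation), and SNPs are ranked by the Cochran-Armitage trend chi-square. All function names, parameter names and default values are hypothetical choices made for illustration.

```python
import random


def trend_chi2(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 df) for a 2x3 genotype table."""
    R = sum(case_counts)       # number of cases
    S = sum(control_counts)    # number of controls
    N = R + S
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    sum_wr = sum(w * r for w, r in zip(scores, case_counts))
    sum_wn = sum(w * n for w, n in zip(scores, totals))
    sum_w2n = sum(w * w * n for w, n in zip(scores, totals))
    denom = R * S * (N * sum_w2n - sum_wn ** 2)
    if denom == 0:             # monomorphic SNP or an empty group
        return 0.0
    return N * (N * sum_wr - R * sum_wn) ** 2 / denom


def genotype_freqs(maf, odds_ratio=1.0):
    """Hardy-Weinberg genotype frequencies, reweighted by a per-allele
    odds ratio (assumed rare-disease approximation for cases)."""
    q = 1.0 - maf
    weights = [q * q, 2 * maf * q * odds_ratio, maf * maf * odds_ratio ** 2]
    total = sum(weights)
    return [w / total for w in weights]


def sample_counts(rng, freqs, n):
    """Draw n genotypes (0, 1 or 2 minor alleles) and tabulate them."""
    draws = rng.choices((0, 1, 2), weights=freqs, k=n)
    return [draws.count(g) for g in (0, 1, 2)]


def detection_probability(n_per_group, n_snps=100, top=5, maf=0.3,
                          odds_ratio=1.3, reps=40, seed=0):
    """Estimate the chance that the one disease SNP ranks among the
    `top` largest trend statistics out of `n_snps` independent SNPs."""
    rng = random.Random(seed)
    null_freqs = genotype_freqs(maf)
    case_freqs = genotype_freqs(maf, odds_ratio)
    hits = 0
    for _ in range(reps):
        disease_stat = trend_chi2(
            sample_counts(rng, case_freqs, n_per_group),
            sample_counts(rng, null_freqs, n_per_group))
        larger = 0             # null SNPs that outrank the disease SNP
        for _ in range(n_snps - 1):
            stat = trend_chi2(
                sample_counts(rng, null_freqs, n_per_group),
                sample_counts(rng, null_freqs, n_per_group))
            if stat > disease_stat:
                larger += 1
        if larger < top:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(n, detection_probability(n))
```

Running the script prints the estimated detection probability for three sample sizes per group. Under these assumptions one would expect the estimate to rise with sample size, in line with the conclusion that large samples are needed; a real genome-wide scan involves far more SNPs and smaller effects than this toy setting.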
Analyses of the association between genetic variants and lung cancer risk by subgroups of interest, including smoking status, early age at onset, gender, histology, family history of lung cancer and stage, are ongoing. Analyses of genetic determinants of several smoking phenotypes are also underway. Analyses of smoking-gene interaction and gene-gene interaction, as well as pathway analyses, in relation to lung cancer risk are planned in collaboration with other groups within the International Lung Cancer Consortium (ILCCO). In addition, new studies involving key exposures (smoking, alcohol, caffeine and others), new traits (such as melatonin levels) and additional malignancies (lymphomas, bladder cancer, and others) are being pursued using available data and through extended collaborations. Integrative studies incorporating the results of new genomic technologies (methylation, transcriptome, sequencing, etc.) are in progress. Data from new exposure technologies (metabolomics, microbiome, and cytokine profiling) will be the next to be integrated, and future studies incorporating Mendelian randomization are anticipated.