Susceptibility to sporadic forms of cancer is determined by numerous genetic factors that interact in a nonlinear manner in the context of an individual's age and environmental exposure. This complex genetic architecture has important implications for the use of genome-wide association studies for identifying susceptibility genes. The assumption of a simple architecture supports a strategy of testing each single-nucleotide polymorphism (SNP) individually using traditional univariate statistics followed by a correction for multiple tests. However, the complex genetic architecture characteristic of most types of cancer requires analytical methods that specifically model combinations of SNPs and environmental exposures. While novel methods are available for modeling interactions, exhaustive testing of all combinations of SNPs is not feasible on a genome-wide scale because the number of combinations grows combinatorially with the number of SNPs and quickly becomes computationally intractable. Thus, it is critical that we develop intelligent strategies for selecting subsets of SNPs prior to combinatorial modeling. The objective of this renewal application is to continue the development of a research strategy for the detection, characterization, and interpretation of gene-gene and gene-environment interactions in genome-wide association studies of bladder cancer susceptibility. To accomplish this objective, we will continue developing and evaluating modifications and extensions to the ReliefF family of algorithms for selecting or filtering subsets of SNPs for multifactor dimensionality reduction (MDR) analysis of gene-gene and gene-environment interactions (AIM 1). We will continue developing and evaluating a stochastic wrapper or search strategy for MDR analysis of interactions that utilizes ReliefF values as a heuristic (AIM 2). We will continue to make ReliefF algorithms available as part of our open-source MDR software package (AIM 3). Finally, we will apply the best ReliefF-MDR analysis strategies to the detection, characterization, and interpretation of gene-gene and gene-environment interactions in large genome-wide association studies of bladder cancer susceptibility (AIM 4). We anticipate that the proposed machine learning methods will provide powerful new approaches for identifying genetic variations that are predictive of cancer susceptibility.
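To give a rough sense of why exhaustive combinatorial testing is impractical and why a filtering step helps, the following minimal sketch counts the SNP subsets of a given size before and after filtering. The panel size of 500,000 SNPs and the filtered subset of 1,000 SNPs are illustrative assumptions and do not come from the application; the function name is hypothetical.

```python
from math import comb


def n_combinations(n_snps: int, order: int) -> int:
    """Return the number of distinct SNP subsets of size `order`."""
    return comb(n_snps, order)


if __name__ == "__main__":
    # Assumed sizes, chosen only for illustration.
    genome_wide = 500_000   # hypothetical genome-wide SNP panel
    filtered = 1_000        # hypothetical subset kept after a filter step

    for order in (1, 2, 3):
        full = n_combinations(genome_wide, order)
        kept = n_combinations(filtered, order)
        print(f"{order}-way: genome-wide {full:.3e}, after filtering {kept:.3e}")
```

Under these assumed sizes, the pairwise search space alone shrinks from roughly 10^11 candidate models to fewer than 10^6, which is the kind of reduction a filter is meant to achieve before interaction modeling.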

Public Health Relevance

Technology for measuring the human genome is advancing rapidly. Despite these advances, computational methods for analyzing the resulting data have not kept pace. We will develop new computer algorithms and software that can be used to identify genetic biomarkers of common human diseases. We will then apply these new computational methods to identifying genetic biomarkers of bladder cancer in an epidemiological study from New Hampshire.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1-ZH-C (M3))
Program Officer: Ye, Jane
Dartmouth College
Internal Medicine/Medicine
Schools of Medicine
United States