An important challenge in molecular epidemiology is to develop methods for studying multiple genes and multiple environmental factors as interacting contributors to disease. The overall goal of this proposal is to develop novel approaches to this problem, using folate metabolism as an important candidate pathway in colorectal carcinogenesis and as a prototype for other candidate biochemical and metabolic pathways, and to apply them to several datasets on colorectal cancer and colorectal adenomas. We have the following specific aims:

1. Develop a modeling framework incorporating biological understanding about the structure of a complex pathway, illustrated by folate metabolism. This will involve extending our differential equations model for folate metabolism, using the predictions of this in silico model to derive prior covariates for our hierarchical modeling framework, and developing an integrated approach that allows for model uncertainty. These methods will be illustrated and contrasted with purely exploratory methods using data from the Colon Cancer Family Registry (C-CFR) folate study, an adenoma case-control study, and a randomized adenoma prevention trial.

2. Extend this framework to exploit biomarker measurements of intermediate metabolite concentrations and enzyme activity rates on a subsample of subjects. We will develop analytical methods and explore optimal sampling schemes that stratify on various combinations of disease, exposure, and genotypes available for the entire study population. These approaches will be illustrated with already available data on homocysteine levels in the adenoma case-control study, additional biomarkers that will be measured in a pending C-CFR grant application, and two longitudinal studies.

3. Organize biological knowledge for the folate pathway into a formal ontology. We will develop systematic methods for extracting prior covariates from an ontology for use in our hierarchical modeling framework, and will explore ways of incorporating information on the evolution of pathways across species.

4. Extend this candidate pathway approach to the genome-wide scale. We will explore methods for inferring pathways from genome-wide data and for using genome-wide association data to inform pathway-based analyses. These methods will be applied to data from the ongoing C-CFR GWAS.

The four example datasets we propose were chosen in part to illustrate a range of designs, including family-based, population-based case-control, longitudinal, and randomized trial.
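
The abstract does not specify the form of the hierarchical modeling framework. As a rough illustration of how prior covariates can enter such a model, the sketch below assumes a generic two-level setup: first-stage effect estimates `b_hat` with standard errors `se` for each genetic variant, and a second-stage model in which the true effects are normally distributed around a linear function of a single prior covariate `z` (for example, a pathway-model prediction) with a fixed variance `tau2`. All function names, variable names, and numeric values are hypothetical and are not taken from the project.

```python
from typing import List, Sequence, Tuple


def weighted_line_fit(
    z: Sequence[float], y: Sequence[float], w: Sequence[float]
) -> Tuple[float, float]:
    """Weighted least squares fit of y ~ intercept + slope * z."""
    sw = sum(w)
    z_bar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    szz = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, z))
    szy = sum(wi * (zi - z_bar) * (yi - y_bar) for wi, zi, yi in zip(w, z, y))
    slope = szy / szz
    intercept = y_bar - slope * z_bar
    return intercept, slope


def shrink_effects(
    b_hat: Sequence[float],
    se: Sequence[float],
    z: Sequence[float],
    tau2: float,
) -> List[float]:
    """Shrink first-stage estimates toward a prior mean predicted from z.

    Assumed model (illustrative only):
        b_hat[j] ~ Normal(beta[j], se[j]**2)
        beta[j]  ~ Normal(a + b * z[j], tau2)
    with tau2 treated as known rather than estimated.
    """
    # Marginally, b_hat[j] has variance se[j]**2 + tau2, so weight by its inverse.
    weights = [1.0 / (s ** 2 + tau2) for s in se]
    a, b = weighted_line_fit(z, b_hat, weights)

    shrunk = []
    for bj, sj, zj in zip(b_hat, se, z):
        prior_mean = a + b * zj
        # Fraction of the first-stage estimate retained in the posterior mean.
        keep = tau2 / (tau2 + sj ** 2)
        shrunk.append(keep * bj + (1.0 - keep) * prior_mean)
    return shrunk


if __name__ == "__main__":
    # Hypothetical log odds ratios, standard errors, and prior covariate values.
    estimates = [0.30, -0.10, 0.45, 0.05]
    std_errors = [0.20, 0.25, 0.30, 0.15]
    prior_covariate = [1.0, 0.0, 2.0, 0.5]
    print(shrink_effects(estimates, std_errors, prior_covariate, tau2=0.02))
```

In this sketch, noisier estimates (larger `se`) are pulled more strongly toward the value predicted from the prior covariate, and a smaller `tau2` increases the shrinkage for every variant. A full analysis would estimate `tau2` and allow for several prior covariates and for model uncertainty, none of which is attempted here.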

Public Health Relevance

The overall goal of this project is to develop statistical methods for modeling the effects of metabolic pathways involving multiple interacting genes and environmental factors on complex human traits, incorporating biomarker measurements and leveraging ontologies generated from external biological and genomic information. Our methods will initially be developed for candidate gene studies, and later extended to a genome-wide scale. They will be applied to observational and experimental studies of folate metabolism in relation to colorectal adenomas and colorectal cancer and are expected to yield improved models for predicting individual disease risks and insight into their underlying biological mechanisms.

National Institutes of Health (NIH)
National Institute of Environmental Health Sciences (NIEHS)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-PSE-B (02))
Program Officer
Mcallister, Kimberly A
University of Southern California
Public Health & Prev Medicine
Schools of Medicine
Los Angeles
United States
Thomas, Duncan C (2017) Estimating the Effect of Targeted Screening Strategies: An Application to Colonoscopy and Colorectal Cancer. Epidemiology 28:470-478
Thomas, Duncan C (2017) What Does "Precision Medicine" Have to Say About Prevention? Epidemiology 28:479-483
Pereira, Miguel; Thompson, John R; Weichenberger, Christian X et al. (2017) Inclusion of biological knowledge in a Bayesian shrinkage model for joint estimation of SNP effects. Genet Epidemiol 41:320-331
Manrai, Arjun K; Cui, Yuxia; Bushel, Pierre R et al. (2017) Informatics and Data Analytics to Support Exposome-Based Discovery for Public Health. Annu Rev Public Health 38:279-294
Su, Yu-Chen; Gauderman, William James; Berhane, Kiros et al. (2016) Adaptive Set-Based Methods for Association Testing. Genet Epidemiol 40:113-22
Temamogullari, N Ezgi; Nijhout, H Frederik; Reed, Michael C (2016) Mathematical modeling of perifusion cell culture experiments on GnRH signaling. Math Biosci 276:121-32
Salomon, Matthew P; Li, Wai Lok Sibon; Edlund, Christopher K et al. (2016) GWASeq: targeted re-sequencing follow up to GWAS. BMC Genomics 17:176
Reed, Michael C; Gamble, Mary V; Hall, Megan N et al. (2015) Mathematical analysis of the regulation of competing methyltransferases. BMC Syst Biol 9:69
Chen, Gary K; Chi, Eric C; Ranola, John Michael O et al. (2015) Convex clustering: an attractive alternative to hierarchical clustering. PLoS Comput Biol 11:e1004228
Hsu, Li; Jeon, Jihyoun; Brenner, Hermann et al. (2015) A model to determine colorectal cancer risk using common genetic susceptibility loci. Gastroenterology 148:1330-9.e14

Showing the most recent 10 out of 59 publications