Methods for Epidemiology Studies

Chatterjee, Nilanjan

Abstract

The potential of using data from current and future genome-wide association studies (GWAS) to improve the performance of models for predicting disease risk was investigated. A new mathematical paradigm was developed to characterize the predictive performance of polygenic models in terms of the sample size of the training dataset, the number of underlying susceptibility loci, and the distribution of their effect sizes. The paradigm was then applied to project the performance of risk prediction models for ten different complex traits, including cancers. These projections revealed that extremely large future GWAS, with sample sizes an order of magnitude larger than even some of the largest GWAS to date, would be needed to build genetic risk models with substantially improved predictive performance. A new method was developed for assessing gene-environment interactions using data from case-control genome-wide association studies that use publicly available genetic controls. It was shown that, under a set of assumptions, it is possible to characterize joint gene-environment effects from such studies if data on environmental exposures are available from an internal case-control study, even if the controls in that study are not genotyped. New methods were developed for evaluating the association of SNP markers with disease outcomes of an ordinal nature that reflect various stages in the progression of a disease. Two alternative tests, the maximum score test (MAX) and the adaptive P-value combination test (Adapt-P), were proposed with the aim of striking a balance between efficiency and robustness over the possible alternative models by which a SNP might be involved in the various stages. Simulation studies were used to demonstrate that MAX and Adapt-P have the most robust performance among a range of tests under various realistic scenarios.
A permutation-based resampling method was developed for using metabolomic data to test the hypothesis that the effect of an exposure (e.g., smoking) on the risk of a disease (e.g., lung cancer) is mediated through intermediate biomarkers. Extensive simulation studies were used to examine the validity and power of the proposed test. Methods were developed for the analysis of population-based case-control studies with complex sampling designs. Two methods were developed for incorporating the information contained in the sample weights by modeling the sample expectation of the weights conditional on design variables. These methods have higher efficiency and smaller finite-sample bias than the standard estimators that use the original sample weights. The methods were applied to the U.S. Kidney Cancer Case-Control Study to identify risk factors. A linear-expit regression model (LEXPIT) was developed to incorporate linear and nonlinear risk effects when estimating absolute risk from studies of a binary outcome. The LEXPIT is a generalization of both the binomial linear and logistic regression models. The coefficients of the LEXPIT linear terms estimate adjusted risk differences, while the exponentiated coefficients of the nonlinear terms estimate residual odds ratios. The LEXPIT could be particularly useful for epidemiological studies of risk association, where adjustment for multiple confounding variables is common. The method was applied to estimate the absolute five-year risk of cervical precancer or cancer associated with different Pap and human papillomavirus (HPV) test results in 167,171 women undergoing screening at Kaiser Permanente Northern California. The LEXPIT model found an increased risk due to an abnormal Pap test in HPV-negative women that was not detected with logistic regression. An R package, blm, was developed to provide free and easy-to-use software for fitting the LEXPIT model.
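
The interpretation of the LEXPIT coefficients described above can be illustrated with a minimal Python sketch. It assumes the model takes the form risk = x'beta + expit(z'gamma), with a linear part and an expit (inverse-logit) part. The function names and coefficient values are hypothetical, chosen only for illustration. They are not taken from the study or from the blm package, and the sketch evaluates a given model rather than fitting one.

```python
import math

def expit(t):
    """Inverse-logit function: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def lexpit_risk(x, beta, z, gamma):
    """Absolute risk under an assumed LEXPIT form: x'beta + expit(z'gamma)."""
    linear_part = sum(xi * bi for xi, bi in zip(x, beta))
    expit_part = expit(sum(zi * gi for zi, gi in zip(z, gamma)))
    return linear_part + expit_part

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients (illustration only, not estimates from any study).
beta = [0.05]         # linear term: one binary exposure
gamma = [-3.0, 0.7]   # expit term: intercept and one binary covariate

# Changing the linear-term exposure from 0 to 1, with the expit part held
# fixed, shifts the absolute risk by exactly beta[0]: a risk difference.
risk_unexposed = lexpit_risk([0], beta, [1, 0], gamma)
risk_exposed = lexpit_risk([1], beta, [1, 0], gamma)
risk_difference = risk_exposed - risk_unexposed

# Changing the expit-term covariate from 0 to 1 multiplies the odds of the
# expit component (the residual risk) by exp(gamma[1]): a residual odds ratio.
residual_0 = expit(gamma[0])
residual_1 = expit(gamma[0] + gamma[1])
residual_odds_ratio = odds(residual_1) / odds(residual_0)

print(risk_difference, residual_odds_ratio)
```

Because the linear part enters the risk additively, its coefficients read directly on the absolute risk scale, while the expit part keeps the familiar odds-ratio interpretation of logistic regression for the remaining covariates.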

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIACP010181-11
Application #: 8763630
Support Year: 11
Fiscal Year: 2013
Total Cost: $3,179,533

Institution

Name: Division of Cancer Epidemiology and Genetics

Publications

Zhang, Cuilin; Hediger, Mary L; Albert, Paul S et al. (2018) Association of Maternal Obesity With Longitudinal Ultrasonographic Measures of Fetal Growth: Findings From the NICHD Fetal Growth Studies-Singletons. JAMA Pediatr 172:24-31

Katki, Hormuzd A; Greene, Mark H; Isabel Achatz, Maria (2018) Testing Positive on a Multigene Panel Does Not Suffice to Determine Disease Risks. J Natl Cancer Inst 110:797-798

Sampson, Joshua N; Boca, Simina M; Moore, Steven C et al. (2018) FWER and FDR control when testing multiple mediators. Bioinformatics 34:2418-2424

Katki, Hormuzd A; Kovalchik, Stephanie A; Petito, Lucia C et al. (2018) Implications of Nine Risk Prediction Models for Selecting Ever-Smokers for Computed Tomography Lung Cancer Screening. Ann Intern Med 169:10-19

Katki, Hormuzd A; Schiffman, Mark (2018) A novel metric that quantifies risk stratification for evaluating diagnostic tests: The example of evaluating cervical-cancer screening tests across populations. Prev Med 110:100-105

Cheung, Li C; Katki, Hormuzd A; Chaturvedi, Anil K et al. (2018) Preventing Lung Cancer Mortality by Computed Tomography Screening: The Effect of Risk-Based Versus U.S. Preventive Services Task Force Eligibility Criteria, 2005-2015. Ann Intern Med 168:229-232

Anderson, William F; Rabkin, Charles S; Turner, Natalie et al. (2018) The Changing Face of Noncardia Gastric Cancer Incidence Among US Non-Hispanic Whites. J Natl Cancer Inst 110:608-615

Kant, Ashima K; Graubard, Barry I (2018) A prospective study of frequency of eating restaurant prepared meals and subsequent 9-year risk of all-cause and cardiometabolic mortality in US adults. PLoS One 13:e0191584

Boca, Simina M; Pfeiffer, Ruth M; Sampson, Joshua N (2017) Multivariate meta-analysis with an increasing number of parameters. Biom J 59:496-510

Gail, Mitchell H (2017) The prediction impact curve is proportional to the proportion of cases followed (letter commenting: J Clin Epidemiol 2016;69:361-363). J Clin Epidemiol 85:70

Showing the most recent 10 out of 182 publications
