Investigations have been conducted on the use of data from current genome-wide association studies to assess the genetic architecture of cancer and the likely yield of future genome-wide association studies. One project explored the distribution of allele frequencies and effect sizes, and their interrelationships, for common susceptibility SNPs using discoveries from existing genome-wide association studies. It used novel methods to correct for the bias that arises because variants with larger effect sizes are currently over-represented, owing to their greater statistical power for discovery. The analysis identified several intriguing patterns that have implications for the design and analysis of future genetic association studies. A second project explored the potential utility of discoveries from future, larger genome-wide association studies for building risk-prediction models that could be used to target high-risk groups for cancer screening. It found that although many discoveries are expected from future genome-wide association studies, risk-prediction models based only on the discovered SNPs are unlikely to identify a small segment of the population that would give rise to the large majority of future cases. Several projects involved developing statistical methods for exploring gene-gene and gene-environment interactions using data from genome-wide association studies. A new method was developed for modeling the interaction of an environmental exposure with multiple SNPs within a genomic region using a Bayesian latent variable modeling approach. Another method exploited an assumption of gene-environment independence in the underlying population to improve the power of tests for gene-environment interaction on the absolute risk of disease from case-control studies. A further report used simulation studies to investigate the power of various alternative methods for conducting genome-wide interaction scans.
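The limited concentration of cases under SNP-based risk models can be illustrated with a standard closed-form calculation (a sketch, not the project's actual method): if the log relative risk in the population is assumed Normal(0, sigma^2), then cases follow Normal(sigma^2, sigma^2) on the log scale, so the fraction of cases arising from the highest-risk fraction q of the population is Phi(sigma - z_{1-q}).

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse standard normal CDF by bisection (adequate for a sketch)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pcf(q, sigma):
    """Fraction of cases arising from the top-q risk fraction, assuming
    (as an illustration) log relative risk ~ Normal(0, sigma^2) in the
    population, so cases are Normal(sigma^2, sigma^2) on the log scale."""
    z = norm_ppf(1.0 - q)
    return norm_cdf(sigma - z)

# Even a sizeable polygenic spread (sigma = 1 on the log relative-risk
# scale) concentrates cases only modestly in the top decile of risk:
print(round(pcf(0.10, 1.0), 3))  # 0.389: top 10% of risk yields ~39% of cases
```

Under this illustrative model, identifying a small population segment that accounts for the large majority of cases would require a far larger sigma than common SNPs are expected to provide.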
General statistical methods
Several studies have been conducted to evaluate efficient design and analysis strategies for epidemiologic studies that use complex sampling designs. One study focuses on the efficient use of specimen repositories for evaluating new diagnostic tests and for comparing new tests with existing ones. Typically, all pre-existing diagnostic tests will already have been conducted on all specimens. It was proposed that retesting only a judicious subsample of the specimens with the new diagnostic test can minimize study costs and specimen consumption while retaining adequate statistical efficiency for estimates of agreement or diagnostic accuracy. Another project explored efficient analysis methods for case-cohort designs, which select a random sample of a cohort to serve as controls for the cases arising from follow-up of the cohort. Analyses of case-cohort studies with time-varying exposures that use Cox partial likelihood methods can be computationally intensive. A new, computationally simple method has been developed using a piecewise-exponential approach, in which Poisson regression model parameters are estimated from a pseudo-likelihood and the corresponding variances are derived by applying the Taylor linearization methods used in survey research. Several studies have involved developing regression models in settings with potentially large numbers of predictor variables. A Bayesian variable selection method has been developed for the setting in which the number of independent variables or predictors greatly exceeds the available sample size. Whereas most existing methods allow some degree of correlation among predictors but do not use these correlations for variable selection, the proposed method accounts for correlations among the predictors when selecting variables.
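The core of the piecewise-exponential approach can be sketched with a minimal example (illustrative data; this omits the pseudo-likelihood weighting and Taylor-linearized variances of the actual method): partition follow-up time into intervals, tabulate events and person-time in each, and note that the event count divided by person-time maximizes the Poisson likelihood for a piecewise-constant hazard.

```python
def piecewise_rates(follow_up, event, cuts):
    """Events d_j and person-time T_j per interval (cuts[j], cuts[j+1]];
    the rate d_j / T_j is the Poisson maximum-likelihood estimate of a
    piecewise-constant hazard."""
    J = len(cuts) - 1
    d = [0] * J
    T = [0.0] * J
    for t, e in zip(follow_up, event):
        for j in range(J):
            lo, hi = cuts[j], cuts[j + 1]
            if t <= lo:
                break  # no time contributed to this or later intervals
            T[j] += min(t, hi) - lo
            if e and lo < t <= hi:
                d[j] += 1
    return [dj / Tj if Tj > 0 else float("nan") for dj, Tj in zip(d, T)]

# Hypothetical follow-up times (years) and event indicators:
times = [0.5, 1.5, 2.0, 3.0, 4.5]
events = [1, 0, 1, 1, 0]
rates = piecewise_rates(times, events, cuts=[0.0, 2.0, 5.0])
print(rates)  # [0.25, 0.2857...]: 2 events / 8.0 py, then 1 event / 3.5 py
```

In the full method, each subject's person-time contributions become records in a Poisson regression with covariates, fit by pseudo-likelihood to account for the case-cohort sampling.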
The method can be applied to continuous, binary, ordinal, and count outcome data. Another method was proposed to combine several predictors (markers) that are measured repeatedly over time into a composite marker score without assuming a model, requiring only a mild condition on the predictor distribution. Assuming that the first and second moments of the predictors can be decomposed into a time component and a marker component via a Kronecker product structure that accommodates the longitudinal nature of the predictors, the method uses first-moment sufficient dimension reduction techniques to replace the original markers with linear transformations that contain sufficient information for the regression of the outcome on the predictors. These linear combinations can then be combined into a score that has better predictive performance than a score built under a general model that ignores the longitudinal structure of the data. The methods can be applied to either continuous or categorical outcome measures. Several studies have developed methodologies related to models for predicting the absolute risk of disease and their applications. One study developed two criteria to assess the usefulness of models that predict risk of disease incidence for screening and prevention, or the usefulness of prognostic models for management following disease diagnosis. The first criterion, the proportion of cases followed, PCF(q), is the proportion of individuals who will develop disease who are included in the proportion q of individuals in the population at highest risk. The second criterion, the proportion needed to follow-up, PNF(p), is the proportion of the general population at highest risk that one needs to follow in order that a proportion p of those destined to become cases will be followed. New methods of inference were developed to compare the PCFs and PNFs of two risk models that are evaluated on the same validation data.
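Both criteria have direct empirical counterparts on a validation set of risk scores and case indicators. A minimal sketch with made-up data (the study's actual contribution is the inference procedures for comparing models, which this omits):

```python
def pcf_pnf(risk, case, q, p):
    """Empirical PCF(q): fraction of cases found among the top-q-risk
    fraction of the population. Empirical PNF(p): smallest population
    fraction whose highest-risk members include a fraction p of cases."""
    order = sorted(range(len(risk)), key=lambda i: -risk[i])
    n, total_cases = len(risk), sum(case)
    cum = 0
    pcf_val = pnf_val = None
    for rank, i in enumerate(order, start=1):
        cum += case[i]
        if pcf_val is None and rank >= q * n:
            pcf_val = cum / total_cases
        if pnf_val is None and cum >= p * total_cases:
            pnf_val = rank / n
    return pcf_val, pnf_val

# Hypothetical validation data: risk scores and case indicators.
risk = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
case = [1,   1,   0,   1,   0,   0,   1,   0,   0,   0]
print(pcf_pnf(risk, case, q=0.2, p=0.75))  # (0.5, 0.4)
```

Here the top 20% of the population by risk contains half the cases, and 40% of the population must be followed to capture 75% of the cases.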
A second project developed a linear-expit regression model (LEXPIT) that incorporates linear and nonlinear risk effects to estimate absolute risk from studies of a binary outcome. The LEXPIT is a generalization of both the binomial linear and logistic regression models. The coefficients of the LEXPIT linear terms estimate adjusted risk differences, while the exponentiated nonlinear terms estimate residual odds ratios. The LEXPIT can be particularly useful for epidemiological studies of risk association, where adjustment for multiple confounding variables is common. The method was applied to estimate the absolute five-year risk of cervical precancer or cancer associated with different Pap and human papillomavirus (HPV) test results in 167,171 women undergoing screening at Kaiser Permanente Northern California. The LEXPIT model found an increased risk associated with an abnormal Pap test among HPV-negative women that was not detected with logistic regression. An R package, blm, was developed to provide free and easy-to-use software for fitting the LEXPIT model.
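The risk-difference interpretation of the linear terms can be sketched as follows, assuming the additive form risk = x'beta + expit(z'gamma); the coefficients below are hypothetical, and the blm R package is the actual fitting software.

```python
from math import exp

def expit(u):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-u))

def lexpit_risk(x, z, beta, gamma0, gamma):
    """Sketch of a LEXPIT-style absolute risk: linear covariates x enter
    additively (risk differences) and covariates z enter through an
    expit term (residual odds ratios). All coefficients hypothetical."""
    linear = sum(b * xi for b, xi in zip(beta, x))
    nonlinear = expit(gamma0 + sum(g * zi for g, zi in zip(gamma, z)))
    return linear + nonlinear

# The linear coefficient is an adjusted risk difference: raising x by 1
# changes the absolute risk by beta at any level of z.
beta, gamma0, gamma = [0.02], -3.0, [0.5]
rd = (lexpit_risk([1], [2.0], beta, gamma0, gamma)
      - lexpit_risk([0], [2.0], beta, gamma0, gamma))
print(round(rd, 6))  # 0.02
```

By contrast, in a logistic model the risk difference for the same covariate varies with the baseline risk, which is why an absolute-risk effect visible here can be missed by an odds-ratio analysis.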

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Investigator-Initiated Intramural Research Projects (ZIA)
Project #
1ZIACP010181-10
Application #
8565443
Study Section
Project Start
Project End
Budget Start
Budget End
Support Year
10
Fiscal Year
2012
Total Cost
$3,232,673
Indirect Cost
Name
Division of Cancer Epidemiology and Genetics
Department
Type
DUNS #
City
State
Country
Zip Code
Zhang, Cuilin; Hediger, Mary L; Albert, Paul S et al. (2018) Association of Maternal Obesity With Longitudinal Ultrasonographic Measures of Fetal Growth: Findings From the NICHD Fetal Growth Studies-Singletons. JAMA Pediatr 172:24-31
Katki, Hormuzd A; Greene, Mark H; Isabel Achatz, Maria (2018) Testing Positive on a Multigene Panel Does Not Suffice to Determine Disease Risks. J Natl Cancer Inst 110:797-798
Sampson, Joshua N; Boca, Simina M; Moore, Steven C et al. (2018) FWER and FDR control when testing multiple mediators. Bioinformatics 34:2418-2424
Katki, Hormuzd A; Kovalchik, Stephanie A; Petito, Lucia C et al. (2018) Implications of Nine Risk Prediction Models for Selecting Ever-Smokers for Computed Tomography Lung Cancer Screening. Ann Intern Med 169:10-19
Katki, Hormuzd A; Schiffman, Mark (2018) A novel metric that quantifies risk stratification for evaluating diagnostic tests: The example of evaluating cervical-cancer screening tests across populations. Prev Med 110:100-105
Cheung, Li C; Katki, Hormuzd A; Chaturvedi, Anil K et al. (2018) Preventing Lung Cancer Mortality by Computed Tomography Screening: The Effect of Risk-Based Versus U.S. Preventive Services Task Force Eligibility Criteria, 2005-2015. Ann Intern Med 168:229-232
Anderson, William F; Rabkin, Charles S; Turner, Natalie et al. (2018) The Changing Face of Noncardia Gastric Cancer Incidence Among US Non-Hispanic Whites. J Natl Cancer Inst 110:608-615
Kant, Ashima K; Graubard, Barry I (2018) A prospective study of frequency of eating restaurant prepared meals and subsequent 9-year risk of all-cause and cardiometabolic mortality in US adults. PLoS One 13:e0191584
Sakoda, Lori C; Henderson, Louise M; Caverly, Tanner J et al. (2017) Applying Risk Prediction Models to Optimize Lung Cancer Screening: Current Knowledge, Challenges, and Future Directions. Curr Epidemiol Rep 4:307-320
Kassahun-Yimer, Wondwosen; Albert, Paul S; Lipsky, Leah M et al. (2017) A joint model for multivariate hierarchical semicontinuous data with replications. Stat Methods Med Res :962280217738141

Showing the most recent 10 out of 182 publications