This research aims to develop new approaches to the analysis of data from large cohort studies, whether epidemiologic studies or clinical trials, with many qualitatively different variables observed at several time points, with exploratory genetic marker data vectors whose length is on the order of thousands, with pedigree information, and with multiple correlated outcomes of interest.
The aim is to develop methods for discovering relevant components, or patterns of components, in long or ultra-long genetic marker data vectors as they interact with patterns or clusters of clinical, lifestyle, and environmental variables and, possibly, pedigrees, in order to suggest unusually high risk and correlations of interest for multiple outcomes. The proposed methods are an attempt to develop tools that merge so-called data mining approaches with penalized likelihood methods (some developed under prior grants) and that can generate hypotheses, which can then be examined more closely by classical parametric statistical methods and used by experimenters to formulate further hypotheses. The emphasis is on approaches that can contribute information to evidence-based and personalized medical decision making. Data from the Beaver Dam Eye Study will be used to examine the models under study for their reasonableness and for their ability to answer questions meaningful to the study scientists. The results will have broad applicability to other large epidemiological studies as well as to clinical trials, in particular those collecting local and genome-wide genetic marker information along with other, heterogeneous risk factors.

Public Health Relevance

Epidemiological and clinical studies take much of the credit for the dramatic improvement in public health and longevity over the last fifty years or so. Better understanding of the effects of lifestyle factors, clinical variables, treatment opportunities, and, more recently, genetic factors has come about as the result of both straightforward and sophisticated analysis of the data gleaned from these studies. With extensive data collection and complex data structures, as well as improved computational and software resources, there are opportunities to further develop and extend modern data analysis methods to better capture complex relations among variables that affect outcomes of important personal and public health interest. This project proposes to exploit these opportunities.

Agency: National Institutes of Health (NIH)
Institute: National Eye Institute (NEI)
Type: Research Project (R01)
Project #: 5R01EY009946-19
Application #: 8209127
Study Section: Special Emphasis Panel (ZRG1-HDM-G (02))
Program Officer: Everett, Donald F
Project Start: 1992-12-01
Project End: 2014-12-31
Budget Start: 2012-01-01
Budget End: 2014-12-31
Support Year: 19
Fiscal Year: 2012
Total Cost: $249,903
Indirect Cost: $77,103
Name: University of Wisconsin Madison
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 161202122
City: Madison
State: WI
Country: United States
Zip Code: 53715
Wahba, Grace (2010) Encoding Dissimilarity Data for Statistical Model Building. J Stat Plan Inference 140:3580-3596
Bravo, Hector Corrada; Lee, Kristine E; Klein, Barbara E K et al. (2009) Examining the relative influence of familial, genetic, and environmental covariate information in flexible risk models. Proc Natl Acad Sci U S A 106:8128-33
Shi, Weiliang; Wahba, Grace; Wright, Stephen et al. (2008) LASSO-Patternsearch algorithm with application to ophthalmology and genomic data. Stat Interface 1:137-153
Lu, Fan; Keles, Sunduz; Wright, Stephen J et al. (2005) Framework for kernel regularization with application to protein clustering. Proc Natl Acad Sci U S A 102:12332-7
Lee, Yoonkyung; Lee, Cheol-Koo (2003) Classification of multiple cancer types by multicategory support vector machines using gene expression data. Bioinformatics 19:1132-9
Carew, John D; Wahba, Grace; Xie, Xianhong et al. (2003) Optimal spline smoothing of fMRI time series by generalized cross-validation. Neuroimage 18:950-61
Wahba, Grace (2002) Soft and hard classification by reproducing kernel Hilbert space methods. Proc Natl Acad Sci U S A 99:16524-30
Wang, Y; Wahba, G; Gu, C et al. (1997) Using smoothing spline anova to examine the relation of risk factors to the incidence and progression of diabetic retinopathy. Stat Med 16:1357-76