This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
Cardiovascular disease (CVD) is an enormous public health problem that affects over 79 million people in the United States (NHLBI morbidity and mortality chartbook, 2007) and is the leading cause of death worldwide. Identifying the genetic determinants of CVD can lead to more effective diagnostics, prognostics, therapeutics, and, ultimately, preventive strategies, but strategies for identifying those genes in a way that leads to true insights into the pathogenesis of CVD are difficult to implement, expensive, or both. Strategies are needed that take advantage of individuals tracked longitudinally prior to the onset of disease in order to determine the risk factor profiles that predispose to disease. Identifying the gene(s) that contribute to these risk profiles will then lead to direct insights into not only the mechanisms through which the genes cause CVD, but also the clinical profiles that signal genetically mediated pathogenic susceptibility and the early onset of disease. Dynamic complex traits (DCTs), quantitative phenotypes measured over time, are influenced by the interplay of multiple genetic and environmental factors. The characterization and analysis of DCTs in longitudinal contexts can offer insights into disease pathogenesis that are not achievable in other study designs and research settings. For example, the often-used case-control study design is problematic in that it considers data collected at a single time point and hence cannot accommodate temporal variations in the risk factors that predispose to disease.
In addition, data on DCTs are rarely collected at uniform time points and often contain missing values, making analysis problematic. Consider, for example, repeated blood pressure measurements collected during a particular acute drug challenge, during a clinical trial, or as a patient is tracked in the clinic. Such longitudinal blood pressure data can provide important information about cardiovascular disease risk, but, at least in clinical care settings, they are collected haphazardly according to the patient's needs and clinical care schedule. As a result, statistical methods for the assessment of DCTs and their influence on disease risk and pathogenesis are numerous, but many are known to be inappropriate or to perform poorly in certain situations (e.g., situations involving non-uniform time points), or do not fully utilize all available data. In this research, both traditional and novel statistical models and frameworks will be evaluated and applied to actual data to determine their advantages and disadvantages in practical settings. The novel methodology to be considered involves assessing the similarity of the longitudinal profiles exhibited by the subjects in a sample. The first step involves modeling the dynamic trait using non-parametric functions (curves) fitted to all available data. Note that the data do not have to be collected at uniform time points across the subjects or have a standard number of measurements. In the second step, the dissimilarity (or distance) between each pair of individuals' fitted functions is calculated and related to genetic and environmental factors via a multivariate distance matrix regression (MDMR) method. This approach accounts for the uncertainty of the fitted functions, can accommodate weighting factors, and can be extended to multivariate analysis settings.
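The two steps described above (curve fitting on irregular time points, then relating curve-to-curve distances to predictors) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the project's implementation: the Gaussian kernel smoother, the root-mean-square curve distance, the permutation test, the function names, and the simulated blood-pressure-like data. The sketch also omits the handling of fitted-curve uncertainty and the weighting factors mentioned above.

```python
import numpy as np

def smooth_curve(times, values, grid, bandwidth=1.0):
    """Kernel (Nadaraya-Watson) smoother evaluated on a common time grid."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Gaussian weights: one row per grid point, one column per observation.
    w = np.exp(-0.5 * ((grid[:, None] - times[None, :]) / bandwidth) ** 2)
    return (w @ values) / w.sum(axis=1)

def curve_distance_matrix(curves):
    """Root-mean-square distance between every pair of fitted curves."""
    diff = curves[:, None, :] - curves[None, :, :]
    return np.sqrt((diff ** 2).mean(axis=2))

def gower_center(D):
    """Gower-centered matrix G = C (-D^2 / 2) C, with C = I - 11'/n."""
    n = D.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n
    return C @ (-0.5 * D ** 2) @ C

def pseudo_f(G, X):
    """Pseudo-F statistic from the hat matrix of the design matrix X."""
    n, m = X.shape
    H = X @ np.linalg.pinv(X.T @ X) @ X.T
    R = np.eye(n) - H
    explained = np.trace(H @ G @ H) / (m - 1)
    residual = np.trace(R @ G @ R) / (n - m)
    return explained / residual

def mdmr(D, predictors, n_perm=999, seed=0):
    """Distance matrix regression with a permutation p-value."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = np.column_stack([np.ones(n), predictors])  # add an intercept
    G = gower_center(D)
    f_obs = pseudo_f(G, X)
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permute subjects in G while keeping the predictors fixed.
        if pseudo_f(G[np.ix_(p, p)], X) >= f_obs:
            exceed += 1
    return f_obs, (exceed + 1) / (n_perm + 1)

# Simulated example (hypothetical data): 40 subjects, each measured at a
# different number of irregular time points; the slope depends on genotype.
rng = np.random.default_rng(1)
grid = np.linspace(0, 10, 50)
genotype = rng.integers(0, 3, size=40)
curves = []
for g in genotype:
    k = rng.integers(4, 10)                      # 4 to 9 measurements
    t = np.sort(rng.uniform(0, 10, size=k))      # non-uniform time points
    y = 120 + 1.5 * g * t + rng.normal(0, 2, size=k)
    curves.append(smooth_curve(t, y, grid))
curves = np.array(curves)

D = curve_distance_matrix(curves)
f_stat, p_value = mdmr(D, genotype)
print(f"pseudo-F = {f_stat:.2f}, permutation p = {p_value:.3f}")
```

Because every subject's curve is evaluated on the same grid, subjects with different numbers and timings of measurements become directly comparable, which is the property the text emphasizes. Other smoothers or distance measures could be substituted without changing the regression step.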
We apply this novel methodology, as well as traditional methodologies (implemented in R and Python), to data from three clinical studies investigating DCTs (hand vein distension, body mass index, serum lipids, insulin, LDL, HDL, total cholesterol, and blood pressure) influencing and/or contributing to hypertension susceptibility. These three studies vary in their complexity and showcase the range of settings and problems involved in the analysis of DCTs. The most complex of the three studies involves the analysis of actual clinic-derived longitudinal medical records and offers a valuable and unique perspective for studying hypertension in a natural clinical setting. The use of longitudinal medical records of this type in research poses considerable challenges, which have been collectively labeled the Longitudinal Unstructured Clinical Information (LUCI) Problem, since the data have often been collected in a very haphazard manner. Finally, a fourth study is used to examine the issues and challenges of analyzing DCTs in a genome-wide association (GWA) setting.
SPECIFIC AIMS
The research to be pursued in this project aims to utilize, compare, and extend analysis approaches to the study of DCTs in unrelated individuals and in settings of relevance to both disease pathogenesis and pharmacogenetics. The overall framework will consist of comparing traditional data analysis methods (ANOVA/General Linear Models, Marginal Models/GEE, Mixed Effects Models, Generalized Linear Models, Spline Regression, etc.) with novel similarity analysis and curve fitting analysis methods on three existing clinical datasets: (i) a clinical investigation of drug infusion into the dorsal hand vein and the resulting response; (ii) a clinical trial of kidney disease progression (the African American Study of Kidney Disease and Hypertension [AASK]); and (iii) clinical/medical data collected as part of the San Diego Veterans Affairs Anti-Hypertension Pharmacogenetics (VAAP) cohort study. These three studies vary widely in their sophistication and in the degree to which complexities arise in the DCTs collected as part of them. The fourth study, the Bogalusa Heart Study, is a long-term epidemiologic program concerned with the early natural history of CVD in children aged 5 to 17, with up to eight follow-up measurements and GWA data on approximately 500,000 SNPs.
The specific aims for this project are: I. Develop and test, via simulation studies, novel non-parametric analysis tools based on curve fitting functions and similarity analyses to assess DCTs in a wide variety of study settings. II. Apply the methodology developed in Specific Aim I to datasets (i-iii), which vary in DCT complexity and measurement structure: (i) the hand vein study, which involved structured repeated measurements in an inpatient setting; (ii) the AASK study, which provides structured clinical trial data; and (iii) the VAAP study, which provides unstructured longitudinal medical records data. III. Compare the results of applying the methodology developed in Specific Aim I with results obtained from traditional longitudinal data analysis methodologies, when applicable. IV. Develop and explore strategies for the efficient analysis of DCTs, using both novel and traditional methods, in genome-wide studies. V. Examine issues that demand further consideration in the context of the LUCI Problem posed by the use of medical record data collected during natural patient/physician interactions. These will include traditional epidemiological problems such as confounding, bias, and effect modification. This work anticipates a shift from case-control to longitudinal cohort study designs in genetics research. Ultimately, the research to be pursued has the potential not only to shed light on the most appropriate approach to the analysis of DCTs, but also to help promote their use in the study of hypertension and other common, complex diseases.