Cardiovascular disease (CVD) and its associated risk factors, such as hypertension and dyslipidemia, constitute a major public-health burden because of increased mortality and morbidity and rising health care costs. Large volumes of epidemiological data are needed to detect the small effects of many individual genes and the environment on these traits. However, the sample sizes needed to make powerful inferences may only be reached by integrating multiple epidemiological studies. Meaningful integration requires the development of data ontologies that allow information to be combined across studies in a way that maximizes the information content and hence the statistical power for detecting small effect sizes. A second, compounding problem is that the software applications that manage such study data are typically non-interoperable, i.e. "silos" of data, and cannot share data in a syntactically and semantically meaningful manner. Consequently, an infrastructure that integrates across studies in an interoperable manner is needed to ensure that epidemiological cardiovascular research remains a viable and major player in the biomedical informatics revolution currently underway. The cancer Biomedical Informatics Grid (caBIG™) is addressing these problems in the cancer domain by developing software systems that can exchange information, i.e. that are syntactically interoperable, and that access metadata semantically annotated using controlled vocabularies. Our overarching goal is to develop ontologies for integrating cardiovascular epidemiological data from multiple studies. Specifically, we propose three Aims: First, develop cardiovascular data ontologies and vocabularies for each of three disparate multi-center epidemiological studies that facilitate data integration across the studies and data mining for various phenotypes.
Second, adopt a technology infrastructure that leverages the cardiovascular data ontologies and vocabularies using Model Driven Architecture (MDA) and caBIG™ tools to facilitate the integration and widespread sharing of cardiovascular data sets. Third, facilitate seamless data sharing and promote widespread data dissemination among research communities cutting across clinical, translational and epidemiological domains, primarily through collaboration with the established CardioVascular Research Grid (CVRG).
Shimoyama, Mary; Nigam, Rajni; McIntosh, Leslie Sanders et al. (2012) Three ontologies to define phenotype measurement data. Front Genet 3:87