The efforts of the Human Genome Project are beginning to provide important findings for human health. Technological advances in the laboratory, particularly in characterizing human genomic variation, have created new approaches for studying the human genome. However, current statistical and computational strategies are taking only partial advantage of this wealth of information. In the quest for disease susceptibility genes for common, complex disease, we are faced with many challenges. Selecting the genetic, clinical, and environmental factors important for the trait of interest is increasingly difficult as high-throughput data generation technologies are developed. We know that genes do not act in isolation; thus, numerous other factors are likely important in complex disease phenotypes. However, techniques for robust statistical modeling of important variables to predict clinical outcomes are limited in their ability to capture interaction effects. Ultimately, we want to know which factors are important in order to provide superior prevention, diagnosis, and treatment of human disease. Unfortunately, meaningful interpretation of statistical models for biomedical research has been lacking due to the inherent difficulty in making such connections. Thus, a technology that embraces the complexity of human disease and integrates multiple data sources, including biological knowledge from the public domain, through a powerful analytical framework is essential for dissecting the architecture of common diseases. ATHENA, the Analysis Tool for Heritable and Environmental Network Associations, is a novel framework that incorporates variable selection, modeling, and interpretation to learn more about diseases of public health interest. As the field gains experience in analyzing large-scale genomic data, it is crucial that we learn from each other and develop and codify the best strategies.
Many common, complex diseases are likely due to a combination of genetic and environmental risk factors. Our ability to extract all of the meaningful information from very large genomic and phenotypic datasets has been limited by our analytic strategies. The methodology described in this proposal is a powerful new approach to maximize the information learned from large datasets to improve prevention, diagnosis, and treatment of diseases of public health interest.