An enduring impediment in translating genomic advances into biomedical solutions has been the lack of tools and techniques that enable biologists to [a] efficiently leverage the multitude of publicly available genome variation data in their research endeavors, and [b] effectively harness the long-term (inter-specific) evolutionary histories of mutant positions in diagnosing functional effects of novel mutations. This need has become more acute with the discovery of unprecedented numbers of novel mutations in personal genomes and population surveys. Therefore, we propose an integrated research and development project to address this need. First, we plan to develop unique, user-friendly, and robust software to investigate human mutations in the context of Long-Term Evolutionary (LTE) patterns on a genomic scale; LTE patterns are revealed by inter-specific comparisons at a position, and they provide sound baseline hypotheses for analyzing the nature of mutations and frequencies of contemporary variations. The proposed myPEG (Population Evolutionary Genomics) software will contain tools for automated data assembly and integration from primary genome alignment browsers and mutation databases (e.g., UCSC, 1000 Genomes, dbSNP). myPEG will enable users to conduct integrative analysis across taxonomic scales via its cross-platform WebTop display and analysis framework that will seamlessly integrate species and population sequence alignments and analyses in traditional and novel ways. myPEG's approach to software design and development will be biologist-centric: we emulate, rather than reinvent, biologists' favorite work practices. These software developments will be informed by the proposed fundamental research to develop direct applications of macro-evolutionary patterns to the diagnosis of mutations associated with disease (e.g., Mendelian, complex, and somatic-cell mutations), and the successes of their computational predictions using in silico tools.
The proposed investigations will reveal similarities and differences in the evolutionary anatomies of disease-associated and other mutations (including population SNPs), as well as in the success rates of all major in silico tools currently used for diagnosing functional effects of novel mutations. These discoveries will form the basis for developing a decision support system to choose the best in silico method for the type of mutation and purpose (type of disease), such that the Reliability of in silico Inference (RoI) is highest. myPEG will contain this decision support system, along with facilities for prototyping and conducting high-throughput iterative analysis of large numbers of mutations. myPEG will run on all major platforms (Windows, Linux, and MacOS), will be usable as a plug-in in analysis pipelines natively in these operating systems, and will be available at no cost (including the source code) to all users, including those in research, education, and training.
The biological and biomedical research community at large needs user-friendly computational tools to translate the wealth of genomic data into useful information and solutions. Therefore, we will develop biologist-centric software to explore, integrate, analyze, and diagnose human mutation data in the context of the evolutionary history of mutation positions. Proposed innovative technological and research advances will enable scientists to harness the power of datasets in basic biomedicine, personalized medicine, personal genomics, and broader biological research.