The overarching goal of the proposed project is to develop computational methods and practical bioinformatics resources for data-driven, integrative analysis of expression images and sequence data to discover functional, genetic, and regulatory interactions between genes and genomic elements. Fast-growing collections of spatial and temporal gene expression patterns in the model organism Drosophila melanogaster are providing unprecedented opportunities for understanding the spatiotemporal regulation of expression, not only for fruit fly genes but also for human genes that show extensive evolutionary similarity and functional conservation. These expression patterns are the first links between a gene's primary sequence and its influence on the phenotype, and their overlaps provide initial clues to functional, genetic, or regulatory interactions. Therefore, our primary framework for translating large volumes of images into functional knowledge is to discover and analyze co-expressed (and, thus, potentially co-regulated) genes. To date, our efforts have led to the development and establishment of a unique and innovative image-based framework (FlyExpress) to carry out high-throughput analyses of these large datasets, because the standard practice of manually inspecting images is no longer feasible owing to the sheer volume of available images. We are now poised to address a growing and urgent need to develop computational tools and data-integration methods that enable effective harnessing of fast-growing image and sequence data and foster enhanced engagement of the research community in building the FlyExpress knowledgebase.
Therefore, we plan to (a) develop a new software tool to enable effective expression image analysis while advancing community collaborations, (b) translate knowledge of spatiotemporal expression overlap into the discovery of regulatory motifs by developing novel methods for integrative analysis of image and sequence data, and (c) evolve FlyExpress into a comprehensive knowledgebase of embryonic expression images in order to generate better predictions and integrative analysis across heterogeneous image sources. These developments will enable investigators to effectively generate and evaluate their gene interaction hypotheses based on overlaps in expression patterns by using all relevant biological information. The software tool and web system, including the source code, will always be freely available. The computational algorithms, statistical methods, and bioinformatics technologies developed in this project will be reconfigurable and adaptable for application in constructing similar frameworks for organizing expression pattern data from other species and life history stages. The FlyExpress system will fulfill the day-to-day needs of basic and applied researchers as well as students in many areas of molecular biology crucial in basic biomedicine, including computational genomics, molecular genetics, developmental biology, genetics, and evolution.
Investigations of model organisms are critical for understanding the spatiotemporal regulation of gene expression that results in alternative cell fates in a developing embryo and establishes the cellular precursors of adult tissues and organs. The proposed project will produce urgently needed computational methods and practical bioinformatics resources that enable scientists to carry out integrative, data-driven analysis of expression pattern images to discover functional, genetic, and regulatory interactions between genes and genomic elements. The proposed advances would lead to a more effective translation of gene expression (image) and genomics (sequence) data into functional knowledge of human and other animal genes.