As the Xenopus community moves into the post-genomic era, it is critical that complete gene models and regulatory elements are rapidly and accurately defined and then made readily accessible to the community. Currently, we and others have estimated that, in X. tropicalis, only 36% of transcripts have 5' UTRs with an accompanying transcription start site and only 37% have 3' UTRs associated with the poly(A) tail. A similar situation is probably true in X. laevis. Additionally, very little is known about promoters, enhancers and other regulatory elements in either species, especially when considering their broad temporal and spatial usage. The enormous utility of mapping complete gene bodies and defining elements genome-wide was clearly illustrated by the ENCODE findings published widely in September 2012. As the Xenopus community is just beginning to observe and then curate its genomes, we can harness the advances in technology driven by these other genome efforts. Unlike ENCODE, for which large laboratories and resources had to be mobilized, decreasing costs and advancing technologies allow similar efforts in Xenopus to be achieved by small labs and at a much reduced cost. The Xenopus community understands the immediate need for rigorous genome annotation. This was clearly echoed at the Resources Meeting at the International Xenopus Meeting in France (Sept. 2012), where genome annotation was confirmed as the highest priority after renewal of the European Xenopus Stock Center. In this grant, we propose to classify genomic elements in Xenopus throughout development and to establish an interactive database through which these elements can be accessed. Specifically, our goal is to biochemically elucidate 5' UTRs, 3' UTRs, promoters, enhancers and long non-coding RNAs. These datasets will allow rapid and efficient curation of gene models and regulatory elements in both X. tropicalis and X. laevis. 
Significantly, we will also establish XenMine, a genomic interaction tool that allows researchers to directly interface with genomic datasets from these efforts and from the community. This effort is one branch of a larger interactome whose overall goal is to compile, annotate and then disseminate genomic data from both X. laevis and X. tropicalis. While we will directly interface with multiple efforts, our role is to utilize biochemistry to 'fill a large gap' in the genomic repertoire of Xenopus to enable accurate mapping of gene models and gene regulatory elements.
Xenopus (both the X. tropicalis and X. laevis species) is a powerful model system for studying a variety of biological questions and understanding human disease. However, the shortage of genomic resources and tools limits many, if not all, researchers in the Xenopus community. In this proposal, we will apply cutting-edge technologies to better define gene structures and the genomic elements that regulate gene expression, and provide a powerful, interactive database, XenMine, that allows users to analyze large-scale genomic data with ease.