We propose to determine the genome sequences of four Saccharomyces species (Aim 1) in order to identify conserved, and therefore likely functional, features in their DNA sequences. Rather than focusing on proteins and protein-coding DNA (a common emphasis of such studies), we seek to find hidden functional features in non-coding sequence (Aim 2), such as sequences that regulate gene expression, maintain chromosomes, or encode non-protein-coding RNAs. Because these kinds of sequences evolve rapidly, sequences of relatively closely related species need to be compared. We have concluded that aligning S. cerevisiae DNA sequence to that of at least 3 other Saccharomyces species (i.e., 4-way alignments) will be necessary to observe conserved blocks of sequence that are significant. Whole-genome shotgun sequencing (2-3-fold coverage) of the genomes of 4 different species, followed by limited filling of gaps in the 4-way alignments, appears to be the most efficient and economical way to acquire the necessary data. We will assess our ability to recognize functional sequence elements from their conservation by testing the function of some of the conserved sequences (Aim 3) with experiments designed to identify transcriptional regulatory sequences and the proteins that bind to them, as well as non-protein-coding RNAs. Yeasts are excellent organisms for these studies because 1) their relatively small genomes make the project cost-effective, 2) a wealth of information on yeast is available for assessing the significance of conserved sequences, 3) incisive experiments can be designed to test the biological function of conserved sequences, and 4) the data will benefit the large community of scientists studying this important model organism. We hope to contribute to the development of comparative DNA sequence analysis as an approach to annotating genomes, and to learn how to use it effectively.
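The idea of reading conserved blocks off a 4-way alignment can be illustrated with a minimal sketch. It assumes the four sequences have already been aligned to equal length with `-` as the gap character, and it treats a block as a run of columns that are identical and ungapped in every species. That strict-identity rule, the function name `conserved_blocks`, the `min_len` threshold, and the toy sequences are all illustrative assumptions; the abstract does not specify how significance of a block is judged.

```python
from typing import List, Tuple


def conserved_blocks(aligned: List[str], min_len: int = 6) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges that are identical
    and gap-free across every row of a pre-computed alignment.

    Hypothetical helper: strict identity stands in for whatever
    significance criterion a real analysis would apply.
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned sequences must have the same length")

    blocks: List[Tuple[int, int]] = []
    start = None  # first column of the run currently being extended
    for col in range(width):
        column = {row[col].upper() for row in aligned}
        conserved = len(column) == 1 and "-" not in column
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and width - start >= min_len:
        blocks.append((start, width))
    return blocks


# Toy 4-way alignment (made-up sequences, not real Saccharomyces data).
alignment = [
    "ACGTTGACGTCA-TTAGC",
    "ACCTTGACGTCAGTTAGC",
    "AGGTTGACGTCAGTAAGC",
    "ACGTTGACGTCAATTAGC",
]

if __name__ == "__main__":
    for start, end in conserved_blocks(alignment, min_len=6):
        print(start, end, alignment[0][start:end])
```

With four rows, a column is less likely to be identical by chance than with two, which is the intuition behind requiring several species before calling a short non-coding block conserved.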

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM063803-04
Application #
6772468
Study Section
Genome Study Section (GNM)
Program Officer
Tompkins, Laurie
Project Start
2001-08-01
Project End
2006-07-31
Budget Start
2004-08-01
Budget End
2006-07-31
Support Year
4
Fiscal Year
2004
Total Cost
$260,037
Indirect Cost
Name
Washington University
Department
Genetics
Type
Schools of Medicine
DUNS #
068552207
City
Saint Louis
State
MO
Country
United States
Zip Code
63130
Ho, Su-Wen; Jona, Ghil; Chen, Christina T L et al. (2006) Linking DNA-binding proteins to their recognition sequences by using protein microarrays. Proc Natl Acad Sci U S A 103:9940-5
McCutcheon, John P; Eddy, Sean R (2003) Computational identification of non-coding RNAs in Saccharomyces cerevisiae by comparative genomics. Nucleic Acids Res 31:4119-28
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin et al. (2003) Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science 301:71-6