We propose to establish the ProPhylER (Protein Phylogeny and Evolutionary Rates) database as a web resource available to the general research community. The ProPhylER database will feature quantitative estimates of local constraint on orthologs and closely related paralogs. Estimates of constraint require robust, curated alignments and phylogenetic trees, which ProPhylER generates as part of its dataflow. Alignments, trees, and estimates of constraint will be made available via graphical interfaces on a web site that will be developed. In addition, because ProPhylER's trees provide the natural framework for relating protein-coding genes and their annotation across model organism and genome databases, reciprocal links to SGD, FlyBase, and the UCSC browser are proposed. We anticipate that ProPhylER will become a widely used tool for the analysis of functional constraint and phylogenetic relationships among related genes. Because ProPhylER's data are generated semiautomatically, with curation at the critical steps in the dataflow, they have a high degree of reliability that would be difficult for individual investigators working on a particular protein to reproduce. Thus, ProPhylER will initially be most useful to experimentalists, for whom analyses of constraint can guide structure-function analyses and for whom the curated alignments and phylogenetic trees will provide a knowledge base for the protein of interest. Extension of ProPhylER to generate quantitative predictions of the deleteriousness of cSNPs will expand the target community to include human geneticists and genomicists. As sequence data become less limiting for the majority of eukaryotic proteins, ProPhylER will expand to cover most of eukaryotic ortholog space and become the database that ties together eukaryotic orthologs and closely related paralogs by their evolutionary relationships.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
1R01HG003039-01
Application #
6710959
Study Section
Genome Study Section (GNM)
Program Officer
Bonazzi, Vivien
Project Start
2004-09-15
Project End
2007-08-31
Budget Start
2004-09-15
Budget End
2005-08-31
Support Year
1
Fiscal Year
2004
Total Cost
$467,580
Indirect Cost
Name
Stanford University
Department
Pathology
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94305
Keck, Jamie M; Jones, Michele H; Wong, Catherine C L et al. (2011) A cell cycle phosphoproteome of the yeast centrosome. Science 332:1557-61