In 2000, our Fold and Function Assignment System (FFAS) server pioneered protein profile-profile algorithms, applying them to protein structure prediction. Since then, the basic ideas underpinning these algorithms have been used in over 20 distant homology recognition algorithms, and the public FFAS server is used by almost 1,000 registered users running hundreds of jobs per day. These users apply the results not only in protein structure prediction, but also in function prediction, target selection in structural genomics, and general analysis of diverse protein families. With the ever-increasing flow of new protein sequences, many of them representing new, uncharacterized families, the importance of distant homology recognition is constantly growing. We propose enhancing the FFAS structure and interface to match these new types of applications, developing the server into a major resource for studying broad and diversified protein families. We plan to extend the usefulness and maintainability of FFAS by restructuring its code, using modern programming practices, into a modular, multistage program. This program would be ready to be integrated with other servers and to use other programs, developed both inside and outside our group, to improve the quality of data at each step of the prediction process and to export intermediate results to the user for analysis. By extending the set of analysis and visualization tools integrated into the FFAS server and improving its user interface, we want to make it easier for a generally trained biologist to use. Finally, we plan to perform a significant hardware update to avoid delays in providing annotations for user-submitted sequences.

Public Health Relevance

Provide annotations for uncharacterized proteins, including human disease-related proteins and virulence factors, to help understand their functions and thus shorten discovery timelines for new therapies.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
1R01GM087218-01
Application #
7627177
Study Section
Special Emphasis Panel (ZRG1-BST-Q (01))
Program Officer
Lyster, Peter
Project Start
2009-09-30
Project End
2011-08-31
Budget Start
2009-09-30
Budget End
2010-08-31
Support Year
1
Fiscal Year
2009
Total Cost
$477,500
Indirect Cost
Name
Sanford-Burnham Medical Research Institute
Department
Type
DUNS #
020520466
City
La Jolla
State
CA
Country
United States
Zip Code
92037
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen et al. (2014) AIDA: ab initio domain assembly server. Nucleic Acids Res 42:W308-13
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen et al. (2014) FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking. Bioinformatics 30:660-7
Bhabha, Gira; Ekiert, Damian C; Jennewein, Madeleine et al. (2013) Divergent evolution of protein conformational dynamics in dihydrofolate reductase. Nat Struct Mol Biol 20:1243-9
Zmasek, Christian M; Godzik, Adam (2012) This Deja vu feeling--analysis of multidomain protein evolution in eukaryotic genomes. PLoS Comput Biol 8:e1002701
Rouard, Mathieu; Guignon, Valentin; Aluome, Christelle et al. (2011) GreenPhylDB v2.0: comparative and functional genomics in plants. Nucleic Acids Res 39:D1095-102
Cai, Xiao-Hui; Jaroszewski, Lukasz; Wooley, John et al. (2011) Internal organization of large protein families: relationship between the sequence, structure, and function-based clustering. Proteins 79:2389-402
Cammarato, Anthony; Ahrens, Christian H; Alayari, Nakissa N et al. (2011) A mighty small heart: the cardiac proteome of adult Drosophila melanogaster. PLoS One 6:e18497
Godzik, Adam (2011) Metagenomics and the protein universe. Curr Opin Struct Biol 21:398-403
Zhang, Qing; Zmasek, Christian M; Cai, Xiaohui et al. (2011) TIR domain-containing adaptor SARM is a late addition to the ongoing microbe-host dialog. Dev Comp Immunol 35:461-8
Zmasek, Christian M; Godzik, Adam (2011) Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires. Genome Biol 12:R4

Showing the most recent 10 out of 12 publications