Algorithms for protein superfamily classification and function prediction

Zhi, Degui

Abstract

The goal of this project is to develop new algorithms for protein function prediction. Rapid recent technological advances are producing biological data of unprecedented volume and complexity, and computational methods are becoming essential components of modern biomedical research. One of the greatest challenges facing bioinformaticians is discovering connections among different data sets and generating novel biological knowledge or hypotheses. Predicting the molecular function of novel proteins is an urgent task for the post-genomics era. In particular, recent assessments of structural genomics efforts have revealed a gap between experimental protein structure determination and the use of that structural knowledge to understand the biological function of proteins at the molecular level. We will employ recent developments in discriminative machine learning to construct a residue-level classification system for function prediction from structure. Existing systems for function prediction from structure use either global structural and sequence similarities over the entire protein chain or localized similarities such as putative functional sites. Our system will leverage information from both global and local similarities, and will identify important residues and clusters of residues that are distinctive among different functional families. Our approach builds on and extends an efficient optimization framework that we developed for protein superfamily classification. We expect that these methodological developments will not only improve the performance of state-of-the-art function prediction, but also help illuminate the interplay of sequence and structure in defining functional variations among protein families.
Beyond this major project, we will work on an additional project that extends the graph-theoretical models for multiple sequence alignment we developed earlier to meet the challenge of domain annotation for large new sequence sets.

Public Health Relevance

The advancement of medical research depends in part on a detailed understanding of the functions of genes and proteins. Our research will improve our understanding of protein evolution and function at the molecular level. Our computational approach will speed up the discovery of biological knowledge from large data sets generated by high-throughput methods.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Research Transition Award (R00)
Project #: 5R00RR024163-05
Application #: 8100273
Study Section: Special Emphasis Panel (NSS)
Program Officer: Sheeley, Douglas

Project Start: 2009-07-07
Project End: 2014-05-31
Budget Start: 2011-06-01
Budget End: 2014-05-31
Support Year: 5
Fiscal Year: 2011
Total Cost: $244,046

Institution

Name: University of Alabama Birmingham
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 063690705

City: Birmingham
State: AL
Country: United States
Zip Code: 35294

Related projects


NIH 2011 R00 RR	Algorithms for protein superfamily classification and function prediction Zhi, Degui / University of Alabama Birmingham	$244,046
NIH 2010 R00 RR	Algorithms for protein superfamily classification and function prediction Zhi, Degui / University of Alabama Birmingham	$246,511
NIH 2009 R00 RR	Algorithms for protein superfamily classification and function prediction Zhi, Degui / University of Alabama Birmingham	$249,000

Publications

Jahandideh, Samad; Zhi, Degui (2014) Systematic investigation of predicted effect of nonsynonymous SNPs in human prion protein gene: a molecular modeling and molecular dynamics study. J Biomol Struct Dyn 32:289-300

Zhang, Kui; Zhi, Degui (2013) Joint haplotype phasing and genotype calling of multiple individuals using haplotype informative reads. Bioinformatics 29:2427-34

Jahandideh, Samad (2013) Diversity in structural consequences of MexZ mutations in Pseudomonas aeruginosa. Chem Biol Drug Des 81:600-6

Lin, Wan-Yu; Yi, Nengjun; Lou, Xiang-Yang et al. (2013) Haplotype kernel association test as a powerful method to identify chromosomal regions harboring uncommon causal variants. Genet Epidemiol 37:560-70

Wu, Guodong; Zhi, Degui (2013) Pathway-based approaches for sequencing-based genome-wide association studies. Genet Epidemiol 37:478-94

Zhi, Degui; Wu, Jihua; Liu, Nianjun et al. (2012) Genotype calling from next-generation sequencing data using haplotype information of reads. Bioinformatics 28:938-46

Lin, Wan-Yu; Yi, Nengjun; Zhi, Degui et al. (2012) Haplotype-based methods for detecting uncommon causal variants with common SNPs. Genet Epidemiol 36:572-82

Zhi, Degui; Chen, Rui (2012) Statistical guidance for experimental design and data analysis of mutation detection in rare monogenic mendelian diseases by exome sequencing. PLoS One 7:e31358

Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui (2012) Comprehensive comparative analysis and identification of RNA-binding protein domains: Multi-class classification and feature selection. J Theor Biol 312C:65-75

Jahandideh, Samad; Mahdavi, Abbas (2012) RFCRYS: sequence-based protein crystallization propensity prediction by means of random forest. J Theor Biol 306:115-9

Showing the most recent 10 out of 20 publications
