Identifying residues of importance in the protein products of genes is a challenging and important problem for informatics, genome annotation, molecular biology, biochemistry and drug discovery. Functional annotation of genes is inherently hierarchical; genes can be annotated at the level of genome sequence, transcript variant, protein product, protein domain, nucleotide or amino acid. Only a few resources annotate protein function at the level of the amino acid, and developing a language that relates residue function to gene product sequence, structure and expression is challenging. To address this, I am investigating how sequence, evolutionary and structural descriptors can be used to quantify function. I am applying this knowledge to develop methods that can associate residues with known functional annotations, perform annotation transfer onto an experimentally determined or modeled protein structure, and determine the likely molecular effects of mutation, thus creating a framework for residue annotation. One of the greatest challenges for the computational biologist is identifying features (or attributes) that are useful for classification of genomic data. With this effort, we will continue our work describing novel features for classification of functional sites, and we will test them using supervised machine learning tools. We will do this by 1) testing the power of several diverse functional features for classification of catalytic residues in proteins, 2) applying these features to other important residue functional annotation problems, and 3) evaluating features based on homologous sequences. This research is important for understanding the molecular basis of diseases such as cancer and for interpreting pharmacogenetics data from a molecular perspective. When this work is completed, scientists will have a rich set of data and tools for basic health research.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Career Transition Award (K22)
Project #
1K22LM009135-01
Application #
7079558
Study Section
Special Emphasis Panel (ZLM1-AP-K (J2))
Program Officer
Ye, Jane
Project Start
2006-04-24
Project End
2009-04-23
Budget Start
2006-04-24
Budget End
2007-04-23
Support Year
1
Fiscal Year
2006
Total Cost
$152,164
Indirect Cost
Name
Indiana University-Purdue University at Indianapolis
Department
Genetics
Type
Schools of Medicine
DUNS #
603007902
City
Indianapolis
State
IN
Country
United States
Zip Code
46202
Zhao, Yiqiang; Clark, Wyatt T; Mort, Matthew et al. (2011) Prediction of functional regulatory SNPs in monogenic and complex disease. Hum Mutat 32:1183-90
Mooney, Sean D; Krishnan, Vidhya G; Evani, Uday S (2010) Bioinformatic tools for identifying disease gene and SNP candidates. Methods Mol Biol 628:307-19
Mort, Matthew; Evani, Uday S; Krishnan, Vidhya G et al. (2010) In silico functional profiling of human disease-associated and polymorphic amino acid substitutions. Hum Mutat 31:335-46
Sanford, Jeremy R; Wang, Xin; Mort, Matthew et al. (2009) Splicing factor SFRS1 recognizes a functionally diverse landscape of RNA transcripts. Genome Res 19:381-94
Chen, Jake Y; Youn, Eunseog; Mooney, Sean D (2009) Connecting protein interaction data, mutations, and disease using bioinformatics. Methods Mol Biol 541:449-61
Sadat, M A; Dirscherl, S; Sastry, L et al. (2009) Retroviral vector integration in post-transplant hematopoiesis in mice conditioned with either submyeloablative or ablative irradiation. Gene Ther 16:1452-64
Radivojac, Predrag; Baenziger, Peter H; Kann, Maricel G et al. (2008) Gain and loss of phosphorylation sites in human cancer. Bioinformatics 24:i241-7
Singh, Arti; Olowoyeye, Adebayo; Baenziger, Peter H et al. (2008) MutDB: update on development of tools for the biochemical analysis of genetic variation. Nucleic Acids Res 36:D815-9
Radivojac, Predrag; Peng, Kang; Clark, Wyatt T et al. (2008) An integrated approach to inferring gene-disease associations in humans. Proteins 72:1030-7
Peters, B; Dirscherl, S; Dantzer, J et al. (2008) Automated analysis of viral integration sites in gene therapy research using the SeqMap web resource. Gene Ther 15:1294-8

Showing the most recent 10 out of 12 publications