Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes

Zhang, Jinfeng

Abstract

Despite rapid progress in structural bioinformatics, our current toolbox lacks a rigorous and unifying mathematical and statistical framework for the analysis, classification, and organization of individual biomolecules and groups of biomolecules. We have recently developed such a framework, based on elastic shape analysis (ESA), for the comparison of protein and RNA structures. Under this framework, the formal geodesic distance between any two protein/RNA structures can be computed rapidly. Probability distributions can also be built for families of protein/RNA structures and used to classify structures in a principled way through statistical hypothesis testing. In addition, sequence information can be naturally incorporated so that structures can be compared in the joint sequence-structure space. We have also developed novel algorithms for matching and analyzing protein surfaces. We propose to develop these methodologies substantially further for important applications in structural biology, including the study of chromosome structures by combining 3D structure and sequence-level information.

The proposed research will make significant contributions to the following areas: (1) This proposal will fill an important gap in structural biology: the lack of a rigorous mathematical and statistical framework for biomolecular structure comparison; (2) Our proposed unifying framework will allow natural incorporation of sequence information for structure comparison; (3) Our approach can uncover distinct clusters at the deepest level of the current classification scheme (i.e., SCOP family), enabling a finer classification of biomolecular structures. Preliminary results indicate that by using carefully measured structural similarity, we will obtain representative sets of proteins of higher quality than those obtained by current sequence-similarity-based methods; (4) The probabilistic models designed for protein/RNA backbone structures and surfaces will capture the flexible nature of protein structures through the use of ensembles of conformations, while maintaining high computational efficiency. These models will also enable effective characterization of family-specific variations among proteins, an important task that none of the existing methods handles well; (5) Protein/RNA structures will be organized in network-based data structures built with probabilistic approaches. This new organization will effectively integrate sequence, backbone structure, and surface information, facilitating the discovery of novel insights; and (6) these new developments will be rapidly generalized for studying chromosome structures.

The proposed research will allow the development of tools that will also be applicable in other areas of shape analysis, including medical image analysis, computer vision, and pattern recognition. Our work will help to increase communication between the field of protein structure analysis and the field of shape analysis, and will stimulate more cross-over development in methodology and transform research activities in both fields.
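To make the notion of a geodesic distance between two structures concrete, the following is a minimal sketch of a distance in the style of elastic shape analysis. It is an illustration under stated assumptions, not the project's implementation: it assumes each backbone is given as an (n, 3) array of coordinates (for example C-alpha positions) with the same number of points in both curves, it uses the square-root velocity function (SRVF) representation commonly used in ESA, and it aligns only over rotations. The full elastic distance also optimizes over re-parameterizations of the curves, which is omitted here. The function names are hypothetical.

```python
import numpy as np


def trapezoid_weights(n):
    """Grid on [0, 1] with n points and the matching trapezoidal weights."""
    t = np.linspace(0.0, 1.0, n)
    dt = np.diff(t)
    w = np.zeros(n)
    w[1:] += 0.5 * dt
    w[:-1] += 0.5 * dt
    return t, w


def srvf(curve):
    """Unit-norm square-root velocity function of an (n, 3) curve."""
    t, w = trapezoid_weights(curve.shape[0])
    velocity = np.gradient(curve, t, axis=0)  # finite-difference derivative
    speed = np.linalg.norm(velocity, axis=1)
    q = velocity / np.sqrt(np.maximum(speed, 1e-12))[:, None]
    # Dividing by the L2 norm removes overall scale; differentiation has
    # already removed translation.
    return q / np.sqrt(np.sum(w * np.sum(q * q, axis=1))), w


def rotation_aligned_distance(curve_a, curve_b):
    """Arc-length distance between unit-norm SRVFs after optimal rotation.

    Assumption: both curves are sampled with the same number of points.
    Re-parameterization (the "elastic" matching step) is NOT performed.
    """
    if curve_a.shape != curve_b.shape:
        raise ValueError("curves must have the same shape (n, 3)")
    qa, w = srvf(curve_a)
    qb, _ = srvf(curve_b)
    # Cross-covariance B[j, k] = integral of qb_j(t) * qa_k(t) dt.
    B = qb.T @ (w[:, None] * qa)
    U, _, Vt = np.linalg.svd(B)
    # Force a proper rotation (determinant +1), as in Procrustes alignment.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    inner = np.sum(w * np.sum(qa * (qb @ R), axis=1))
    # Unit-norm SRVFs lie on a sphere, so the geodesic is a great-circle arc.
    return float(np.arccos(np.clip(inner, -1.0, 1.0)))


if __name__ == "__main__":
    s = np.linspace(0.0, 4.0 * np.pi, 200)
    helix = np.column_stack([np.cos(s), np.sin(s), 0.5 * s])
    line = np.column_stack([s, np.zeros_like(s), np.zeros_like(s)])
    print(rotation_aligned_distance(helix, line))
```

Because the SRVF is built from derivatives and then normalized, this distance does not change when a structure is translated, uniformly scaled, or rotated. Real backbones of different lengths would first need to be resampled to a common number of points.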

Public Health Relevance

Analysis, classification and organization of biomolecules are fundamental tasks essential for understanding the sequence-structure-function relationships of biomolecules. In this project, we aim to develop rigorous and unifying mathematical and statistical frameworks for such tasks and apply them to study proteins, RNAs and chromosomes.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM126558-03
Application #: 9731563
Study Section: Special Emphasis Panel (ZGM1)
Program Officer: Lyster, Peter

Project Start: 2017-07-01
Project End: 2022-06-30
Budget Start: 2019-07-01
Budget End: 2020-06-30
Support Year: 3
Fiscal Year: 2019
Total Cost
Indirect Cost

Institution

Name: Florida State University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 790877419

City: Tallahassee
State: FL
Country: United States
Zip Code: 32306

Related projects


NIH 2020 R01 GM	Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes Zhang, Jinfeng / Florida State University
NIH 2019 R01 GM	Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes Zhang, Jinfeng / Florida State University
NIH 2018 R01 GM	Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes Zhang, Jinfeng / Florida State University
NIH 2017 R01 GM	Collaborative Research: Mathematical Framework for Biomolecules: From Protein to RNA to Chromosomes Zhang, Jinfeng / Florida State University

Publications

Yang, Yiqing; Guo, Ruiqiong; Gaffney, Kristen et al. (2018) Folding-Degradation Relationship of a Membrane Protein Mediated by the Universally Conserved ATP-Dependent Protease FtsH. J Am Chem Soc 140:4656-4665

Tian, Wei; Lin, Meishan; Tang, Ke et al. (2018) High-resolution structure prediction of β-barrel membrane proteins. Proc Natl Acad Sci U S A 115:1511-1516

Turpin, Zachary M; Vera, Daniel L; Savadel, Savannah D et al. (2018) Chromatin structure profile data from DNS-seq: Differential nuclease sensitivity mapping of four reference tissues of B73 maize (Zea mays L). Data Brief 20:358-363

Bou-Dargham, Mayassa J; Liu, Yuhang; Sang, Qing-Xiang Amy et al. (2018) Subgrouping breast cancer patients based on immune evasion mechanisms unravels a high involvement of transforming growth factor-beta and decoy receptor 3. PLoS One 13:e0207799

Perez-Rathke, Alan; Fahie, Monifa A; Chisholm, Christina et al. (2018) Mechanism of OmpG pH-Dependent Gating from Loop Ensemble and Single Channel Studies. J Am Chem Soc 140:1105-1115

Girimurugan, Senthil B; Liu, Yuhang; Lung, Pei-Yau et al. (2018) iSeg: an efficient algorithm for segmentation of genomic and epigenomic data. BMC Bioinformatics 19:131

Gürsoy, Gamze; Xu, Yun; Liang, Jie (2017) Spatial organization of the budding yeast genome in the cell nucleus and identification of specific chromatin interactions from multi-chromosome constrained chromatin model. PLoS Comput Biol 13:e1005658

Liu, Yuhang; Zhang, Jinfeng; Qiu, Xing (2017) Super-delta: a new differential gene expression analysis procedure with robust data normalization. BMC Bioinformatics 18:582
