Functional characterization of genetic and post-transcriptional variation using machine learning methods

Korkin, Dmitry

Abstract

The goal of this research proposal is to develop new in-silico approaches for accurate functional annotation of genetic and post-transcriptional variants. The rapid growth of Next-Generation Sequencing (NGS) and high-throughput -omics data has brought us one step closer to a mechanistic understanding, at the molecular level, of complex genetic diseases such as cancer, neurological disorders, diabetes, and others. In particular, these data revealed that complex diseases commonly manifest changes at the genetic and post-transcriptional levels. Both types of changes often affect the structure and function of the corresponding genes and their products. Understanding the functional implications of genetic and post-transcriptional variation is an important task, as it can provide critical insights into the molecular mechanisms underlying the disease. Here, we propose to leverage novel machine learning paradigms to design computational methods for predicting the effect of genetic and alternative splicing variants on macromolecular interactions. Macromolecular interactions underlie many cellular functions in a healthy organism. Disease-induced changes in genes, such as single nucleotide variations (SNVs) and alternative splicing variations (ASVs), have recently been reported to cause rewiring of the protein-protein interaction (PPI) network. Unfortunately, the experimental high-throughput techniques that characterize the large-scale effects of SNVs or ASVs on PPIs are expensive, time-consuming, and far from comprehensive. Current in-silico methods either suffer from limited applicability or are less accurate than the experimental methods. To overcome these challenges, we will use two recent machine learning paradigms: learning under privileged information (LUPI) and semi-supervised learning.
If successful, we expect the proposed methods to provide a critical advancement on the two main challenges of current computational approaches: limited coverage and accuracy lower than that of experimental methods. The methods will be freely available to the community as stand-alone tools as well as web servers.

Public Health Relevance

The goal of this proposal is to build computational tools that discover how disease-associated mutations and alternatively spliced protein isoforms are linked to the protein-protein interactions mediated by the disease proteins. The tools use advanced machine learning methods to find such links in a fast and inexpensive way. These tools will be useful in elucidating the molecular mechanisms implicated in complex genetic disorders.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21LM012772-02
Application #: 9730612
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2018-07-01
Project End: 2021-06-30
Budget Start: 2019-07-01
Budget End: 2021-06-30
Support Year: 2
Fiscal Year: 2019

Institution

Name: Worcester Polytechnic Institute
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 041508581

City: Worcester
State: MA
Country: United States
Zip Code: 01609

