Cells are the fundamental units that provide the functions needed to sustain life in living organisms. Cellular functions are carried out by proteins, the products of genes, and the process of producing proteins from genes (i.e., gene expression) is mediated by complex regulatory systems. Much remains unknown about the mechanisms of gene regulation. Given all genes in a cell, the regulatory relationships among them can be represented by networks, called gene regulatory networks. Reconstructing these networks experimentally and computationally has been a long-standing challenge. A gene can express multiple isoforms (mRNA molecules) and hence produce multiple different proteins, which makes the underlying gene regulatory networks more complicated. Recent advances in single-cell RNA sequencing (scRNA-Seq) technology have brought new opportunities for resolving high-quality regulatory networks, but have also posed new computational challenges. The project aims to computationally reconstruct accurate regulatory networks at the isoform level from large-scale sequencing data. Educational and outreach activities, such as courses on topics in computational biology and inclusion of minority students, will be carried out.

The project will develop efficient approaches to identify expressed isoforms and determine their expression abundances, and then develop a network-reconstruction method that improves on the current state of the art. The new computational methods will be validated and applied in the field of immunology to study cellular mechanisms in steroid-producing cells. The project will improve on existing methods in two ways. First, the proposed methods for developing a scalable transcript assembler will enable accurate determination and quantification of the expressed isoforms, making it possible to build regulatory networks at the isoform level that reflect possible differences in regulatory mechanisms among isoforms. Second, many recently developed methods for network inference require cells to be pre-ordered with trajectory inference or RNA velocity to mimic time-series data. Errors in the cell ordering can mislead network inference and lead to false predictions. The project proposes to perform cell ordering and network inference simultaneously, which is expected to provide better results for both. The project will reconstruct transcript-level regulatory networks for different types of steroid-producing cells from both published and newly generated single-cell data. The results of the project can be found at the PI's website: www.cc.gatech.edu/~xzhang954/.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 2019771
Program Officer: Jean Gao
Budget Start: 2020-07-01
Budget End: 2023-06-30
Fiscal Year: 2020
Total Cost: $400,000

Name: Georgia Tech Research Corporation
City: Atlanta
State: GA
Country: United States
Zip Code: 30332