Whole-genome sequencing projects for humans and other vertebrates have greatly advanced comparative genomics, which has led to novel biological discoveries. Our long-term research goal is to use comparative genomics to elucidate the trajectory of vertebrate genome evolution and the origin of complex traits in different species. Such insights will in turn help us better understand the biology of the human genome. Advances in next-generation sequencing (NGS) technologies have provided unprecedented opportunities to tackle this problem. However, the large number of genomes being sequenced and the limited quality of genomes produced by NGS have underscored an urgent need for new computational methods that address several pressing challenges for the new generation of comparative genomic analysis. The objective of this application is to develop new computational methods to improve the accuracy of whole-genome comparisons for vertebrate genomes. We have two specific aims: (1) to develop a comparative assembly algorithm to improve vertebrate genomes assembled from NGS data; (2) to develop a probabilistic framework to improve the quality of multiple sequence alignments for vertebrate genomes. Our research plan is innovative because it provides novel algorithms and software tools to systematically improve the foundations for genome comparisons. The research is significant because the methods to be developed will allow researchers to use new genome sequencing data more effectively. The proposed research will have sustained impact even as the number of sequenced genomes grows and sequencing technology advances. By improving the general methodology for next-generation comparative genomics, our work will have a high impact on large-scale genome projects such as G10K and ENCODE. As a result, this innovative project in computational biology will enable advances in biomedical research.

Public Health Relevance

The proposed research in computational biology is expected to improve comparative genomic analysis and thereby help us better understand human biology and disease mechanisms. Thus, this project is relevant to the NIH's mission of seeking fundamental knowledge that will help enhance health.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG007352-04
Application #
9102153
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Wellington, Christopher
Project Start
2014-09-01
Project End
2018-12-31
Budget Start
2017-01-01
Budget End
2018-12-31
Support Year
4
Fiscal Year
2017
Total Cost
Indirect Cost
Name
Carnegie-Mellon University
Department
Type
Schools of Arts and Sciences
DUNS #
052184116
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213
Rajaraman, Ashok; Ma, Jian (2018) Toward Recovering Allele-specific Cancer Genome Graphs. J Comput Biol 25:624-636
Tasan, Ipek; Sustackova, Gabriela; Zhang, Liguo et al. (2018) CRISPR/Cas9-mediated knock-in of an optimized TetO repeat for live cell imaging of endogenous loci. Nucleic Acids Res 46:e100
Kim, Young-Chae; Seok, Sunmi; Byun, Sangwon et al. (2018) AhR and SHP regulate phosphatidylcholine and S-adenosylmethionine levels in the one-carbon cycle. Nat Commun 9:540
Chen, Yu; Zhang, Yang; Wang, Yuchuan et al. (2018) Mapping 3D genome organization relative to nuclear compartments using TSA-Seq as a cytological ruler. J Cell Biol 217:4025-4048
Seok, Sunmi; Kim, Young-Chae; Byun, Sangwon et al. (2018) Fasting-induced JMJD3 histone demethylase epigenetically activates mitochondrial fatty acid β-oxidation. J Clin Invest 128:3144-3159
Zhang, Ruochi; Wang, Yuchuan; Yang, Yang et al. (2018) Predicting CTCF-mediated chromatin loops using CTCF-MP. Bioinformatics 34:i133-i141
Yang, Yang; Gu, Quanquan; Zhang, Yang et al. (2018) Continuous-Trait Probabilistic Model for Comparing Multi-species Functional Genomic Data. Cell Syst 7:208-218.e11
Ma, Sai; Hsieh, Yuan-Pang; Ma, Jian et al. (2018) Low-input and multiplexed microfluidic assay reveals epigenomic variation across cerebellum and prefrontal cortex. Sci Adv 4:eaar8187
Yang, Yang; Zhang, Ruochi; Singh, Shashank et al. (2017) Exploiting sequence-based features for predicting enhancer-promoter interactions. Bioinformatics 33:i252-i260
Singh, Deepak K; Gholamalamdari, Omid; Jadaliha, Mahdieh et al. (2017) PSIP1/p75 promotes tumorigenicity in breast cancer cells by promoting the transcription of cell cycle genes. Carcinogenesis 38:966-975
