Whole-genome sequencing projects for humans and other vertebrates have greatly advanced comparative genomics, leading to novel biological discoveries. Our long-term research goal is to use comparative genomics to elucidate the trajectory of vertebrate genome evolution and the origin of complex traits in different species. Such insights will in turn help us better understand the biology of the human genome. Advances in next-generation sequencing (NGS) technologies have provided unprecedented opportunities to tackle this problem. However, the large number of genomes being sequenced and the limited quality of genomes produced by NGS underline an urgent need for new computational methods that address several pressing challenges for the new generation of comparative genomic analysis. The objective of this application is to develop new computational methods to improve the accuracy of whole-genome comparisons for vertebrate genomes. We have two specific aims: (1) to develop a comparative assembly algorithm to improve vertebrate genomes assembled from NGS data; (2) to develop a probabilistic framework to improve the quality of multiple sequence alignments for vertebrate genomes. Our research plan is innovative because it provides novel algorithms and software tools to systematically improve the foundations for genome comparisons. The research is significant because the methods to be developed will allow researchers to use new genome sequencing data more effectively. The proposed research will have sustained impact even as the number of genomes increases and sequencing technology advances. By improving the general methodology for next-generation comparative genomics, our work will have a high impact on large-scale genome projects such as G10K and ENCODE. As a result, this innovative project in computational biology will enable advances in biomedical research.

Public Health Relevance

The proposed research in computational biology is expected to improve comparative genomic analysis and thereby help us better understand human biology and disease mechanisms. Thus, this project is relevant to NIH's mission of seeking fundamental knowledge that will help to enhance health.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Wellington, Christopher
Carnegie Mellon University
Schools of Arts and Sciences
United States
Kim, Jaebum; Farré, Marta; Auvil, Loretta et al. (2017) Reconstruction and evolutionary history of eutherian chromosomes. Proc Natl Acad Sci U S A 114:E5379-E5388
Rajaraman, Ashok; Ma, Jian (2016) Reconstructing ancestral gene orders with duplications guided by synteny level genome reconstruction. BMC Bioinformatics 17:414
Li, Yang; Zhou, Shiguo; Schwartz, David C et al. (2016) Allele-Specific Quantification of Structural Variations in Cancer Genomes. Cell Syst 3:21-34
He, Feifei; Li, Yang; Tang, Yu-Hang et al. (2016) Identifying micro-inversions using high-throughput sequencing reads. BMC Genomics 17 Suppl 1:4
Tian, Dechao; Gu, Quanquan; Ma, Jian (2016) Identifying gene regulatory network rewiring using latent differential graphical models. Nucleic Acids Res 44:e140
Heo, Yun; Ramachandran, Anand; Hwu, Wen-Mei et al. (2016) BLESS 2: accurate, memory-efficient and fast error correction method. Bioinformatics 32:2369-71
Hou, Jack P; Emad, Amin; Puleo, Gregory J et al. (2016) A new correlation clustering method for cancer mutation analysis. Bioinformatics 32:3717-3728
Gupta, Aditya; Place, Michael; Goldstein, Steven et al. (2015) Single-molecule analysis reveals widespread structural variation in multiple myeloma. Proc Natl Acad Sci U S A 112:7689-94
Kim, Young-Chae; Byun, Sangwon; Zhang, Yang et al. (2015) Liver ChIP-seq analysis in FGF19-treated mice reveals SHP as a global transcriptional partner of SREBP-2. Genome Biol 16:268
Rittschof, Clare C; Bukhari, Syed Abbas; Sloofman, Laura G et al. (2014) Neuromolecular responses to social challenge: common mechanisms across mouse, stickleback fish, and honey bee. Proc Natl Acad Sci U S A 111:17929-34