Structural variants, including duplications, insertions, deletions, inversions, and translocations of large blocks of DNA sequence, have been shown to be associated with various human diseases. These variants also frequently occur as somatic alterations in cancer. Identifying and characterizing structural variants in a genome sequence is a challenging task. We propose to develop computational methods to enable comprehensive studies of structural variation in normal and diseased genomes.
In Aim 1, we develop a general computational framework for classification and comparison of structural variants across multiple samples and measurement platforms using a novel geometric and probabilistic approach.
In Aim 2, we design algorithms to maximize the effectiveness of emerging single-molecule sequencing technologies for detecting and assembling complex structural variants and rearranged transcripts.
In Aim 3, we develop algorithms to reconstruct the organization of cancer genomes and investigate how structural variants alter genome organization during somatic evolution.
Finally, in Aim 4, we study the population genetics of inversion polymorphisms in the human genome, including their effects on haplotype block structure and whether inversions under selection leave distinctive genetic signatures.
We will apply these approaches to data from human, cancer, mouse, and pathogen genomes in collaboration with several biomedical researchers. Successful completion of the proposed studies will facilitate future research on the role of structural variation in human and cancer genetics.

Public Health Relevance

Identifying the inherited genetic differences associated with disease and the acquired mutations that lead to cancer is a major challenge in genomics. One important class of such mutations is structural variants, which include duplications, insertions, deletions, inversions, and translocations of large blocks of DNA sequence. These variants have been implicated in several diseases, including autism and cancer. New genome technologies are enabling large-scale measurement of these variants, but they demand novel computational methods to extract the most information from these measurements. We will develop algorithms to facilitate the identification and characterization of structural variants. These approaches will aid in the discovery of genetic variants that will provide better diagnostics and/or personalized treatments for various diseases.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG005690-04
Application #: 8600962
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
Project Start: 2011-01-01
Project End: 2015-12-31
Budget Start: 2014-01-01
Budget End: 2014-12-31
Support Year: 4
Fiscal Year: 2014
Total Cost: $441,407
Indirect Cost: $138,119

Name: Brown University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 001785542
City: Providence
State: RI
Country: United States
Zip Code: 02912

Parks, Matthew M; Raphael, Benjamin J; Lawrence, Charles E (2018) Using controls to limit false discovery in the era of big data. BMC Bioinformatics 19:323
Leiserson, Mark D M; Reyna, Matthew A; Raphael, Benjamin J (2016) A weighted exact test for mutually exclusive mutations in cancer. Bioinformatics 32:i736-i745
El-Kebir, Mohammed; Satas, Gryte; Oesper, Layla et al. (2016) Inferring the Mutational History of a Tumor Using Multi-state Perfect Phylogeny Mixtures. Cell Syst 3:43-53
Weinreb, Caleb; Raphael, Benjamin J (2016) Identification of hierarchical chromatin domains. Bioinformatics 32:1601-9
Doris, Stephen M; Smith, Deborah R; Beamesderfer, Julia N et al. (2015) Universal and domain-specific sequences in 23S-28S ribosomal RNA identified by computational phylogenetics. RNA 21:1719-30
Leiserson, Mark D M; Vandin, Fabio; Wu, Hsin-Ta et al. (2015) Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat Genet 47:106-14
Leiserson, Mark D M; Wu, Hsin-Ta; Vandin, Fabio et al. (2015) CoMEt: a statistical approach to identify combinations of mutually exclusive alterations in cancer. Genome Biol 16:160
El-Kebir, Mohammed; Oesper, Layla; Acheson-Field, Hannah et al. (2015) Reconstruction of clonal trees and tumor composition from multi-sample sequencing data. Bioinformatics 31:i62-70
Leiserson, Mark D M; Gramazio, Connor C; Hu, Jason et al. (2015) MAGI: visualization and collaborative annotation of genomic aberrations. Nat Methods 12:483-4
Parks, Matthew M; Lawrence, Charles E; Raphael, Benjamin J (2015) Detecting non-allelic homologous recombination from high-throughput sequencing data. Genome Biol 16:72

Showing the most recent 10 out of 40 publications