In a resequencing experiment, assembling reads into a coherent picture enables joint analysis of raw reads, offering an unbiased approach to detect genomic differences between individuals in population studies or to identify somatic changes in cancer research. This approach is gaining interest as large-scale studies, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) projects, compile their preliminary findings. Our implementation of a de novo assembly algorithm and its downstream analysis pipelines are popular tools in the field for interrogating genomes (ABySS) and transcriptomes (Trans-ABySS). Using these tools, our team has been contributing analysis results to a number of cancer studies, including several TCGA and ICGC projects. We also make this software available to the community; as of January 2014, ABySS and Trans-ABySS have collectively received over 700 citations (source: Thomson Reuters) and are supported by active user discussion groups on Google Groups. Building on the success of our analysis platforms, we will continue developing our algorithms and will adapt them to data from the rapidly evolving sequencing technologies. We propose to improve the performance of ABySS and Trans-ABySS, and to continue supporting a growing user base with better genome, transcriptome, and metagenome assembly and analysis tools. We will also expand the functionality of our analysis pipelines to integrate orthogonal data that support detected events; present alternative isoform usage in assembled transcriptomes as splice graphs; reconstruct 3' untranslated regions; and refine contig-to-reference alignments and their interpretation for better structural variation and chimeric transcript detection.
To accomplish these goals, we will focus on (1) algorithmic improvements to the primary sequence assembly and alignment approaches; (2) high-performance computing platforms, optimizing our analysis approaches for the next generation of central processing unit (CPU) architectures; and (3) downstream analysis pipelines, building streamlined standard operating procedures. With sequencing technologies changing rapidly, and their throughput still increasing exponentially, there is a need to adapt established bioinformatics tools, such as ABySS and Trans-ABySS, improve their performance, and make their use accessible to a growing community. The continued development of our tools will enable translational genomics studies on the road to precise personal medicine.
Analysis tools we developed to investigate DNA and RNA sequences from normal and diseased samples are being used by a wide group of investigators. With sequencing technologies changing rapidly, and their costs dropping sharply, there is a need to adapt established bioinformatics tools, improve their performance, and make their use accessible to a growing community. The continued development of our tools will enable translational genomics studies on the road to precise personal medicine.
Coombe, Lauren; Zhang, Jessica; Vandervalk, Benjamin P et al. (2018) ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers. BMC Bioinformatics 19:234
Yeo, Sarah; Coombe, Lauren; Warren, René L et al. (2018) ARCS: scaffolding genome drafts with linked reads. Bioinformatics 34:725-731
Chiu, Readman; Nip, Ka Ming; Chu, Justin et al. (2018) TAP: a targeted clinical genomics pipeline for detecting transcript variants using RNA-seq data. BMC Med Genomics 11:79
Khan, Hamza; Mohamadi, Hamid; Vandervalk, Benjamin P et al. (2018) ChopStitch: exon annotation and splice graph construction using transcriptome assembly and whole genome sequencing data. Bioinformatics 34:1697-1704
Kucuk, Erdi; Chu, Justin; Vandervalk, Benjamin P et al. (2017) Kollector: transcript-informed, targeted de novo assembly of gene loci. Bioinformatics 33:1782-1788
Mohamadi, Hamid; Khan, Hamza; Birol, Inanc (2017) ntCard: a streaming algorithm for cardinality estimation in genomics data. Bioinformatics 33:1324-1330
Hammond, S Austin; Warren, René L; Vandervalk, Benjamin P et al. (2017) The North American bullfrog draft genome provides insight into hormonal regulation of long noncoding RNA. Nat Commun 8:1433
Hasan, Nabeeh A; Warren, René L; Epperson, L Elaine et al. (2017) Complete Genome Sequence of Mycobacterium chimaera SJ42, a Nonoutbreak Strain from an Immunocompromised Patient with Pulmonary Disease. Genome Announc 5:
Chu, Justin; Mohamadi, Hamid; Warren, René L et al. (2017) Innovations and challenges in detecting long read overlaps: an evaluation of the state-of-the-art. Bioinformatics 33:1261-1270
Yang, Chen; Chu, Justin; Warren, René L et al. (2017) NanoSim: nanopore sequence read simulator based on statistical characterization. Gigascience 6:1-6