In a resequencing experiment, assembling reads into a coherent picture enables joint analysis of raw reads, offering an unbiased approach to detect genomic differences between individuals in population studies or to identify somatic changes in cancer research. This approach is gaining interest as large-scale studies, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) projects, compile their preliminary findings. Our implementation of a de novo assembly algorithm and its downstream analysis pipelines are popular tools in the field for interrogating genomes (ABySS) and transcriptomes (Trans-ABySS). Using these tools, our team has been contributing analysis results to a number of cancer studies, including several TCGA and ICGC projects. We also make this software available to the community; as of January 2014, ABySS and Trans-ABySS have collectively received over 700 citations (source: Thomson Reuters) and have vibrant user discussion forums on Google Groups. Building on the success of our analysis platforms, we will continue developing our algorithms and will adapt them to data from the rapidly evolving sequencing technologies. We propose to improve the performance of ABySS and Trans-ABySS, and to continue supporting a growing user base with better genome, transcriptome, and metagenome assembly and analysis tools. We will also expand the functionality of our analysis pipelines to integrate orthogonal data that support detected events; present alternative isoform usage in assembled transcriptomes as splice graphs; reconstruct 3' untranslated regions; and refine contig-to-reference alignments and their interpretation for better structural variation and chimeric transcript detection.
To accomplish these goals, we will focus on (1) algorithmic improvements to the primary sequence assembly and alignment approaches; (2) high-performance computing platforms, optimizing our analysis approaches for the next generation of central processing unit (CPU) architectures; and (3) downstream analysis pipelines, building streamlined standard operating procedures. With sequencing technologies changing rapidly, and their throughput still increasing exponentially, there is a need to adapt established bioinformatics tools, such as ABySS and Trans-ABySS, improve their performance, and make their use accessible to a growing community. The continued development of our tools will enable translational genomics studies on the road to precise personal medicine.

Public Health Relevance

Analysis tools we developed to investigate DNA and RNA sequences from normal and diseased samples are being used by a wide group of investigators. With sequencing technologies changing rapidly, and their costs dropping sharply, there is a need to adapt established bioinformatics tools, improve their performance, and make their use accessible to a growing community. The continued development of our tools will enable translational genomics studies on the road to precise personal medicine.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG007182-01A1
Application #: 8631896
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Felsenfeld, Adam
Project Start: 2014-03-04
Project End: 2017-01-31
Budget Start: 2014-03-04
Budget End: 2015-01-31
Support Year: 1
Fiscal Year: 2014
Total Cost: $249,465
Indirect Cost: $16,641

Name: British Columbia Cancer Agency
Department:
Type:
DUNS #: 209137736
City: Vancouver
State: BC
Country: Canada
Zip Code: V5 1-L3