With the reference human genome sequence now complete, the next wave of large-scale sequencing will target genomes that further inform the human sequence or otherwise provide significant value for biological discovery. These sequences must be of high quality, yet they must be generated efficiently and at substantially lower cost. In this proposal, we describe technical developments that will allow us to produce longer sequence reads, decrease sequencing costs, improve physical map construction, streamline genome assembly, and automate sequence finishing. To support these advances, we will develop enhanced informatics tools and infrastructure that integrate and better manage the full range of our laboratory processes. On the basis of these technical developments, we will produce genome sequence data at a rate of 3.3M reads/month in Year 1, scaling moderately to 3.8M reads/month in Year 3. Over the same period, we aim to increase average read length by at least 300 bp and to cut our per-read cost from $1.35 to $0.75 or less. Refined methods and tools for finishing genome sequences more efficiently to high standards of quality and continuity, together with methods and tools for detecting and annotating genes and other elements encoded within those genomes, will further enhance the output data from our Center. Coupled with advances in strategy, these developments will substantially improve the efficiency and economics of genome sequencing, making it much more feasible to consider the analysis of additional human and animal genomes.
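The throughput and cost targets quoted above imply a monthly read-production spend that can be worked out directly. The following is a minimal sketch of that arithmetic, not a budget from the proposal. It assumes that the $1.35 per-read cost applies in Year 1 and the $0.75 target applies in Year 3, and that per-read cost covers the whole cost of producing a read. The names `reads_per_month`, `cost_per_read`, `monthly_cost`, and `pct_change` are illustrative only.

```python
# Targets quoted in the abstract.
reads_per_month = {"year1": 3_300_000, "year3": 3_800_000}

# Assumption: $1.35 is the Year 1 per-read cost and $0.75 the Year 3 target.
cost_per_read = {"year1": 1.35, "year3": 0.75}


def monthly_cost(year: str) -> float:
    """Implied monthly spend on read production for the given year."""
    return reads_per_month[year] * cost_per_read[year]


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for year in ("year1", "year3"):
        print(f"{year}: ${monthly_cost(year):,.0f} per month")
    print(f"throughput change: {pct_change(3_300_000, 3_800_000):+.1f}%")
    print(f"per-read cost change: {pct_change(1.35, 0.75):+.1f}%")
    print(
        "monthly spend change: "
        f"{pct_change(monthly_cost('year1'), monthly_cost('year3')):+.1f}%"
    )
```

Under these assumptions, throughput rises by about 15% while per-read cost falls by about 44%, so the implied monthly spend drops from roughly $4.46M to $2.85M, a reduction of about 36%.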