The primary goal of this supplement request is to create a comprehensive, high-quality map of genome variation from the complete set of whole genome sequencing (WGS) data produced by the CCDG program to date. The genome variation map will encompass small variants (SNVs and indels < 50 bp) as well as larger structural variants (SVs), including deletions, duplications, inversions, mobile element insertions, complex SVs and multi-allelic copy number variants (CNVs). A secondary goal is to initiate work towards the creation of Community Resources derived from the Freeze2 genome variation map, including publicly available site-frequency catalogs, imputation reference panels and functional constraint annotations. The overall strategy will mirror our successful Freeze1 pilot effort from last year, with modifications to improve comprehensiveness and efficiency.
Freeze2 will consist of all WGS, WES and WGG data generated prior to the June 2018 progress report. The collaborative variant calling activity described here will focus purely on WGS data; however, WES and WGG data will be aggregated on the cloud at the same time and made available for analysis. We estimate that ~55,000 WGS datasets will be available for Freeze2.