In the Centers for Common Disease Genomics (CCDG) program, the Human Genome Sequencing Center (HGSC) has focused on producing genomic data to enable discovery of alleles associated with common diseases. We recruited samples from large cohorts and focused on early-onset cardiac death (EOCAD) and intracranial hemorrhagic stroke (ICH). We generated whole genome sequences (WGS) to capture non-coding information and to maximize identification of structural variants, and we worked with the CCDG consortia to integrate and harmonize data, developing and sharing large variant call sets. Technical innovations and platform efficiencies have reduced genome sequencing costs approximately two-fold over the program period so far. In year five we will complete the CCDG program, generating an additional 13,500 WGS, divided between cases of EOCAD (9,000 from an available pool of approximately 14,000) and cases of ICH (4,500 from an available pool of approximately 7,000), with particular emphasis on ethnicities currently underrepresented in biomedical research. The data will undergo quality control analysis, will be analyzed for disease-allele association in conjunction with other available data and other CCDG members, and will be submitted to AnVIL and other appropriate databases.
The genetic changes underlying most common diseases are yet to be discovered. While the cost and practicality of performing whole genome analysis of large numbers of DNA samples from humans with specific disorders have greatly improved, the optimum pathway to gene discovery in common disorders is not known. This program evaluates different designs in multiple disease studies, both to find common disease alleles and to optimize the framework for future studies.