The focus of this proposal is to develop innovative library preparation tools and techniques to enable haplotype-resolved whole human genome sequencing. Recent advances in Next-Gen sequencing technology, along with the development of robust analysis methods, have given researchers the ability to identify sequence variants. However, the ultimate goal of relating sequence variants to human diseases remains difficult and is likely not achievable except for very simple single-gene diseases. Furthermore, even for single-gene diseases, when large patient cohorts are studied, typically fewer than 50% of the cases can be linked directly to a genetic variant. It is my hypothesis that improved sequencing methods are needed to elucidate the complex genotype/phenotype relation. Namely, new methods are needed to understand the long-range sequence contiguity of the human genome and to resolve the phase of sequence variants. I also hypothesize that long-range genome interactions play a critical role in many diseases, and that the cis/trans relation between sequence variants is essential for understanding the genetic basis of disease. However, all high-throughput (Next-Gen) sequencing technologies today generate very short reads, and these short reads are insufficient for phasing sequence variants. Typically, these sequencing technologies produce results that are limited to finding polymorphisms, and the importance of haplotypes (the cis/trans phasing of variants) has been largely neglected. To truly understand the genetic makeup of a specific disease, methods are needed to identify the specific chromosome on which each polymorphism resides, and this is the focus of this proposal.
Recent advances in Next-Gen sequencing technology, along with the development of robust analysis methods, have given researchers the ability to identify sequence variants. However, the ultimate goal of relating sequence variants to human diseases remains difficult and is likely not achievable except for very simple single-gene diseases. This proposal focuses on the development of improved methods upstream of the Next-Gen sequencer that are needed to maximize the value of genome sequencing data. I believe our data will further allow biomedical researchers to decipher the elusive genotype/phenotype relation.
Feng, Kuan; Costa, Justin; Edwards, Jeremy S (2018) Next-generation sequencing library construction on a surface. BMC Genomics 19:416
Kim, Jungeun; Weber, Jessica A; Jho, Sungwoong et al. (2018) KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses. Sci Rep 8:5677
Ogasawara, Yasushi; Torrez-Martinez, Norah; Aragon, Anthony D et al. (2015) High-Quality Draft Genome Sequence of Actinobacterium Kibdelosporangium sp. MJ126-NF4, Producer of Type II Polyketide Azicemicins, Using Illumina and PacBio Technologies. Genome Announc 3:
Jun, JeHoon; Cho, Yun Sung; Hu, Haejin et al. (2014) Whole genome sequence and analysis of the Marwari horse breed and its genetic origin. BMC Genomics 15 Suppl 9:S4
Yim, Hyung-Soon; Cho, Yun Sung; Guang, Xuanmin et al. (2014) Minke whale genome and aquatic adaptation in cetaceans. Nat Genet 46:88-92