The focus of this proposal is to develop innovative library preparation tools and techniques to enable haplotype-resolved whole human genome sequencing. Recent advances in Next-Gen sequencing technology, along with the development of robust analysis methods, have given researchers the ability to identify sequence variants. However, the ultimate goal of relating sequence variants to human diseases remains difficult and is likely not achievable except for very simple single-gene diseases. Furthermore, even for single-gene diseases, when large patient cohorts are studied, typically fewer than 50% of the cases can be linked directly to a genetic variant. It is my hypothesis that improved sequencing methods are needed to elucidate the complex genotype/phenotype relation. Namely, new methods are needed to understand the long-range sequence contiguity of the human genome and to resolve the phase of sequence variants. I also hypothesize that long-range genome interactions play a critical role in many diseases, and that the cis/trans relation between sequence variants is essential for understanding the genetic basis of disease. However, all high-throughput (Next-Gen) sequencing technologies today generate very short reads, and these short reads are insufficient for phasing sequence variants. Typically, these sequencing technologies produce results that are limited to finding polymorphisms, and the importance of haplotypes (the cis/trans phasing of variants) has been largely neglected. To truly understand the genetic makeup of a specific disease, methods are needed to identify the specific chromosome on which each polymorphism resides, and this is the focus of this proposal.

Public Health Relevance

Recent advances in Next-Gen sequencing technology, along with the development of robust analysis methods, have given researchers the ability to identify sequence variants. However, the ultimate goal of relating sequence variants to human diseases remains difficult and is likely not achievable except for very simple single-gene diseases. This proposal focuses on the development of improved methods upstream of the Next-Gen sequencer that are needed to maximize the value of genome sequencing data. I believe our data will further allow biomedical researchers to decipher the elusive genotype/phenotype relation.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006876-03
Application #: 8894056
Study Section: Special Emphasis Panel (ZHG1-HGR-N (M1))
Program Officer: Schloss, Jeffery
Project Start: 2013-09-01
Project End: 2016-07-31
Budget Start: 2015-08-01
Budget End: 2016-07-31
Support Year: 3
Fiscal Year: 2015
Total Cost: $438,749
Indirect Cost: $148,186
Name: University of New Mexico
Department: Chemistry
Type: Schools of Arts and Sciences
DUNS #: 868853094
City: Albuquerque
State: NM
Country: United States
Zip Code: 87106
Feng, Kuan; Costa, Justin; Edwards, Jeremy S (2018) Next-generation sequencing library construction on a surface. BMC Genomics 19:416
Kim, Jungeun; Weber, Jessica A; Jho, Sungwoong et al. (2018) KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses. Sci Rep 8:5677
Ogasawara, Yasushi; Torrez-Martinez, Norah; Aragon, Anthony D et al. (2015) High-Quality Draft Genome Sequence of Actinobacterium Kibdelosporangium sp. MJ126-NF4, Producer of Type II Polyketide Azicemicins, Using Illumina and PacBio Technologies. Genome Announc 3:
Jun, JeHoon; Cho, Yun Sung; Hu, Haejin et al. (2014) Whole genome sequence and analysis of the Marwari horse breed and its genetic origin. BMC Genomics 15 Suppl 9:S4
Yim, Hyung-Soon; Cho, Yun Sung; Guang, Xuanmin et al. (2014) Minke whale genome and aquatic adaptation in cetaceans. Nat Genet 46:88-92