We aim to provide a comprehensive foundation for developing a novel, cost-effective experimental pipeline for high-throughput sequencing of personal haploid genomes. Our approach integrates two novel high-throughput, chromosome-range haplotyping methods developed independently by the PI's laboratory (phasing through single-chromosome isolation) and by our collaborator Dr. Shendure's group (phasing through fosmid library sequencing). At present, single-chromosome haplotyping suffers from severe (55-70%) locus dropout, and fosmid haplotyping suffers from switch errors when large contigs are assembled into chromosomal haplotypes. Neither method alone can therefore be applied directly to generate phased genome sequences for an individual. When integrated, however, each can compensate for the other's limitations. The estimated cost of the prototype pipeline is $7,000 for each unphased human genome (12 days, 4 samples in parallel) and $14,000 for each pair of phased haploid human genomes (24 days, 8 samples in parallel). In this project, we will develop, test, and optimize the experimental pipeline for high accuracy, high coverage, high throughput, and lower cost. The haploid structure of homologous chromosomes is fundamental genetic information that is essential for the functional interpretation of genomes and the discovery of disease-causing mutations. However, this information is currently missing from genetic and genomic studies because existing high-throughput technologies cannot observe it experimentally. We anticipate that the proposed project will begin to fill this void by demonstrating the feasibility of a novel high-throughput, long-range molecular haplotyping approach amenable to exploiting next-generation sequencing capacity.
This project will develop a new technology that is urgently needed for discovering disease-causing genetic variations and, potentially, novel preventive and therapeutic strategies to improve medical care in the coming era of personalized medicine.