We recently received whole-genome sequence reads from one Pima Indian. Sequencing was performed by the Beijing Genomics Institute. Over the past few months we wrote programs to annotate all of the variants identified in this single sample, and we catalogued every variant predicted to cause an amino acid substitution. Using NCBI reference genome build 36 and dbSNP build 129 as references, this individual had a total of 3.64 million SNPs, of which 1.57 million were homozygous and the remaining 2.07 million were heterozygous. Of these SNPs, 3.17 million were known (already reported in dbSNP) and the remaining 0.47 million were novel (not reported in dbSNP). The proportion of known SNPs was therefore 87.1% (a novel SNP rate of 12.9%), which is close to the 90% standard line. We anticipate receiving 20 additional genomes for analysis within the next year.

We have also recently received raw whole-exome sequence data on 180 Pima Indians. All of these individuals were previously studied as inpatients in our Clinical Research Center, where they were characterized for metabolic traits related to diabetes and obesity. Their exomes were sequenced by Shanghai Bio. We have completed alignment of the raw reads and calculated an average per-base coverage of 42x. All of these exomes have been annotated for SNP variation, small insertions and deletions (indels), and large mobile elements. The variants are undergoing bioinformatic analyses to predict which are likely to be damaging. Selected variants are either being validated by re-sequencing or being directly genotyped in large samples for association analyses.
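
The SNP counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and is not part of the annotation programs described here; the variable names are hypothetical. It shows that the 87.1% figure is the share of SNPs already present in dbSNP (3.17 of 3.64 million), with the novel share being the remainder.

```python
# SNP counts in millions, copied from the summary above.
# Variable names are illustrative assumptions, not taken from the original pipeline.
total_snps = 3.64
homozygous = 1.57
heterozygous = 2.07
known = 3.17   # already reported in dbSNP build 129
novel = 0.47   # not reported in dbSNP build 129

# The two breakdowns should each sum to the total (within rounding).
assert abs((homozygous + heterozygous) - total_snps) < 1e-9
assert abs((known + novel) - total_snps) < 1e-9

known_rate = known / total_snps           # share of SNPs found in dbSNP
novel_rate = novel / total_snps           # share of SNPs absent from dbSNP
het_hom_ratio = heterozygous / homozygous # heterozygous-to-homozygous ratio

print(f"known SNP rate: {known_rate:.1%}")    # 87.1%
print(f"novel SNP rate: {novel_rate:.1%}")    # 12.9%
print(f"het/hom ratio:  {het_hom_ratio:.2f}") # 1.32
```

Because the inputs are rounded to two decimal places, the derived percentages are accurate only to about one decimal place.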