Human genomics is now, more than ever, poised to ask and answer challenging questions about the relationship between the inherent variability observed in populations and human diseases such as cancer. Whether the question concerns susceptibility, sensitivity to drugs, or differences in survival, common genetic variation has emerged as a key component of a comprehensive understanding of cancer. Two major trends in the last five years have dramatically increased the opportunities for investigating the contribution of common genetic variants to complex diseases. First, advances in the technological platforms for genotyping the most common type of variant in the genome, the single nucleotide polymorphism (SNP), together with bioinformatic tools for manipulating large data sets of genetic information, enable researchers to look comprehensively across genes or the entire genome. Second, the public database of common SNPs (e.g., minor allele frequency >5%) is now comprehensive, providing a SNP every 500-800 bp, mainly due to the International HapMap Project. It is estimated that there are at least 10 million common SNPs in the human genome, but only a small fraction have functional consequences. Resequencing programs, such as SNP500Cancer and SeattleSNPs, have shown that rarer SNPs are distributed throughout the genome and that common SNPs are not evenly spaced. Neighboring SNPs, if not accounted for, can undermine the accuracy of genotype assays. By definition, a complex disease is not explained by variation in one or two genes but instead by many genes, each of which contributes in a small way, often in combination with environmental exposures. The sample size required to detect the small to modest increase in risk due to any given variant is large (e.g., hundreds or thousands of subjects). Further, to validate genetic determinants with a relative risk of less than two, a succession of population-based studies is required to definitively replicate key findings for any given marker.
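The sample-size point above can be made concrete with a rough calculation. The sketch below is illustrative only and is not the laboratory's own method: it assumes a simple case-control comparison of allele frequencies, uses the odds ratio as a stand-in for the relative risk, applies the standard normal-approximation formula for comparing two proportions, and ignores multiple testing. The function names and default values (two-sided alpha of 0.05, 80% power) are assumptions chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_freq(p0, odds_ratio):
    """Allele frequency in cases implied by control frequency p0 and a per-allele odds ratio."""
    return odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))


def alleles_per_group(p0, odds_ratio, alpha=0.05, power=0.80):
    """Approximate number of alleles (2 per person) needed in each of two equal groups.

    Uses the normal-approximation formula for a two-sided test of two proportions.
    """
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: a SNP with 20% allele frequency in controls.
    for odds_ratio in (1.2, 1.5, 2.0):
        n_alleles = alleles_per_group(0.20, odds_ratio)
        print(f"OR={odds_ratio}: ~{n_alleles} alleles (~{ceil(n_alleles / 2)} people) per group")
```

Under these assumptions, an odds ratio near 1.2 calls for a sample in the low thousands of alleles per group, while an odds ratio of 2 needs far fewer. A genome-wide significance threshold would raise these numbers considerably.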
In the past, we used the direct approach, choosing SNPs because of a priori knowledge of their biological function. However, the exponential increase in annotation of common variants has generated a large catalogue of variants, and for the vast majority of them the function is unknown. In place of the candidate functional variant, the indirect approach has emerged, which uses the ancestral relationship between SNPs to efficiently monitor untested variants in a locus. Linkage disequilibrium (LD), the non-random association between genetic markers on the same chromosome, can be exploited to detect association on the premise that a genotyped marker may be correlated with the causal variant. This kind of 'surrogacy' enables one to find the relevant region or part of a gene without knowing the exact function a priori. The indirect approach assumes that the human genome is broadly organized into units, defined as linear blocks or interdigitating bins.
The goal of the laboratory is to investigate the genetic basis of cancer and its outcomes. The major focus of the laboratory is on annotating and applying common genetic variation in candidate genes in key pathways in innate immunity and cancer biology, such as telomere stability. We have incorporated current approaches to identify and validate the most common form of germ-line genetic variation, the single nucleotide polymorphism (SNP). We have transitioned from the analysis of functional variants by the direct approach, which was limited by our knowledge of function, to a more efficient indirect approach. The latter seeks to comprehensively analyze common variants across a locus in an effort to identify genetic markers of disease. Our strategies include using haplotype-tagging SNPs and the greedy algorithm for selecting tag SNPs in well-conducted molecular epidemiology studies in cancer.
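The idea of greedy tag-SNP selection mentioned above can be sketched as follows. This is a minimal illustration, not the laboratory's pipeline. It assumes unphased genotypes coded as minor-allele counts (0, 1, 2), approximates pairwise LD by the squared Pearson correlation of those counts, and uses a threshold of r² ≥ 0.8. The SNP names, genotype values, and function names are all hypothetical.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic SNP: r^2 is undefined, treat as 0
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Return {tag SNP: SNPs in its bin}, choosing tags greedily.

    genotypes maps a SNP name to a list of minor-allele counts, one per individual.
    """
    snps = list(genotypes)
    # captures[s] holds every SNP (including s itself) in LD with s at the threshold.
    captures = {s: {s} for s in snps}
    for a, b in combinations(snps, 2):
        if r_squared(genotypes[a], genotypes[b]) >= threshold:
            captures[a].add(b)
            captures[b].add(a)

    untagged = set(snps)
    bins = {}
    while untagged:
        # Pick the untagged SNP that captures the most still-untagged SNPs.
        best = max(sorted(untagged), key=lambda s: len(captures[s] & untagged))
        bins[best] = sorted(captures[best] & untagged)
        untagged -= captures[best]
    return bins


# Hypothetical genotypes for eight individuals at four SNPs.
example_genotypes = {
    "rs1": [0, 1, 2, 0, 1, 2, 0, 1],
    "rs2": [0, 1, 2, 0, 1, 2, 0, 1],  # identical to rs1
    "rs3": [0, 1, 2, 0, 1, 2, 0, 2],  # differs from rs1 in one individual
    "rs4": [2, 0, 1, 1, 0, 2, 1, 0],  # essentially uncorrelated with the others
}
example_bins = greedy_tag_snps(example_genotypes)

if __name__ == "__main__":
    for tag, members in example_bins.items():
        print(tag, "tags", members)
```

In this toy data set, genotyping one SNP from the correlated group plus the uncorrelated SNP is enough to monitor all four, which is the economy the indirect approach relies on. Real analyses typically work from phased haplotypes or reference panels and handle missing data, which this sketch does not.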
Annotation of common germ-line variation in the human genome has substantially accelerated the investigation of the genetics of cancer etiology and cancer outcomes.