Sequential activation of oncogenes and inactivation of tumor suppressor and DNA repair genes cause human cancers. The activation and inactivation of these cancer genes can result from mutations or from epigenetic modifications such as methylation of CpG islands and chromatin modification. My research centers on integrating biological knowledge, genome sequences, and high-throughput experiments to identify genes and genetic elements that are important for cancer development.

Bioinformatics approach to cancer epigenetics

1) Data mining: We developed a computational method to study allele-specific gene expression by mining the EST database. Bayesian statistics were used to estimate the genotype of each SNP in each cDNA library. A significant reduction in the frequency of cDNA libraries expressing both alleles can identify SNPs located in imprinted genes. Among the top 1% of SNPs with the smallest p values, 4 of 194 were in known imprinted genes. We are experimentally determining whether any of the remaining 190 candidate SNPs are located in novel imprinted genes.

2) Genetic elements: We performed comparative genome sequence analysis for each of the 46 imprinted genes to identify the regions conserved between human and mouse. We searched for motifs in the conserved non-coding sequences that can serve as genetic regulatory elements for imprinting. We also identified the transcription factor binding sites for each of the imprinted genes. We found that five motifs and two transcription factor binding sites, AP1 and SOX5, occurred more frequently in imprinted gene loci than in the rest of the genome.

Genome-wide identification of tumor-associated alternative RNA splicing

We performed a genome-wide computational analysis of the EST database to identify alternative RNA splicing isoforms that are associated with human cancers.
We found 26,258 alternative splicing isoforms, and analysis of the ESTs and their library sources suggested that 845 of these isoforms were significantly associated with human cancers. We also found considerably more GC dinucleotides at the splice donor sites in tumors than in normal samples. Experimental validation demonstrated that 45 of 76 alternative splicing isoforms were present in some of the tumors but not in the matched normal samples.
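
The summary does not say which statistical test was used to call an isoform tumor-associated from EST library sources. As a minimal sketch, assuming a one-sided Fisher's exact test on a 2x2 table of EST counts (tumor vs. normal libraries, isoform of interest vs. other isoforms of the same gene), the idea could look like the following. The function name and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Assumed layout (illustrative, not from the original study):
        a = tumor-library ESTs supporting the isoform of interest
        b = tumor-library ESTs supporting other isoforms of the gene
        c = normal-library ESTs supporting the isoform of interest
        d = normal-library ESTs supporting other isoforms of the gene

    Returns the hypergeometric probability of seeing a or more
    tumor ESTs for the isoform when all margins are held fixed.
    """
    tumor_total = a + b        # all ESTs from tumor libraries
    isoform_total = a + c      # all ESTs supporting the isoform
    n = a + b + c + d          # all ESTs for the gene
    upper = min(tumor_total, isoform_total)
    numerator = sum(
        comb(tumor_total, k) * comb(n - tumor_total, isoform_total - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n, isoform_total)


# Made-up counts: 12 of 15 tumor ESTs but only 2 of 20 normal ESTs
# support the isoform, so the p value should be small.
p_value = fisher_exact_greater(12, 3, 2, 18)
print(f"one-sided p = {p_value:.3g}")
```

In a genome-wide screen over tens of thousands of isoforms, p values of this kind would normally be corrected for multiple testing before any isoform is reported as significant.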