Sequential activation of oncogenes and inactivation of tumor suppressor and DNA repair genes cause human cancers. These cancer genes can be activated or inactivated by mutation or by epigenetic modification, such as methylation of CpG islands and chromatin modification. My research centers on integrating biological knowledge, genome sequences, and high-throughput experiments to identify genes and genetic elements that are important for cancer development.

Bioinformatics approach to cancer epigenetics. 1) Data mining: We developed a computational method to study allele-specific gene expression by mining the EST database. Bayesian statistics were used to estimate the genotype of each SNP in the library. A significant reduction in the frequency of cDNA libraries expressing both alleles can identify SNPs that are located in imprinted genes and display allelic variation in gene expression. Among the top 1% of SNPs with the smallest p values, 4 of 194 were in known imprinted genes. We are performing a large-scale experimental validation of allelic variation and genomic imprinting and are studying the role of allelic variation and epigenetic regulation in human cancers.

2) Genetic elements: We performed a comparative genomic sequence analysis between human and mouse for 24 imprinted genes on human chromosomes 1, 6, 7, 11, 13, 14, 15, 18, 19, and 20. We used the MEME program to search for motifs within sequences conserved among the imprinted genes, and then used the MAST program to score the presence or absence of each motif in the imprinted genes and in 128 non-imprinted genes. Our analysis identified 15 motifs that were significantly enriched in the imprinted genes. We built a logistic regression model with multiple motifs as input variables, using the 24 imprinted genes and the 128 non-imprinted genes as the training set. The accuracy, sensitivity, and specificity of our model were 98%, 92%, and 99%, respectively.
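As a worked check on the three figures above, the following minimal sketch computes accuracy, sensitivity, and specificity from a confusion matrix. The counts are an assumption, not numbers given in this abstract: 22 of 24 imprinted genes and 127 of 128 non-imprinted genes classified correctly is the integer split that reproduces the rounded percentages for a training set of this size. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # all correct calls over all genes
        "sensitivity": tp / (tp + fn),    # imprinted genes correctly called
        "specificity": tn / (tn + fp),    # non-imprinted genes correctly called
    }


# Assumed counts (not from the source): 24 imprinted and 128 non-imprinted genes.
metrics = classification_metrics(tp=22, fn=2, tn=127, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Because these are training-set figures, they describe how well the model fits the genes it was built on rather than how it performs on unseen genes.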
The model was further validated by an open test on 12 additional imprinted genes. The motifs identified in this study are novel imprinting signatures, which should improve our understanding of genomic imprinting and its role in human diseases.

Genome-wide identification of tumor-associated alternative RNA splicing. We performed a genome-wide computational analysis of the EST database to identify alternative RNA splicing isoforms that are associated with human cancers. We found 26,258 alternative splicing isoforms, and analysis of ESTs and their library sources suggested that 845 of them were significantly associated with human cancers. We also found considerably more GC dinucleotides at the splicing donor sites in tumors than in normal samples. Experimental validation demonstrated that 45 of 76 alternative splicing isoforms were present in some of the tumors but not in the matched normal samples.

A computational approach for measuring coherence of gene expression in pathways. Our hypothesis is that genes in the same pathway are more likely to be coordinately regulated than genes in a randomly selected gene set. We estimated a correlation coefficient for each pair of genes in a pathway from gene expression in normal or tumor samples and identified the statistically significant coefficients. We developed a single index, the coherence indicator, defined as the number of significant pairs divided by the total number of gene pairs in the pathway, to measure the degree of coordinated gene expression in the pathway. We defined all genes that appear in the KEGG pathways as a reference gene set. Our analysis indicated that the mean coherence indicator of pathways is significantly larger than the mean coherence indicator of random gene sets drawn from the reference gene set. Thus, the result supports our hypothesis. We analyzed three data sets: two Affymetrix microarrays and one cDNA microarray.
Seven of 96 pathways had a significant coherence indicator in normal tissue, and 14 of 96 pathways had a significant coherence indicator in tumor tissue, in all three data sets. The increase in the number of pathways with significant coherence indicators may reflect the fact that tumor cells have a higher rate of metabolism than normal cells. Five pathways involved in oxidative phosphorylation, ATP synthesis, protein synthesis, or RNA synthesis were coherent in both normal and tumor tissue, demonstrating that these pathways contain essential genes whose high-level expression is required regardless of cell type.
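The coherence indicator described above (significant gene pairs divided by all gene pairs in a pathway) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the abstract does not say how significance was assessed, so the sketch substitutes a permutation test on the Pearson correlation with a 0.05 cutoff, and the gene names, expression values, and function names are all hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm, rng):
    """Two-sided permutation p value for the correlation (assumed test)."""
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def coherence_indicator(expression, alpha=0.05, n_perm=1000, seed=0):
    """Fraction of gene pairs whose correlation is significant at level alpha."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(sorted(expression), 2))
    significant = sum(
        permutation_p_value(expression[a], expression[b], n_perm, rng) < alpha
        for a, b in pairs
    )
    return significant / len(pairs)


# Hypothetical pathway: gene_a and gene_b move together, gene_c does not.
toy_pathway = {
    "gene_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "gene_b": [2, 4, 6, 8, 10, 12, 14, 16],
    "gene_c": [5, 1, 4, 8, 2, 7, 3, 6],
}
ci = coherence_indicator(toy_pathway)
print(f"coherence indicator: {ci:.2f}")
```

In this toy pathway only the gene_a/gene_b pair is significant, so the indicator is one of three pairs. Testing the hypothesis stated above would mean computing the same index for random gene sets of matching size drawn from the reference gene set and comparing the two means.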

Agency: National Institutes of Health (NIH)
Institute: Division of Cancer Epidemiology and Genetics (NCI)
Type: Intramural Research (Z01)
Project #: 1Z01CP010155-05
Application #: 7066204
Study Section: (LPG)
Support Year: 5
Fiscal Year: 2004
Name: Cancer Epidemiology and Genetics
Country: United States
Lee, Maxwell P; Dunn, Barbara K (2008) Influence of genetic inheritance on global epigenetic states and cancer risk prediction with DNA methylation signature: challenges in technology and data analysis. Nutr Rev 66 Suppl 1:S69-72
Riss, Joseph; Khanna, Chand; Koo, Seongjoon et al. (2006) Cancers as wounds that do not heal: differences and similarities between renal regeneration/repair and renal cell carcinoma. Cancer Res 66:7216-24
Hu, Nan; Wang, Chaoyu; Hu, Ying et al. (2005) Genome-wide association study in esophageal cancer using GeneChip mapping 10K array. Cancer Res 65:2542-6
Lin, Wei; Yang, Howard H; Lee, Maxwell P (2005) Allelic variation in gene expression identified through computational analysis of the dbEST database. Genomics 86:518-27
Lee, Maxwell P; Howcroft, Kevin; Kotekar, Aparna et al. (2005) ATG deserts define a novel core promoter subclass. Genome Res 15:1189-97
Lee, Maxwell P (2005) Genome-wide analysis of allele-specific gene expression using oligo microarrays. Methods Mol Biol 311:39-47
Yang, Howard H; Lee, Maxwell P (2004) Application of bioinformatics in cancer epigenetics. Ann N Y Acad Sci 1020:67-76
Wang, Zhining; Fan, Hongtao; Yang, Howard H et al. (2004) Comparative sequence analysis of imprinted genes between human and mouse to reveal imprinting signatures. Genomics 83:395-401
Yang, Howard H; Hu, Ying; Buetow, Kenneth H et al. (2004) A computational approach to measuring coherence of gene expression in pathways. Genomics 84:211-7
Wang, Zhining; Lo, H Shuen; Yang, Howard et al. (2003) Computational analysis and experimental validation of tumor-associated alternative RNA splicing in human cancer. Cancer Res 63:655-7
