The recent development and implementation of second-generation, deep sequencing technologies have provided an unprecedented opportunity to characterize genomic changes in cancer. However, the enormous data output from these sequencing platforms also presents a formidable statistical and computational challenge: separating and validating the minority of cancer-causing driver mutations from the overwhelming majority of bystander passenger mutations. Computational analysis plays a critically important role in making biological sense of the mountains of genomic sequencing data. In this application, I propose to continue my research work as an expert in cancer bioinformatics, developing computational tools to narrow down the candidate cancer-causing mutations involved in the development and progression of cancer. My bioinformatics work will focus on three research areas: (i) identification of genes causing familial predisposition to cancer, (ii) detection and characterization of large structural alterations in cancer, and (iii) identification of cancer genes using mouse cancer model systems. My short-term goal is to develop bioinformatics tools to identify cancer-causing mutations using genome sequencing data from human patients and mouse cancer models. My long-term goal is to characterize these driver mutations further, generating molecular targets to improve diagnosis, risk stratification, and treatment of cancer. In my first aim, I propose to develop a bioinformatics pipeline to identify cancer-predisposing germline mutations from patients with a strong family history of cancer using whole-genome or whole-exome sequencing data.
This aim tests the hypothesis that cancer-predisposing mutations can be weighted and validated from very large sequencing data sets using statistical and bioinformatics methods. In my second aim, I will develop bioinformatics algorithms to detect and characterize large structural variants in human cancer.
This aim tests the hypothesis that integrative analysis of data from different genome sequencing platforms can refine and validate the full structure of complex genomic alterations. In my third aim, I will develop algorithms to identify the genes that accelerate the development of cancer in mouse cancer models.
This aim tests the hypothesis that computational algorithms and statistical approaches can identify genes that predispose animals to develop cancer and can predict their relevance to human cancer. Successful completion of this informatics research as a Research Specialist will shed new light on the molecular basis of many cancers, contribute to active cancer research at Ohio State, and continue significant recent progress in developing new genomics technologies and analytical methods for studies of human cancers.
Relevance: The recent advent of second-generation sequencing technologies has opened many opportunities to catalog the genomic causes underlying cancer development and progression. Cancers arise as a result of various forms of genetic change in DNA sequence, and the analysis of genomic mutations provides extraordinary new opportunities to understand how cancers form and to develop new therapeutic approaches. However, the volume and complexity of cancer genome sequencing data pose formidable computational problems. In this application, I will continue to develop and use innovative computational approaches to identify cancer-causing mutations in massive genome sequence data.