The recent development and implementation of second-generation, deep sequencing technologies have provided an unprecedented opportunity to characterize genomic changes in cancer. However, the enormous data output from these sequencing platforms also presents a formidable statistical and computational challenge: separating and validating the minority of cancer-causing driver mutations from the overwhelming majority of irrelevant passenger mutations. Computational analysis plays a critically important role in making biological sense of the mountains of genomic sequencing data. In this application, I propose to continue my research work as an expert in cancer bioinformatics, developing computational tools to narrow down the candidate cancer-causing mutations involved in the development and progression of cancer. My bioinformatics work will focus on three research areas: (i) identification of genes causing familial predisposition to cancer, (ii) detection and characterization of large structural alterations in cancer, and (iii) identification of cancer genes using mouse cancer model systems. My short-term goal is to develop bioinformatics tools to identify cancer-causing mutations using genome sequencing data from human patients and mouse cancer models. My long-term goal is to characterize these driver mutations further, generating molecular targets to improve diagnosis, risk stratification and treatment of cancer. In my first aim, I propose to develop a bioinformatics pipeline to identify cancer-predisposing germline mutations from patients with a strong family history of cancer using whole-genome or whole-exome sequencing data.
This aim tests the hypothesis that cancer-predisposing mutations can be weighted and validated from enormous sequencing data sets using statistical and bioinformatics methods. In my second aim, I will develop bioinformatics algorithms to detect and characterize large structural variants in human cancer.
This aim tests the hypothesis that integrative analysis of different genome sequencing platforms can further refine and validate the full structure of complex genomic alterations. In my third aim, I will develop algorithms to identify the genes that accelerate the development of cancer in mouse cancer models.
This aim tests the hypothesis that computational algorithms and statistical approaches can identify genes that predispose animals to develop cancer and can predict their relevance to human cancer. Successful completion of this informatics research as a Research Specialist will shed new light on the molecular basis of many cancers, will contribute to active cancer research at Ohio State, and will continue significant recent progress in developing new genomics technologies and analytical methods in studies of human cancers.

Public Health Relevance

The recent advent of second-generation sequencing technologies has opened many opportunities to catalog the genomic causes underlying cancer development and progression. Cancers arise as a result of various forms of genetic changes in the DNA sequences, and the analysis of genomic mutations provides extraordinary new opportunities to understand the formation of cancers and to develop new therapeutic approaches for cancer. However, the volume and complexity of cancer genome sequencing data pose formidable computational problems. In this application, I will continue to develop and use innovative computational approaches to determine cancer-causing disease mutations from massive genome sequence data.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Study Section: Special Emphasis Panel (ZCA1-SRB-1 (A1))
Program Officer: Mariotto, Angela B
Ohio State University
Schools of Medicine
United States
Suarez-Kelly, Lorena P; Akagi, Keiko; Reeser, Julie W et al. (2018) Metaplastic breast cancer in a patient with neurofibromatosis type 1 and somatic loss of heterozygosity. Cold Spring Harb Mol Case Stud 4:
Warburton, Alix; Redmond, Catherine J; Dooley, Katharine E et al. (2018) HPV integration hijacks and multimerizes a cellular enhancer to generate a viral-cellular super-enhancer that drives high viral oncogene expression. PLoS Genet 14:e1007179
El Refaey, Mona; Xu, Li; Gao, Yandi et al. (2017) In Vivo Genome Editing Restores Dystrophin Expression and Cardiac Function in Dystrophic Mice. Circ Res 121:923-929
Starrett, Gabriel J; Marcelus, Christina; Cantalupo, Paul G et al. (2017) Merkel Cell Polyomavirus Exhibits Dominant Control of the Tumor Genome and Transcriptome in Virus-Associated Merkel Cell Carcinoma. MBio 8:
Yoshida, Junko; Akagi, Keiko; Misawa, Ryo et al. (2017) Chromatin states shape insertion profiles of the piggyBac, Tol2 and Sleeping Beauty transposons and murine leukemia virus. Sci Rep 7:43613
Ding, Xia; Ray Chaudhuri, Arnab; Callen, Elsa et al. (2016) Synthetic viability by BRCA2 and PARP1/ARTD1 deficiencies. Nat Commun 7:12425
Horie, Kyoji; Kokubu, Chikara; Yoshida, Junko et al. (2011) A homozygous mutant embryonic stem cell bank applicable for phenotype-driven genetic screening. Nat Methods 8:1071-7