Purpose or scope: A role for somatic mutations in carcinogenesis and genetic disease is well accepted, but the degree to which mutation rates influence cancer initiation and development remains under debate. Recently accumulated genomic data have revealed that thousands of tumor samples are riddled with hypermutation, broadening support for the view that many cancers acquire a mutator phenotype. This major expansion of cancer mutation datasets has provided unprecedented statistical power for the analysis of mutation spectra, which has confirmed several classical sources of mutation in cancer, highlighted new prominent mutation sources, and empowered the search for cancer drivers. In our work, we used mechanistic knowledge obtained through our experiments with yeast models to interrogate large whole-genome datasets of cancer mutations, with the aim of gaining mechanistic insight into the impact of mutations on cancer and genetic disease.

Research subject: The optimal levels of genome instability needed to sustain the fitness of an organism are maintained by a complex set of DNA metabolic functions and pathways. Understanding the interplay between the biological mechanisms that maintain a stable genome and the environmental factors that promote genome instability is important for improving policies pertaining to the impact of the environment on human health. My long-term interest is in understanding the physiological mechanisms and environmental causes of extreme levels of genome instability that can give rise to disease and may alter the life span of organisms. During the reviewed period, my group and I addressed these questions by combining the following general approaches:
(i) Gaining new mechanistic information through research in yeast models using reporter-based and whole-genome sequencing approaches. This approach elucidates mechanisms of genome instability and defines their specific features.
(ii) Using mechanistic knowledge acquired from small-genome studies to design analyses of publicly available large datasets of genome changes in human cancers. Knowledge acquired from mechanistic research in yeast allows us to build stringent statistical hypotheses, thereby increasing statistical power in the bioinformatic interrogation of exponentially growing cancer genomics datasets such as those of The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC).
(iii) Assessing the load and signatures of somatic genome changes in humans. The analytical pipeline and the information about mutation signatures generated through interrogation of cancer genomics datasets are applied to whole-genome sequencing analyses of cells isolated from healthy individuals.
The combination of approaches (i) and (iii) provides additional research opportunities: new knowledge generated through bioinformatic analysis of large public datasets and through sequencing the genomes of human subjects can be used to develop the next level of mechanistic hypotheses testable in small-genome systems.

Accomplishments: More than half of the mutations in tumors are anticipated to have arisen in healthy cells. Accumulation of somatic mutations over the lifetime of an individual can be facilitated by genetic factors, such as impaired DNA repair pathways, and by exogenous DNA-damaging agents. However, the impacts of environmental exposures and individual repair capacities on mutation loads in healthy people are still largely unknown, which precludes establishing the normal and pathological levels of somatic mutation load in humans. Previously, we demonstrated that mutation loads and spectra in the genomes of single skin fibroblast-derived clonal lineages from two healthy individuals resemble those of cancers.
We showed that, while all samples carry the mutation signature associated with aging (C→T changes at CpG dinucleotides), cells from sun-exposed body sites carry a higher mutation burden, with a signature of UV mutagenesis, than cells from unexposed sites. We will further sequence genomes from single lymphocyte- and melanocyte-derived clones and from single hair follicles of the same donors, thus determining how penetration of UV radiation and/or other mutagenic factors affects mutagenesis across the different strata of skin and across diverse cell types in the body. Somatic mutation load can also be used as a measure of the ability of cells to repair lesions. As such, we hypothesize that individuals with potentially deleterious polymorphisms in DNA repair genes would have higher mutation loads than carriers of functional alleles. Recently, it was shown that tumors with an impaired MBD4 glycosylase carry more C→T changes at CpG motifs. MBD4 is a mismatch-specific glycosylase that corrects T:G mismatches formed upon deamination of methylated cytosines. To identify individuals with common and rare deleterious alleles of this gene, its exons are amplified with asymmetric barcodes from >3000 individuals via the NIEHS Environmental Polymorphisms Registry and then sequenced by Pacific Biosciences single-molecule real-time sequencing. Sequencing single cell-derived clones from these donors will provide the range of mutation loads and the mutation signatures attributable to defects in this repair pathway.
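
As an illustration of how the aging-associated signature described above (C→T changes at CpG dinucleotides) can be scored in a list of somatic mutations, a minimal sketch follows. It is not the analytical pipeline used in this project. The input format (reference base, alternate base, and reference-strand trinucleotide context centered on the mutated base) and all function names are assumptions made for the example.

```python
from typing import Iterable, Tuple

# Hypothetical input record: (ref base, alt base, trinucleotide context on the
# reference strand, centered on the mutated base). This format is an assumption.
Mutation = Tuple[str, str, str]

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_c_to_t_at_cpg(ref: str, alt: str, context: str) -> bool:
    """True if the mutation is a C->T change at a CpG dinucleotide.

    Mutations reported with a purine reference base (G or A) are first
    re-expressed on the opposite strand, so that G->A in a CpG context
    is counted as the same event as C->T.
    """
    ref, alt, context = ref.upper(), alt.upper(), context.upper()
    if len(context) != 3 or context[1] != ref:
        raise ValueError("context must be 3 bases centered on the reference base")
    if ref in ("G", "A"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
        context = reverse_complement(context)
    # After strand normalization, a CpG site has C in the middle and G on the 3' side.
    return ref == "C" and alt == "T" and context[2] == "G"


def cpg_transition_fraction(mutations: Iterable[Mutation]) -> float:
    """Fraction of all listed mutations that are C->T changes at CpG sites."""
    mutations = list(mutations)
    if not mutations:
        return 0.0
    hits = sum(is_c_to_t_at_cpg(ref, alt, ctx) for ref, alt, ctx in mutations)
    return hits / len(mutations)
```

Comparing this fraction, together with the total mutation count, between clones from different body sites or donors is one simple way to contrast mutation load and signature contribution. A real analysis would also need to normalize for how often each sequence context occurs in the genome.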