The University of California, San Francisco is awarded a grant to develop rigorous mathematical methods for analyzing recent high-throughput nucleosome mapping data in yeasts and to provide novel computational tools for studying the relations among DNA sequence, nucleosome stability, and nucleosome positioning. Eukaryotic DNA has a complicated three-dimensional structure called chromatin, consisting of millions of nucleosomes, the protein-DNA complexes that contain 146 base pairs of DNA wrapped around eight histones. Nucleosomes play a critical role in regulating gene expression by controlling the accessibility of DNA and modulating transcription factor binding activities. Nucleosomes are thus of paramount importance, but rigorous computational tools for studying the biological signals that regulate their positioning and stability do not currently exist. This research will develop spectral decomposition methods for analyzing nucleosomal and linker DNA, including their physical properties derived from molecular measurements of DNA flexibility. Dr. Song will apply a statistical theory for analyzing the maximal frequency spectrum of categorical time series in order to answer a long-standing question of whether certain periodic DNA properties preferentially exist in individual nucleosomes. Furthermore, wrapping DNA around histones introduces superhelical stress, which we show is often countered by increased sequence-dependent stabilization of DNA. The developed computational tools will help unravel new connections between DNA bendability and stabilization energy and thus facilitate the discovery of novel nucleosome positioning signals at important regulatory sites. We currently do not understand how chromatin structure is faithfully inherited upon cell division, and the genetic information contained in DNA sequences may significantly influence the process of epigenetic inheritance.
A better understanding of the mathematical and physical properties of DNA will enhance our knowledge of chromatin structure.
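As a rough illustration of what a frequency-domain analysis of a categorical sequence can look like, the sketch below treats a DNA string as four 0/1 indicator series (one per base), computes a periodogram for each, and sums them. This is a simplified stand-in chosen for illustration, not the statistical theory the project proposes; the function names `indicator_periodogram` and `dominant_period` and the toy input sequence are assumptions introduced here.

```python
import cmath
import math


def indicator_periodogram(seq, alphabet="ACGT"):
    """Summed periodogram of the 0/1 indicator series of each base.

    Returns a dict mapping the Fourier index k (frequency k/n) to power,
    for k = 1 .. n//2. Frequency 0 is skipped because it only reflects
    base composition, not periodicity.
    """
    n = len(seq)
    spectrum = {}
    for k in range(1, n // 2 + 1):
        total = 0.0
        for base in alphabet:
            # Discrete Fourier coefficient of the indicator series for this base.
            coeff = sum(
                cmath.exp(-2j * math.pi * k * t / n)
                for t, symbol in enumerate(seq)
                if symbol == base
            )
            total += abs(coeff) ** 2 / n
        spectrum[k] = total
    return spectrum


def dominant_period(seq):
    """Period (in bases) at which the summed periodogram is largest."""
    spectrum = indicator_periodogram(seq)
    k_max = max(spectrum, key=spectrum.get)
    return len(seq) / k_max


# Toy example (hypothetical data): a 140-base sequence built from a 10-base repeat.
toy = "AAAAATTTTT" * 14
print(dominant_period(toy))
```

On real nucleosomal sequences the spectrum would be noisy, and deciding whether a peak is statistically significant is exactly the kind of question that calls for a formal theory of the maximal spectrum rather than a visual inspection of the peak.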
This project is uniquely positioned to integrate Dr. Song's research in computational epigenomics with innovative educational programs that will significantly advance the training of biology students in mathematics and statistics at UCSF. The project will help improve the infrastructure for education and research at UCSF by synergizing different graduate groups at UCSF. The project will support vertical integration of research and education, whereby the proposed research will constitute an important theme for teaching epigenetics to quantitative scientists and for teaching mathematics to students at UCSF. The applicant will continue to teach a new statistics course that he recently designed in the Biological and Medical Informatics graduate program, drawing examples from genomics. Topical mini-courses on high-throughput sequencing technology and systems biology will also be taught in order to disseminate the applicant's current research activities and to help prepare students and postdocs to conduct their own research in genomics. Undergraduate students from other institutions will also have opportunities to participate in our research through the UC Berkeley Summer Internship Program. Dr. Song will involve local high school students through the Science and Health Education Partnership (SEP) at UCSF. SEP invites promising seniors from San Francisco's public high schools and places them under the guidance of UCSF investigators for 8-week-long summer research projects. The project will exemplify the rich possibilities of applying mathematics and statistics to solving biologically important questions. Computational tools will be implemented as open-source software that will be accessible to other researchers studying chromatin structure. Further information about the project may be obtained from the PI's lab website at http://songlab.ucsf.edu/Welcome.html.