Evidence is accumulating that different combinations of histone modifications confer different functional specificities. Identifying histone modification patterns and linking them to functional elements of the genome is therefore of great interest in epigenetics. High-throughput experimental techniques such as ChIP-chip and ChIP-Seq have produced a wealth of histone modification data. However, current experimental and computational methods have explored these data only to a very limited extent. The long-term objective of this project is to develop novel statistical methods for sparse structure identification from histone modification data. Imposing sparsity is well suited to extremely high-dimensional data with noisy information and small sample sizes.
Four specific aims are proposed: (1) identification of new functional sites on the genome; (2) accurate discrimination between different regulatory elements; (3) identification of the interactions between histone modifications in regulation; and (4) uncovering the DNA motifs predictive of the chromatin signature. Novel sparse statistical methods will be developed to achieve these aims, including a high-dimensional clustering method combined with variable selection, a classification method featuring dimension reduction based on sparse covariance estimation, a joint estimation of graphical models for multiple functional elements, and a multi-response multi-predictor regression method. The project will be conducted as a collaboration between two statisticians and a biochemist. The proposed methods will be validated on, and applied to, both published datasets and those provided by the epigenome roadmap project in which one of the PIs is involved.
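
To make the idea of sparse covariance estimation concrete, the following is a minimal sketch of one generic sparsity-inducing technique: soft-thresholding the off-diagonal entries of a covariance matrix. It is an illustration only and is not the method proposed in this project. The function name `soft_threshold`, the penalty `lam`, and the 3 x 3 example matrix (standing in for covariances among three hypothetical histone marks) are all assumptions introduced here.

```python
import math


def soft_threshold(cov, lam):
    """Shrink each off-diagonal entry of cov toward zero by lam.

    Entries whose absolute value is at most lam become exactly zero,
    which is what produces a sparse estimate. Diagonal entries
    (the variances) are left unchanged.
    """
    p = len(cov)
    out = [row[:] for row in cov]  # copy so the input is not modified
    for j in range(p):
        for k in range(p):
            if j == k:
                continue
            magnitude = max(abs(cov[j][k]) - lam, 0.0)
            out[j][k] = math.copysign(magnitude, cov[j][k]) if magnitude > 0 else 0.0
    return out


# Hypothetical covariance among three marks: marks 0 and 1 co-vary
# strongly, while mark 2 is only weakly (noisily) related to the others.
example = [
    [1.0, 0.5, -0.125],
    [0.5, 1.0, 0.0625],
    [-0.125, 0.0625, 1.0],
]

sparse = soft_threshold(example, lam=0.25)
for row in sparse:
    print(row)
```

In this sketch the weak entries are set to zero and the strong entry is shrunk but retained, so only the dominant dependence survives. Choosing `lam` is a tuning decision that this example does not address.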

Public Health Relevance

Epigenetic modifications such as histone modifications play critical roles in regulating gene expression, and aberrant epigenetic modifications have been observed in many diseases. A statistically rigorous characterization and understanding of such modifications can greatly facilitate the development of new therapeutics.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM096194-03
Application #: 8326620
Study Section: Special Emphasis Panel (ZGM1-CBCB-5 (BM))
Program Officer: Brazhnik, Paul
Project Start: 2010-09-01
Project End: 2014-08-31
Budget Start: 2012-09-01
Budget End: 2013-08-31
Support Year: 3
Fiscal Year: 2012
Total Cost: $250,215
Indirect Cost: $26,597

Name: University of Michigan Ann Arbor
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 073133571
City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109

Won, Kyoung-Jae; Zhang, Xian; Wang, Tao et al. (2013) Comparative annotation of functional regions in the human genome using epigenomic data. Nucleic Acids Res 41:4423-32
Huang, Shuai; Li, Jing; Ye, Jieping et al. (2013) A sparse structure learning algorithm for Gaussian Bayesian Network identification from high-dimensional data. IEEE Trans Pattern Anal Mach Intell 35:1328-42