Evidence is accumulating that different combinations of histone modifications confer different functional specificities. Identifying histone modification patterns and linking them to functional elements of the genome is therefore of great interest in epigenetics. High-throughput experimental techniques such as ChIP-chip and ChIP-Seq have produced a wealth of histone modification data. However, current experimental and computational methods have explored these data only to a very limited extent. The long-term objective of this project is to develop novel statistical methods for sparse structure identification from histone modification data. Imposing sparsity is well suited to handling extremely high-dimensional data with noisy information and small sample sizes.
Four specific aims are proposed: (1) identification of new functional sites on the genome; (2) accurate discrimination between different regulatory elements; (3) identification of interactions between histone modifications in regulation; and (4) uncovering the DNA motifs predictive of the chromatin signature. Novel sparse statistical methods will be developed to achieve these aims, including a high-dimensional clustering method combined with variable selection, a classification method featuring dimension reduction based on sparse covariance estimation, a joint estimation method for graphical models of multiple functional elements, and a multi-response, multi-predictor regression method. This project will be conducted as a collaboration between two statisticians and a biochemist. The proposed methods will be validated on, and applied to, both published datasets and those provided by the epigenome roadmap project, in which one of the PIs is involved.
Epigenetic modifications such as histone modifications play critical roles in regulating gene expression, and aberrant epigenetic modifications have been observed in many diseases. A statistically rigorous characterization and understanding of such modifications can greatly facilitate the development of new therapeutics.
Guo, Jian; Cheng, Jie; Levina, Elizaveta et al. (2015) Estimating heterogeneous graphical models for discrete data with an application to roll call voting. Ann Appl Stat 9:821-848
Zou, Na; Baydogan, Mustafa; Zhu, Yun et al. (2015) A Transfer Learning Approach for Predictive Modeling of Degenerate Biological Systems. Technometrics 57:362-373
Whitaker, John W; Nguyen, Tung T; Zhu, Yun et al. (2015) Computational schemes for the prediction and annotation of enhancers from epigenomic assays. Methods 72:86-94
Li, Yun; Zhu, Ji; Wang, Naisyin (2015) Regularized Semiparametric Estimation for Ordinary Differential Equations. Technometrics 57:341-350
Xu, Peirong; Zhu, Ji; Zhu, Lixing et al. (2015) Covariance-enhanced discriminant analysis. Biometrika 102:33-45
Guo, Jian; Levina, Elizaveta; Michailidis, George et al. (2015) Graphical Models for Ordinal Data. J Comput Graph Stat 24:183-204
Mukherjee, A; Chen, K; Wang, N et al. (2015) On the degrees of freedom of reduced-rank estimators in multivariate regression. Biometrika 102:457-477
Cheng, Jie; Levina, Elizaveta; Wang, Pei et al. (2014) A sparse Ising model with covariates. Biometrics 70:943-53
Won, Kyoung-Jae; Zhang, Xian; Wang, Tao et al. (2013) Comparative annotation of functional regions in the human genome using epigenomic data. Nucleic Acids Res 41:4423-32
Huang, Shuai; Li, Jing; Ye, Jieping et al. (2013) A sparse structure learning algorithm for Gaussian Bayesian Network identification from high-dimensional data. IEEE Trans Pattern Anal Mach Intell 35:1328-42
Showing the most recent 10 out of 21 publications