Metabolomics is one of the major areas of high-throughput biology. Metabolomic profiling by liquid chromatography-mass spectrometry (LC/MS) measures thousands of metabolites simultaneously. LC/MS metabolomic profiling data pose unique challenges owing to several characteristics, including the intrinsic uncertainty in matching features to known metabolites, the mixing of true zeroes and missing values, and distinct data distribution and dependency patterns that hamper integrative analysis with other types of high-dimensional data. In this study, we plan to tackle these problems by developing Bayesian hierarchical models for network marker selection that incorporate matching uncertainties, a regression framework for integrative analysis of multipartite omics networks, and a novel modeling strategy to address the unique challenge of missing values in the metabolic network. We will apply the newly developed methods to large-scale, high-impact metabolomics and transcriptomics data to derive new biological insights, and provide easy-to-use software for the community.
Bayesian network biomarker selection in metabolomics data
Narrative: Metabolomic profiling data pose unique challenges that have not been addressed so far. In this study, we plan to tackle these problems by developing new Bayesian hierarchical models to select network biomarkers, a new framework to integrate metabolomics data with other types of high-dimensional data, and a novel strategy to address the unique challenge of missing values in the metabolic networks.
Fei, Teng; Zhang, Tengjiao; Shi, Weiyang et al. (2018) Mitigating the adverse impact of batch effects in sample pattern detection. Bioinformatics 34:2634-2641
Yu, Tianwei (2018) Nonlinear variable selection with continuous outcome: a fully nonparametric incremental forward stagewise approach. Stat Anal Data Min 11:188-197
Jin, Zhuxuan; Kang, Jian; Yu, Tianwei (2018) Missing value imputation for LC-MS metabolomics data by incorporating metabolic network and adduct ion relations. Bioinformatics 34:1555-1561
Yu, Tianwei (2018) A new dynamic correlation algorithm reveals novel functional aspects in single cell and bulk RNA-seq data. PLoS Comput Biol 14:e1006391
Kong, Yunchuan; Yu, Tianwei (2018) A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data. Bioinformatics 34:3727-3737