New Machine Learning Methods for Biomedical Data

Shen, Xiaotong

Abstract

In the past few years, we have witnessed a dramatic increase in the amount of data available to biomedical research. One example is the recent advance of high-throughput biotechnologies, which makes genome-wide gene expression data accessible. To address biomedical issues at the molecular level, extracting the relevant information from massive data with complex structure is essential. This calls for advanced mechanisms for statistical prediction and inference, especially in genomic discovery and prediction, where the statistical uncertainty involved in the discovery process is high. The proposed research focuses on the development of mixture model-based and large margin approaches in semisupervised and unsupervised learning, motivated by biomedical studies in gene discovery and prediction. In particular, we propose to investigate how to improve the accuracy and efficiency of mixture model-based and large margin learning systems in generalization. In addition, we will develop innovative methods that take the structure of sparseness and the grouping effect into account to battle the curse of dimensionality, and blend them with the new learning tools. A number of technical issues will be investigated, including: a) developing model selection criteria and performing automatic feature selection, especially when the number of features greatly exceeds the number of samples; b) developing large margin approaches for multi-class learning, with most effort directed towards sparse as well as structured learning; c) implementing efficient computation for real-time applications; d) analyzing two biological datasets for i) gene function discovery and prediction for E. coli, and ii) new class discovery and prediction for BOEC samples; and e) developing public-domain software. Furthermore, computational strategies will be explored based on global optimization techniques, particularly convex programming and difference convex programming.
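
The abstract names difference convex programming as a computational strategy. As a minimal sketch of what such a decomposition can look like, the Python below writes a non-convex sparsity penalty as the difference of two convex functions. The truncated L1 penalty, the threshold name tau, and all function names are assumptions chosen for illustration; the abstract does not specify which penalties or decompositions the project uses.

```python
# Illustrative sketch only: the penalty and all names here are assumptions,
# not taken from the grant abstract.

def truncated_l1(b, tau):
    """Non-convex truncated L1 penalty: min(|b|, tau)."""
    return min(abs(b), tau)


def convex_part(b):
    """First convex function in the decomposition: |b|."""
    return abs(b)


def subtracted_convex_part(b, tau):
    """Second convex function, which is subtracted: max(|b| - tau, 0)."""
    return max(abs(b) - tau, 0.0)


def dc_value(b, tau):
    """Difference of the two convex parts; equals min(|b|, tau)."""
    return convex_part(b) - subtracted_convex_part(b, tau)


if __name__ == "__main__":
    tau = 1.0  # hypothetical truncation threshold
    for b in [-3.0, -0.5, 0.0, 0.2, 1.0, 4.0]:
        print(b, truncated_l1(b, tau), dc_value(b, tau))
```

A decomposition of this kind is what lets a non-convex problem be approached through a sequence of convex subproblems, typically by replacing the subtracted part with a linear approximation at the current iterate.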

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM081535-02
Application #: 7468497
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Lyster, Peter

Project Start: 2007-07-15
Project End: 2011-06-30
Budget Start: 2008-07-01
Budget End: 2009-06-30
Support Year: 2
Fiscal Year: 2008
Total Cost: $268,274
Indirect Cost

Institution

Name: University of Minnesota Twin Cities
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 555917996

City: Minneapolis
State: MN
Country: United States
Zip Code: 55455

Related projects


NIH 2014 R01 GM	New Machine Learning Tools for Biomedical Data	Shen, Xiaotong; Pan, Wei / University of Minnesota Twin Cities
NIH 2013 R01 GM	New Machine Learning Tools for Biomedical Data	Shen, Xiaotong; Pan, Wei / University of Minnesota Twin Cities	$283,518
NIH 2012 R01 GM	New Machine Learning Tools for Biomedical Data	Shen, Xiaotong; Pan, Wei / University of Minnesota Twin Cities	$294,260
NIH 2011 R01 GM	New Machine Learning Tools for Biomedical Data	Shen, Xiaotong; Pan, Wei / University of Minnesota Twin Cities	$290,523
NIH 2010 R01 GM	New Machine Learning Methods for Biomedical Data	Shen, Xiaotong / University of Minnesota Twin Cities	$264,640
NIH 2009 R01 GM	New Machine Learning Methods for Biomedical Data	Shen, Xiaotong / University of Minnesota Twin Cities	$267,801
NIH 2008 R01 GM	New Machine Learning Methods for Biomedical Data	Shen, Xiaotong / University of Minnesota Twin Cities	$268,274
NIH 2007 R01 GM	New Machine Learning Methods for Biomedical Data	Shen, Xiaotong / University of Minnesota Twin Cities	$266,852

Publications

Liu, Binghui; Shen, Xiaotong; Pan, Wei (2016) Nonlinear Joint Latent Variable Models and Integrative Tumor Subtype Discovery. Stat Anal Data Min 9:106-116

Liu, Binghui; Shen, Xiaotong; Pan, Wei (2016) Integrative and regularized principal component analysis of multiple sources of data. Stat Med 35:2235-50

Gao, Chen; Zhu, Yunzhang; Shen, Xiaotong et al. (2016) Estimation of multiple networks in Gaussian mixture models. Electron J Stat 10:1133-1154

Wei, Peng; Cao, Ying; Zhang, Yiwei et al. (2016) On Robust Association Testing for Quantitative Traits and Rare Variants. G3 (Bethesda) 6:3941-3950

Austin, Erin; Shen, Xiaotong; Pan, Wei (2015) A Novel Statistic for Global Association Testing Based on Penalized Regression. Genet Epidemiol 39:415-26

Kim, Junghi; Pan, Wei; Alzheimer's Disease Neuroimaging Initiative (2015) Highly adaptive tests for group differences in brain functional connectivity. Neuroimage Clin 9:625-39

Kim, Junghi; Wozniak, Jeffrey R; Mueller, Bryon A et al. (2015) Testing group differences in brain functional connectivity: using correlations or partial correlations? Brain Connect 5:214-31

Pan, Wei; Kwak, Il-Youp; Wei, Peng (2015) A Powerful Pathway-Based Adaptive Test for Genetic Association with Common or Rare Variants. Am J Hum Genet 97:86-98

Pan, Wei; Chen, Yue-Ming; Wei, Peng (2015) Testing for polygenic effects in genome-wide association studies. Genet Epidemiol 39:306-16

Zhang, Yiwei; Pan, Wei (2015) Principal component regression and linear mixed model in association analysis of structured samples: competitors or complements? Genet Epidemiol 39:149-55

Showing the most recent 10 out of 60 publications
