Statistical Learning for Biomedical Data

Malley, James

Abstract

This project studies statistical learning machines as applied to biomedical and clinical prediction, probability assignment, regression, and ranking problems. The algorithms involved include Random Forests, support vector machines, neural networks, and variations of the boosting algorithm. These are all recently developed techniques originally constructed by the machine learning community, and they are only now starting to see applications in biomedical problems. Because the methods were not designed through statistical reasoning or routinely applied to data collected by clinicians or biomedical researchers, they require modifications and enhancements appropriate to data from these sources. In particular, we address (1) greatly unbalanced data sets, where the researcher typically has only a handful of positive cases and a great many negative cases; (2) accurate estimation of prediction error rates, where the researcher typically has a relatively small data set upon which to do both model fitting and testing; and (3) the interpretation of the means by which the prediction engine operates and the development of practical prognostic factors. These three problems are essential questions facing the use of modern prediction engines, but have been only lightly studied by the machine learning community. We have applied these statistical learning machine methods to a wide variety of biological datasets. At the invitation of Cambridge University Press, we are writing a textbook on the subject of "Statistical Learning for Biological Data".
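Problems (1) and (2) interact: with only a handful of positive cases, a random train/test split can leave a test fold with no positives at all, which makes its error estimate meaningless. The sketch below is an illustration only, not the project's own method. It shows stratified k-fold splitting in plain Python, where the function name `stratified_folds`, the seed, and the 5-in-100 data set are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split record indices into k folds, spreading each class evenly."""
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        # Deal indices round-robin so per-fold class counts differ by at most one.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


# Hypothetical unbalanced data: 5 positive cases among 100 records.
labels = [1] * 5 + [0] * 95
folds = stratified_folds(labels, k=5)

# Count the positive cases that land in each fold.
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold)
```

Each fold then serves once as the test set while the remaining folds are used for fitting. Stratification only guarantees that every fold sees the rare class; it does not by itself correct a learner's bias toward the majority class.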

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: Center for Information Technology (CIT)
Type: Intramural Research (Z01)
Project #: 1Z01CT000271-05
Application #: 7593231
Study Section

Project Start
Project End
Budget Start
Budget End
Support Year: 5
Fiscal Year: 2007
Total Cost: $73,387
Indirect Cost

Institution

Name: Center for Information Technology
Department
Type
DUNS #

City
State
Country: United States
Zip Code

Related projects


NIH 2008 Z01 CT	Statistical Learning for Biomedical Data	Malley, James D. / Center for Information Technology	$186,795
NIH 2007 Z01 CT	Statistical Learning for Biomedical Data	Malley, James D. / Center for Information Technology	$73,387
NIH 2006 Z01 CT	Statistical Learning for Biomedical Data	Malley, James D. / Computer Research and Technology
NIH 2005 Z01 CT	Statistical Learning Machines with Biomedical Applicatio	Malley, James D. / Computer Research and Technology
NIH 2004 Z01 CT	Statistical Learning Machines w/ Biomedical Applications	Malley, James D. / Computer Research and Technology
NIH 2003 Z01 CT	Statistical Learning Machines with Biomedical Applicati*	Malley, James D. / Computer Research and Technology

Publications

Nicodemus, Kristin K; Malley, James D (2009) Predictor correlation impacts machine learning algorithms: implications for genomic studies. Bioinformatics 25:1884-90

Konig, I R; Malley, J D; Weimar, C et al. (2007) Practical experiences on the necessity of external validation. Stat Med 26:5499-511

Paul, Scott M; Siegel, Karen Lohmann; Malley, James et al. (2007) Evaluating interventions to improve gait in cerebral palsy: a meta-analysis of spatiotemporal measures. Dev Med Child Neurol 49:542-9

Mamyrova, Gulnara; O'Hanlon, Terrance P; Monroe, Jason B et al. (2006) Immunogenetic risk and protective factors for juvenile dermatomyositis in Caucasians. Arthritis Rheum 54:3979-87

Ward, Michael M; Pajevic, Sinisa; Dreyfuss, Jonathan et al. (2006) Short-term prediction of mortality in patients with systemic lupus erythematosus: classification of outcomes using random forests. Arthritis Rheum 55:74-80

O'Hanlon, Terrance P; Carrick, Danielle Mercatante; Targoff, Ira N et al. (2006) Immunogenetic risk and protective factors for the idiopathic inflammatory myopathies: distinct HLA-A, -B, -Cw, -DRB1, and -DQA1 allelic profiles distinguish European American patients with different myositis autoantibodies. Medicine (Baltimore) 85:111-27

O'Hanlon, Terrance P; Rider, Lisa G; Mamyrova, Gulnara et al. (2006) HLA polymorphisms in African Americans with idiopathic inflammatory myopathy: allelic profiles distinguish patients with different clinical phenotypes and myositis autoantibodies. Arthritis Rheum 54:3670-81

O'Hanlon, Terrance P; Carrick, Danielle Mercatante; Arnett, Frank C et al. (2005) Immunogenetic risk and protective factors for the idiopathic inflammatory myopathies: distinct HLA-A, -B, -Cw, -DRB1 and -DQA1 allelic profiles and motifs define clinicopathologic groups in caucasians. Medicine (Baltimore) 84:338-49

Jerebko, Anna K; Malley, James D; Franaszek, Marek et al. (2005) Support vector machines committee classification method for computer-aided polyp detection in CT colonography. Acad Radiol 12:479-86

O'Hanlon, Terrance; Koneru, Bhanu; Bayat, Elham et al. (2004) Immunogenetic differences between Caucasian women with and those without silicone implants in whom myositis develops. Arthritis Rheum 50:3646-50

Showing the most recent 10 out of 12 publications
