It is now widely recognized that there is a great need for more powerful automated methods to assist biomedical scientists in filtering, querying, and extracting information from the scientific literature. Building on our past research accomplishments in biomedical text mining, we plan to develop new algorithms and software systems that will significantly improve the ability of biomedical researchers to exploit the scientific literature. In particular, we plan to develop, evaluate, and field systems that (1) aid in annotating high-throughput experiments by extracting and organizing information from text sources, and (2) assist genome database curators by identifying relevant articles and predicting appropriate ontology codes for specific query genes and proteins. In support of these systems, we plan to develop novel machine-learning-based text-mining algorithms for training on coarsely labeled data and for inducing models of relationships among specific types of entities expressed in natural language.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
2R01LM007050-04A2
Application #
7264196
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2000-09-29
Project End
2010-06-30
Budget Start
2007-07-01
Budget End
2008-06-30
Support Year
4
Fiscal Year
2007
Total Cost
$279,300
Indirect Cost
Name
University of Wisconsin-Madison
Department
Biostatistics & Other Math Sci
Type
Schools of Medicine
DUNS #
161202122
City
Madison
State
WI
Country
United States
Zip Code
53715
Chasman, Deborah; Gancarz, Brandi; Hao, Linhui et al. (2014) Inferring host gene subnetworks involved in viral replication. PLoS Comput Biol 10:e1003626
Vlachos, Andreas; Craven, Mark (2012) Biomedical event extraction from abstracts and full papers using search-based structured prediction. BMC Bioinformatics 13 Suppl 11:S5
Kawaler, Emily; Cobian, Alexander; Peissig, Peggy et al. (2012) Learning to predict post-hospitalization VTE risk from EHR data. AMIA Annu Symp Proc 2012:436-45
Andrzejewski, David; Zhu, Xiaojin; Craven, Mark (2009) Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors. Proc Int Conf Mach Learn 382:25-32
Smith, Adam A; Vollrath, Aaron; Bradfield, Christopher A et al. (2009) Clustered alignments of gene-expression time series data. Bioinformatics 25:i119-27
Smith, Adam A; Craven, Mark (2008) Fast multisegment alignments for temporal expression profiles. Comput Syst Bioinformatics Conf 7:315-26
Smith, Adam A; Vollrath, Aaron; Bradfield, Christopher A et al. (2008) Similarity queries for temporal toxicogenomic expression profiles. PLoS Comput Biol 4:e1000116
Noto, Keith; Craven, Mark (2006) A specialized learner for inferring structured cis-regulatory modules. BMC Bioinformatics 7:528
Settles, Burr (2005) ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text. Bioinformatics 21:3191-2
Ray, Soumya; Craven, Mark (2005) Learning statistical models for annotating proteins with function information using biomedical text. BMC Bioinformatics 6 Suppl 1:S18

Showing the most recent 10 out of 14 publications