It is now widely recognized that biomedical scientists need more powerful automated methods for filtering, querying, and extracting information from the scientific literature. Building on our past research accomplishments in biomedical text mining, we plan to develop new algorithms and software systems that will significantly improve the ability of biomedical researchers to exploit this literature. In particular, we plan to develop, evaluate, and field systems that (1) aid in annotating high-throughput experiments by extracting and organizing information from text sources, and (2) assist genome database curators by identifying relevant articles and predicting appropriate ontology codes for specific query genes and proteins. In support of these systems, we plan to develop novel machine-learning-based text-mining algorithms for training on coarsely labeled data and for inducing models of relationships among specific types of entities expressed in natural language.
Chasman, Deborah; Gancarz, Brandi; Hao, Linhui et al. (2014) Inferring host gene subnetworks involved in viral replication. PLoS Comput Biol 10:e1003626
Vlachos, Andreas; Craven, Mark (2012) Biomedical event extraction from abstracts and full papers using search-based structured prediction. BMC Bioinformatics 13 Suppl 11:S5
Kawaler, Emily; Cobian, Alexander; Peissig, Peggy et al. (2012) Learning to predict post-hospitalization VTE risk from EHR data. AMIA Annu Symp Proc 2012:436-45
Andrzejewski, David; Zhu, Xiaojin; Craven, Mark (2009) Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors. Proc Int Conf Mach Learn 382:25-32
Smith, Adam A; Vollrath, Aaron; Bradfield, Christopher A et al. (2009) Clustered alignments of gene-expression time series data. Bioinformatics 25:i119-27
Smith, Adam A; Craven, Mark (2008) Fast multisegment alignments for temporal expression profiles. Comput Syst Bioinformatics Conf 7:315-26
Smith, Adam A; Vollrath, Aaron; Bradfield, Christopher A et al. (2008) Similarity queries for temporal toxicogenomic expression profiles. PLoS Comput Biol 4:e1000116
Noto, Keith; Craven, Mark (2006) A specialized learner for inferring structured cis-regulatory modules. BMC Bioinformatics 7:528
Settles, Burr (2005) ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text. Bioinformatics 21:3191-2
Ray, Soumya; Craven, Mark (2005) Learning statistical models for annotating proteins with function information using biomedical text. BMC Bioinformatics 6 Suppl 1:S18