Natural language processing (NLP) technology is now widespread (e.g. Google Translate) and has several important applications in biomedical research. We propose a new target for NLP: extraction of scientific questions stated in publications. A system that automatically captures and organizes scientific questions from across the biomedical literature could have a wide range of significant impacts, as attested to in our diverse collection of support letters from researchers, journal editors, educators and scientific foundations. Prior work focused on making binary (or probabilistic) assessments of whether a text is hedged or uncertain, with the goal of downgrading such statements in information extraction tasks?not computationally capturing what the uncertainty is about. In contrast, we propose an ambitious plan to identify, represent, integrate and reason about the content of scientific questions, and to demonstrate how this approach can be used to address two important new use cases in biomedical research: contextualizing experimental results and enhancing literature awareness. Contextualizing results is the task of linking elements of genome-scale results to open questions across all of biomedical research. Literature awareness is the ability to understand important characteristics of large, dynamic collections of research publications as a whole. We propose to produce rich computational representations of the dynamic evolution of research questions, and to prototype textual and visual interfaces to help students and researchers explore and develop a detailed understanding of key open scientific questions in any area of biomedical research.
The scientific literature is full of statements of important unsolved questions. By using artificial intelligence systems to identify and categorize these questions, the proposed work would help other researchers discover when their findings might address an important question in another scientific area. This work would also make it easier for students, journal editors, conference organizers and others understand where science is headed by tracking the evolution of scientific questions.