CAREER: From Data to Knowledge: Extracting and Utilizing Concept Graphs in Online Environments

Caragea, Cornelia

Abstract

Knowledge bases are central to making effective use of the large and growing amount of digital data on the Web. Such technologies have begun to transform Web search from keyword matching into a tool for discovery, learning, and creativity, all of which are crucial to knowledge discovery. Unfortunately, the search for information remains inherently difficult for significant portions of the Web such as the Scholarly Web, which contains many millions of scientific documents. For example, PubMed has over 20 million documents, and Google Scholar is estimated to have more than 100 million. Open-access digital libraries such as CiteSeerX, which acquire freely available research articles from the Web, are seeing their document collections grow as well. Despite notable advances by scholarly search portals, semantic search technologies that "understand" complex concepts and their relations and can systematically satisfy users' intricate information needs have yet to be investigated on the Scholarly Web. The goal of this project is to design solutions that make information more accessible and comprehensible to Scholarly Web users in particular, and Web users in general, and that help them discover knowledge more effectively and efficiently. The approach is to develop an integrated framework focused on the extraction and utilization of scholarly knowledge graphs in online scholarly environments. The educational component of this work will involve: training graduate, undergraduate, and high-school students, with particular encouragement for the participation of women and underrepresented groups in the research efforts; curriculum development and integration of research into courses taught by the PI; exposure of students to industry and international experiences; and education for the general public.

The project will target the following research objectives: (1) explore the construction of scholarly knowledge graphs that combine evidence from multiple resources in an open information extraction framework; (2) design and develop novel algorithms for the detection and analysis of interesting and previously unknown connections between concepts, in order to foster knowledge discovery on the Scholarly Web; and (3) investigate the utility of scholarly knowledge graphs in a question answering system. The results of this research will be integrated into the CiteSeerX digital library (http://citeseerx.ist.psu.edu). The software, tools, and benchmark datasets developed during the course of this project will be made publicly available. All findings will be shared with the research community through publications in academic journals and presentations at Information Retrieval, Text Mining, and Natural Language Processing conferences. For further information, see the project web page: www.cse.unt.edu/~ccaragea/skg.html.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1914575
Program Officer: Sylvia Spengler

Project Start
Project End
Budget Start: 2018-08-26
Budget End: 2022-05-31
Support Year
Fiscal Year: 2019
Total Cost: $395,379
Indirect Cost

University of Illinois at Chicago, Chicago, IL, United States
