Information floods people's daily lives, and the volume is overwhelming. Summarization systems that identify salient pieces of information and present them concisely can help. The single most important characteristic a text summary must possess to be usable in real-world scenarios is reliability: a summary is reliable if its content can be trusted to remain accurate to the original. While deep neural architectures have demonstrated success in abstractive summarization, studies reveal that system-generated abstracts can contain inaccurate factual details or hallucinated content that changes the meaning of the original texts. An abstractive summarization system seeks to transform lengthy source texts into a succinct summary using natural language generation capabilities; the summary can contain new words and phrases that do not appear in the source input. With greater flexibility of lexical choice comes increased demand for reliability: summaries must keep the meaning of the original intact. Without an emphasis on summary reliability, system outputs can be useless at best and misleading or detrimental at worst. There is thus a pressing need for robust text summarizers whose outputs preserve the meaning of the original, and this project aims to develop them. The project is expected to have a major impact on science and technology as well as on the development of society. The knowledge acquired in this project can be extended to help build robust language generation capabilities that are crucial for machine translation. The project will fund both undergraduate and graduate students, with undergraduates teamed with graduate students to gain hands-on experience and to promote mentorship.

This project aims to build robust abstractive summarization systems whose summaries remain true to the original texts by harnessing the power of deep neural models and linguistic structure prediction. Given that the major relations of a summary (e.g., who did what to whom) are often the same as or similar to those of the source text, the project focuses on developing methods that learn to promote summaries that preserve important source relations and to discourage summaries that contain erroneous relations, thus preventing a summary from dramatically changing the meaning of the original text. The research objectives are as follows. (a) Developing an abstractive, sentence-to-sentence summarizer that jointly generates summary sentences and parses their sentence structure. (b) Developing a many-to-one sentence summarizer that explicitly models coreference relationships between mentions observed in the source text. Drawing upon recent developments in deep neural architectures, these efforts are expected to improve a neural abstractive multi-document summarizer by helping it properly encode the source texts and decode the summary sequence. (c) Devising a novel, semi-automatic evaluation scheme that leverages question answering to assess the extent to which system summaries preserve the meaning of the original texts.
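The abstract does not spell out how the question-answering evaluation in objective (c) would work. As a rough illustration only, the sketch below assumes that question-answer pairs have been written from the source text by hand (the "semi-automatic" part) and that a summary is scored by the fraction of those questions it still answers correctly. The names `toy_answerer` and `qa_preservation_score` are hypothetical, and the word-overlap answerer is a stand-in for a real question-answering model, not the project's method.

```python
import re
from typing import Callable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so answers can be compared loosely."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def toy_answerer(question: str, context: str) -> str:
    """Hypothetical stand-in for a QA model: return the context sentence
    that shares the most words with the question."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    q_words = set(normalize(question).split())
    return max(
        sentences,
        key=lambda s: len(q_words & set(normalize(s).split())),
        default="",
    )


def qa_preservation_score(
    summary: str,
    qa_pairs: List[Tuple[str, str]],
    answerer: Callable[[str, str], str] = toy_answerer,
) -> float:
    """Fraction of source-derived questions whose gold answer is still
    recoverable from the summary (assumed scoring rule, for illustration)."""
    if not qa_pairs:
        return 0.0
    hits = 0
    for question, gold in qa_pairs:
        predicted = answerer(question, summary)
        # Pad with spaces so only whole-word matches count.
        if f" {normalize(gold)} " in f" {normalize(predicted)} ":
            hits += 1
    return hits / len(qa_pairs)


# Question-answer pairs assumed to have been written from the source text.
qa_pairs = [
    ("Who opened the bridge?", "the mayor"),
    ("When did the bridge open?", "Monday"),
]
faithful = "The mayor opened the bridge on Monday."
unfaithful = "The governor opened the bridge on Friday."

print(qa_preservation_score(faithful, qa_pairs))
print(qa_preservation_score(unfaithful, qa_pairs))
```

In this toy setup, a summary that changes who did what, or when, loses credit for the corresponding question, which is the kind of meaning change the project is concerned with. A realistic version would replace `toy_answerer` with a trained question-answering model and use a softer answer-matching rule.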

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1909603
Program Officer: Tatiana Korelsky
Project Start:
Project End:
Budget Start: 2019-10-01
Budget End: 2022-09-30
Support Year:
Fiscal Year: 2019
Total Cost: $498,829
Indirect Cost:
Name: The University of Central Florida Board of Trustees
Department:
Type:
DUNS #:
City: Orlando
State: FL
Country: United States
Zip Code: 32816