This CAREER project focuses on building text generation systems that interact with people to improve their writing and also help them learn to write. Such “machine-in-the-loop” writing assistants are potentially transformative technologies for improving the writing quality and productivity of human authors, as well as providing new tools for writing pedagogy through cyberlearning applications. However, they have been relatively underexplored by the natural language processing community due to major difficulties in modeling, evaluation, and data collection. The technologies developed in this project address these challenges by (1) developing platforms that leverage existing online author communities to enable the design and evaluation of machine-in-the-loop writing assistants; (2) advancing text generation modeling to improve the quality of generated text; and (3) enabling assistants to rewrite and reorganize human-authored text through developments in automatic paraphrasing. In addition to aiding authors in online communities, the writing assistants developed through the project are deployed in K-12 classrooms to advance writing pedagogy. The project incorporates undergraduate students, including those outside of computer science, in natural language processing research, and provides significant outreach to underrepresented minorities.

To make meaningful progress on the development of machine-in-the-loop writing assistants, the project includes a collaboration with Protagonist Labs, which runs online platforms for collaborative storytelling in both creative and pedagogical settings and has already incorporated systems built by the investigator’s team into user-facing interfaces. User interaction on such platforms allows fine-grained evaluation of novel text generation methods, which include neural language models that integrate context compression, context retrieval, and discrete latent variables into the generation process to improve overall coherence and relevance. In addition to producing new text, fully featured writing assistants must also be able to rewrite user text into a specified form, such as a target writing style or a form suited to a target audience. To this end, the project introduces new paraphrase generation models that operate on a variety of textual units, including phrases, sentences, and paragraphs. This effort aims to spur research into interactive text generation systems, and as such its outputs will include publicly released pretrained models and open-sourced code.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 2046248
Program Officer: D. Langendoen
Budget Start: 2021-09-01
Budget End: 2026-08-31
Fiscal Year: 2020
Total Cost: $95,400
Name: University of Massachusetts Amherst
City: Hadley
State: MA
Country: United States
Zip Code: 01035