This EArly Grant for Exploratory Research aims to advance text simplification technology, which automatically rewrites complex English texts into simpler English texts. Research on this topic has many potential practical applications. It can provide reading aids for people with disabilities, low literacy, non-native backgrounds, or non-expert knowledge. It can also help other computer technologies that need to process difficult words and complicated sentences. This one-year exploratory project focuses on simplification for children at different reading levels. If successful, this technology could help make knowledge accessible to all children and gradually help improve their reading skills.

Simplification can be thought of as a monolingual translation task, where the output is equivalent in meaning to the input, but its surface form is constrained by a readability or grade-level requirement. Prior work has drawn the connection between machine translation and text simplification, but has treated statistical machine translation (SMT) technology as a black box. Going beyond previous work, this study provides an extensive exploration of adapting key parts of the SMT pipeline to simplify text. It aims to tailor simplification to different readability levels. The three research activities being undertaken in this study are: (1) constructing a "parallel corpus" consisting of complex sentences, each paired with several different levels of simplification, (2) developing automatic metrics for targeted simplification, and (3) designing features for targeted simplification.
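As an illustration only, the idea of a surface form "constrained by a readability or grade-level requirement" can be sketched with a standard readability formula. The abstract does not name a specific metric; the sketch below assumes the Flesch-Kincaid grade level as a stand-in, uses a crude vowel-group heuristic for syllable counting, and introduces hypothetical helper names (`count_syllables`, `fk_grade`, `meets_target`) and example sentences that are not part of the project.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per vowel group, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level, used here as a stand-in readability metric."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def meets_target(text: str, max_grade: float) -> bool:
    """Check whether a candidate simplification satisfies a grade-level limit."""
    return fk_grade(text) <= max_grade


# A toy "parallel" pair: one complex sentence and one hypothetical simplification.
complex_sentence = ("The committee postponed the deliberation indefinitely "
                    "because of insufficient documentation.")
simple_sentence = "The group put off the talk. They did not have enough papers."

if __name__ == "__main__":
    for label, text in [("complex", complex_sentence), ("simple", simple_sentence)]:
        print(label, round(fk_grade(text), 2), meets_target(text, max_grade=5.0))
```

A check like `meets_target` only covers the surface-form side of the task. Whether the output preserves the meaning of the input would need a separate measure, which is the kind of gap the project's proposed automatic metrics are meant to address.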

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1430651
Program Officer: Tatiana Korelsky
Project Start:
Project End:
Budget Start: 2014-05-01
Budget End: 2016-04-30
Support Year:
Fiscal Year: 2014
Total Cost: $99,663
Indirect Cost:
Name: University of Pennsylvania
Department:
Type:
DUNS #:
City: Philadelphia
State: PA
Country: United States
Zip Code: 19104