This Early Grant for Exploratory Research investigates the viability of a knowledge-rich, joint-learning approach to coreference resolution, with the ultimate goal of advancing the state of the art in that task. Given recent advances in research on lexical semantics and discourse, and the development of large-scale lexical databases, the first objective of this grant is to determine whether existing language technologies are mature enough to extract semantic, discourse, and world knowledge from structured and unstructured data accurately enough that learning-based coreference systems improve significantly when they employ such knowledge.
An assumption underlying the first objective is the use of a pipeline system architecture, in which sophisticated linguistic information from various sources is computed prior to coreference resolution. Although the pipeline architecture is widely used in coreference research, errors made by upstream components may propagate to the coreference component and degrade its performance. To address this problem, the second objective of this grant is to explore an approach in which multiple tasks in the pipeline are learned jointly. While most research on joint learning for language processing focuses on two tasks, this work seeks to extend joint learning to the simultaneous learning of a large number of tasks in semantics, discourse, and information extraction, all of which can benefit from their interactions with each other and with coreference during learning.