Linguists estimate that nearly half of the approximately 7,000 languages currently spoken in the world will be extinct within the next century. Very few of these languages have been well documented, but for the past two decades researchers have been working with speakers of a variety of endangered languages to preserve the rich data these languages possess. This data is invaluable for language education efforts and for linguistic research on the incredible diversity of the world's languages. Unfortunately, much of this data resides in the personal files of these researchers, where it is inaccessible to others interested in these languages and is not in a form that can be shared easily.

The COULD project (Cleaning, Organizing, and Uniting Linguistic Databases), a partnership between two Canadian universities (Concordia and McGill) and Harvard University, offers a solution to this problem. The main goal of the project is to create a new universal data format for language data, centralized online storage for this data, and a set of related online tools that will allow users all over the world to collect, preserve, and share language data more easily and more accurately. The project will assist with the integration of previously collected data into this new format, making this vast amount of linguistic data (currently used only by specialists) accessible to second language learners, community members who would like to preserve their linguistic heritage, and other language and culture enthusiasts, in addition to other researchers.

With respect to broader impacts, the project is of particular importance for those languages that are in danger of disappearing forever if the younger generations do not learn them. Non-linguists interested in preserving a community's language (especially members of the community itself) will be able to use the project's online tools to document their language without relying on experts. In addition, the project's lesson creation application will enable users to create learning materials, which are lacking for many endangered languages, and will support communities in their efforts to teach their language to younger generations.

Agency: National Science Foundation (NSF)
Institute: SBE Office of Multidisciplinary Activities (SMA)
Type: Standard Grant (Standard)
Application #: 1429961
Program Officer: Joan Maling
Project Start:
Project End:
Budget Start: 2014-05-15
Budget End: 2016-01-31
Support Year:
Fiscal Year: 2014
Total Cost: $124,999
Indirect Cost:

Name: Harvard University
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02138