Linguists estimate that nearly half of the approximately 7,000 languages currently spoken in the world will be extinct within the next century. Very few of these languages have been well documented, but for the past two decades researchers have been working with speakers of a variety of endangered languages to preserve the rich data these languages possess. This data is invaluable for language education efforts and for linguistic research on the incredible diversity of the world's languages. Unfortunately, much of this data resides in researchers' personal files, where it is inaccessible to others interested in the languages and is not in a form that can be shared easily.
The COULD project (Cleaning, Organizing, and Uniting Linguistic Databases), a partnership between Canadian universities (Concordia and McGill) and Harvard University, offers a solution to this problem. The main goal of the project is to create a new universal format for language data, centralized online storage for this data, and a set of related online tools that will allow users all over the world to collect, preserve, and share language data more easily and more accurately. The project will also assist with the integration of previously collected data into this new format, making this vast amount of linguistic data (currently used only by specialists) accessible to second language learners, community members who would like to preserve their linguistic heritage, and other language and culture enthusiasts, in addition to other researchers.
With respect to broader impacts, the project is of particular importance for languages that are in danger of disappearing forever if younger generations do not learn them. Non-linguists interested in preserving a community's language (especially members of the community itself) will be able to use the project's online tools to document their language without relying on experts. In addition, the project's lesson creation application will enable users to create learning materials, which are lacking for many endangered languages, and will support communities in their efforts to teach their language to younger generations.