Children with Specific Language Impairment (SLI) experience a delay in the acquisition of certain language skills, with no evidence of hearing impediments or other cognitive, behavioral, or neurological problems. To diagnose monolingual children with SLI, clinicians have at hand standardized tests, such as the Test for Early Grammatical Impairment, that provide "cut-off" thresholds defining the normal range for children of different ages. Diagnosing bilingual children with SLI is far more complicated, however, due to a lack of standardized tests, a lack of bilingual clinicians, and, most importantly, a lack of deep understanding of bilingualism and its implications for language disorders. In addition, bilingual children often exhibit code-switching patterns that make the assessment task even more challenging. The PIs' goal in this project is to contribute to the early and accurate identification of English-Spanish bilingual children with SLI by developing an automated method for discriminating syntactic patterns indicative of SLI. Recent approaches to the differential diagnosis of bilingual children focus either on assessing the phonological systems of both languages to identify children with speech disorders, or on analyzing error patterns in specific morphemes, such as article gender and number agreement. In contrast, the PIs' approach is not to restrict the analysis to a specific syntactic structure, but rather to adapt Machine Learning (ML) and Natural Language Processing (NLP) techniques so that they can learn the patterns that distinguish SLI from otherwise typical language development.
The PIs will pursue these objectives along two core topics: automatic part-of-speech (POS) tagging of bilingual discourse (in which they will investigate the use of ML approaches, in particular domain adaptation techniques, for combining existing linguistic resources for both languages), and statistical methods for discriminating patterns of language use indicative of SLI (in which syntactic information generated by the tagger will be used to train statistical models). The intuitive motivation for this approach is that the language patterns of bilingual children with SLI will differ from those of typically developing children, both at the syntactic level and at the level of interaction between the two languages, and that these differences will be captured by the statistical methods.

Broader Impacts: The clinical implications of this research are far-reaching, particularly regarding both the over- and under-identification of bilingual children experiencing SLI. Because the criteria for this diagnosis involve identification of disordered patterns of language form, content, and use, children who engage in code-switching are at risk of being inappropriately labeled as having SLI and placed in special education services. Applying objective technology to the diagnostic process will provide another sensitive evaluation instrument, eventually allowing for more accurate differentiation of children demonstrating language differences from those experiencing language disorders. For the NLP community, this research will advance the state of the art by developing approaches that can solve problems where the task involves cross-linguistic features, children's spontaneous speech, and small amounts of data. The NLP methods developed will be generalizable to other clinical tasks and bilingual populations.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0812134
Program Officer: Ephraim P. Glinert
Budget Start: 2008-09-01
Budget End: 2010-01-31
Fiscal Year: 2008
Total Cost: $110,493

Name: University of Texas at Dallas
City: Richardson
State: TX
Country: United States
Zip Code: 75080