This proposal was submitted in response to EHR Core Research (ECR) program announcement NSF 19-508. The ECR program of fundamental research in STEM education provides funding in critical research areas that are essential, broad and enduring. EHR seeks proposals that will help synthesize, build and/or expand research foundations in the following focal areas: STEM learning, STEM learning environments, STEM workforce development, and broadening participation in STEM. The ECR program is distinguished by its emphasis on the accumulation of robust evidence to inform efforts to (a) understand, (b) build theory to explain, and (c) suggest interventions (and innovations) to address persistent challenges in STEM interest, education, learning, and participation.

This EHR Core Research project investigates the use of natural language processing (NLP) to automatically identify characteristics of assessment items and to explore their utility for analyzing patterns of student performance. The project examines three central categories of item characteristics (cognitive, linguistic, and contextual) and evaluates how they influence students' test performance. This research will contribute to the science of test-item development in secondary science education by addressing the lack of systematic research on constructing contextualized items to assess students' conceptual understanding.

This interdisciplinary research project will strengthen the theoretical and empirical basis for item development that accounts for cognitive, contextual, and linguistic needs, and will help improve assessment tools and methods for measuring valuable constructs in science learning. The investigators will address four research objectives: (1) define and operationalize the targeted dimensions of science assessment items based on literature reviews, cognitive interviews, and an inter-coder reliability study; (2) develop and empirically test machine learning algorithms that can be used to automatically profile a large and diverse set of items; (3) provide evidence about the possible main and interaction effects of item characteristics on item parameters and on differential patterns of student performance in grades 7 and 8; and (4) for variables with significant effects, develop interpretable variants of the automatic algorithms that provide guidance for improving assessment items. Project outcomes will include: (i) a parameterized model that contributes to the understanding of how assessment items vary along cognitive, contextual, and linguistic dimensions by leveraging machine learning to incorporate large-scale data analysis; (ii) publicly available software tools for profiling contextualized assessment items, together with a catalog of items that have been profiled; (iii) empirical evidence about the effectiveness of the automatic approaches for profiling items that align with the vision of the Next Generation Science Standards Framework for K-12 Science Education; and (iv) a monograph describing the approach and procedures required to develop and evaluate contextualized items for content review, analysis of cognitive demands, and alignment to standards.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

University of Washington
United States