The goal of this project, led by a team at Educational Testing Service, is to develop automated tools for scoring assessments aligned with the Next Generation Science Standards (NGSS) so as to reveal student reasoning patterns, some of which would reflect particular weaknesses in student reasoning. Reasoning patterns refer to the various ways students think when making sense of a natural phenomenon or trying to solve a problem. The investigators will conduct a proof-of-concept study to develop automated diagnostics that can identify middle-school students' reasoning patterns based on their written responses to assessments of their understanding of concepts in ecology. With states increasingly moving toward implementing assessments aligned to the NGSS, feedback based on individual students' reasoning patterns would allow teachers to provide more individualized feedback and would also support the design of automated instruction based on evidence of what students know and how they learn, rather than instruction based simply on whether they answered correctly. The project is funded by the EHR Core Research (ECR) program, which supports work that advances the fundamental research literature on STEM learning.
The investigators will conduct a proof-of-concept study to identify student reasoning patterns in making sense of ecosystems by investigating student responses to constructed-response test items collected from an NGSS-aligned assessment database. In the first stage of the study, the investigators will use a three-dimensional (content knowledge, procedural knowledge, and epistemic knowledge) approach to assessment that aligns with the NGSS. They will leverage cutting-edge natural language processing (NLP) techniques to identify student reasoning patterns, attempting to label text as data description or system-relationship description and to distinguish superficial integration from integrated reasoning. The investigators will engage in an iterative process to compare the classifications produced by the automated tools with those produced by human scorers. With this classification developed and validated, the investigators will have demonstrated the feasibility of later developing an NLP-based automated system that could provide immediate feedback to students, identifying weaknesses in reasoning rather than only whether an answer was correct, and to teachers, allowing them to tailor instruction to individual students.
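To make the workflow concrete, the sketch below shows one minimal form such a pipeline could take: a supervised text classifier assigns reasoning-pattern labels to written responses, and a chance-corrected agreement statistic (Cohen's kappa) compares the automated labels against human scores. The abstract does not specify the project's actual models or data; the label names, example responses, and choice of TF-IDF features with logistic regression here are illustrative assumptions only.

```python
# Hypothetical sketch of an NLP scoring-and-validation loop, assuming a
# simple supervised classifier; not the project's actual method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.pipeline import make_pipeline

# Invented training data: student responses paired with human-assigned
# reasoning-pattern labels ("data_description" vs. "system_relationship").
train_texts = [
    "The graph shows the fox population went down after year 3.",
    "Fewer rabbits means less food, so the fox population declines.",
    "The table lists how many algae were counted each week.",
    "More sunlight increases algae, which feeds the fish population.",
]
train_labels = [
    "data_description",
    "system_relationship",
    "data_description",
    "system_relationship",
]

# Classifier: TF-IDF features over word unigrams/bigrams + logistic regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# Validation step: score held-out responses automatically, then measure
# agreement with human scorers on the same responses.
held_out_texts = [
    "The deer count dropped from 40 to 25.",
    "Less grass leads to fewer deer, which affects the wolves.",
]
human_labels = ["data_description", "system_relationship"]
machine_labels = model.predict(held_out_texts)

# Cohen's kappa corrects raw percent agreement for chance; iterating until
# kappa is acceptably high mirrors the human-machine comparison described.
print(cohen_kappa_score(human_labels, machine_labels))
```

In practice the iterative process the abstract describes would repeat this comparison, refining labels, features, or models until automated and human classifications agree closely enough to support feedback to students and teachers.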
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.