This basic research project will develop new statistical techniques that provide more robust estimates from value-added models (VAMs). Multivariate response value-added models will be developed to accommodate continuous and categorical responses and nested data structures, and to address missing-data problems. These models will employ latent-class mixture models and will use classification trees and random forest methods for data analysis. The new techniques will allow the models to be used not only with continuous response data, such as test scores, but also with categorical response data, such as completion of a STEM degree. The techniques will also allow researchers to investigate the effects of missing data on value-added models, as can occur when students drop out of STEM degree programs during college. The models will improve upon current VAMs in three respects: 1) incorporating the various missing-data structures, 2) considering both continuous and categorical outcomes, and 3) taking into account complex relationships among subgroups of students and program characteristics.

The potential benefits of developing such value-added statistical models lie in informing educational policy and practice. These benefits include better decisions based on more precise estimates of teacher effects and of the effects of other inputs on student outcomes in STEM. The researchers propose to address limitations of current value-added models in order to provide stronger models for assessing STEM program effectiveness and measuring teacher or school effects on student achievement.

Project Report

Value-added models are commonly promoted as statistical methods for assessing the contributions made by individual teachers to a student’s knowledge. These models attempt, by analyzing student growth on assessment instruments after students have been instructed by different teachers, to assess the degree to which each teacher or school "adds" to a student’s knowledge. The NSF-funded project "Statistical Methods for Assessing Teaching and Program Effectiveness" developed new statistical methods for value-added assessment and investigated properties and limitations of the models currently in use.

One frequently mentioned shortcoming of value-added models is that they focus on test scores, which may be an imperfect measurement of student learning. We developed multiresponse value-added models that assess teachers’ contributions toward long-term real-world student outcomes such as graduation or employment in a STEM field. While in general teacher effects on successive test scores of students are positively correlated, our data set exhibited a negative correlation between the teacher effect on calculus grade and the teacher effect on whether the student graduated with a science or engineering degree, indicating that the grades were providing incomplete information on possible teacher effects on students.

Some students consistently obtain a nearly perfect score on standardized tests. These students are therefore limited in the amount of improvement they can show on future tests. This is called a ceiling effect, and it can affect the value-added scores of teachers who instruct classes of high-scoring students. We proposed new value-added models that account for the ceiling effect, and showed that these models could result in less biased assessments of teachers who instruct gifted students.

Many of the models used for value-added assessment assume that every student has data for every time period studied. In practice, that assumption is rarely met. If the data are missing for reasons that are unrelated to student or teacher performance, then fitting a standard model with the available observations is a reasonable procedure. But if data are missing because, say, low-performing students are encouraged to skip the exam, then ignoring the missing data can result in biased assessments. We developed new correlated-parameter models to investigate the sensitivity of value-added scores to missing data, and showed that in some data sets, different models for the missing data would result in substantial differences in the ranks of the teachers.

Longitudinal student achievement outcomes typically have a complex dependence structure. Often students are not nested within classrooms, because they take classes with different teachers when progressing through school. This non-hierarchical data structure, further complicated by the presence of informative missing data and test-score ceiling effects, poses substantial computational challenges for estimating both current and newly developed value-added models. We developed efficient and stable computational methods for modeling such data with or without missing values, and implemented the methods in the R statistical computing language. The GPvam package (CRAN.r-project.org/package=GPvam) is publicly available software for computing maximum likelihood estimates for several widely used value-added models.

Many researchers and commentators have expressed concerns that, when value-added models are used for high-stakes decisions such as merit pay or tenure, they may be manipulated and the tests may lose their original value as instruments of assessment. We investigated the sensitivity of seven value-added models to manipulation, and found that it would be very easy for a teacher to inflate his or her value-added score by using information not included in the model, without resorting to fraud or cheating. These results are consistent with the teachings of the statistician and quality improvement expert W. Edwards Deming. We then proposed an alternative approach for using information from value-added models following Deming’s quality improvement philosophy.

While this research was done in the context of value-added models in education, the models and results obtained apply to many other settings as well. Similar problems occur in health care, where it may be desired to evaluate the contributions of different health professionals to patient outcomes.
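
The missing-data concern described in the report can be illustrated with a small simulation. The following Python sketch is not the project's correlated-parameter model and does not use GPvam; it is a deliberately naive stand-in in which a teacher's value-added score is the class mean of observed gain scores, centered across teachers. The teacher labels, effect sizes, noise level, class size, and missingness fraction are all hypothetical values chosen for illustration.

```python
import random
import statistics

# Hypothetical setup: three teachers whose true effects on student gains are
# known only because we simulate them. None of these numbers come from the project.
TRUE_EFFECTS = {"A": 1.0, "B": 0.0, "C": -1.0}


def simulate_gains(n_students, rng):
    """Gain score = true teacher effect + student-level noise (sd 10)."""
    return {
        teacher: [effect + rng.gauss(0.0, 10.0) for _ in range(n_students)]
        for teacher, effect in TRUE_EFFECTS.items()
    }


def drop_at_random(gains, frac, rng):
    """Missingness unrelated to performance: keep a random subset."""
    n_keep = len(gains) - int(len(gains) * frac)
    return rng.sample(gains, n_keep)


def drop_lowest(gains, frac):
    """Informative missingness: the lowest-scoring fraction skips the exam."""
    n_drop = int(len(gains) * frac)
    return sorted(gains)[n_drop:]


def naive_value_added(classes):
    """Observed class mean, centered on the average of the class means."""
    means = {t: statistics.fmean(g) for t, g in classes.items()}
    center = statistics.fmean(list(means.values()))
    return {t: m - center for t, m in means.items()}


def ranking(estimates):
    """Teachers ordered from highest to lowest estimated effect."""
    return sorted(estimates, key=estimates.get, reverse=True)


def run_demo(seed=0, n_students=2000, frac=0.3):
    """Return naive estimates for complete data, random loss, informative loss."""
    rng = random.Random(seed)
    complete = simulate_gains(n_students, rng)
    # Only teacher C's class loses scores in either missing-data scenario.
    random_loss = dict(complete, C=drop_at_random(complete["C"], frac, rng))
    informative = dict(complete, C=drop_lowest(complete["C"], frac))
    return (
        naive_value_added(complete),
        naive_value_added(random_loss),
        naive_value_added(informative),
    )


if __name__ == "__main__":
    labels = ("complete", "random loss", "informative loss")
    for label, est in zip(labels, run_demo()):
        print(label, ranking(est), {t: round(v, 2) for t, v in est.items()})
```

Under these assumptions, removing scores at random leaves teacher C's estimate close to its complete-data value, whereas removing C's lowest-scoring students raises the estimate enough to move the teacher with the lowest simulated effect to the top of the ranking. The large class size is used only to keep simulation noise small; a model-based analysis would treat the missingness mechanism explicitly rather than rely on observed class means.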

Agency: National Science Foundation (NSF)
Institute: Division of Research on Learning in Formal and Informal Settings (DRL)
Application #: 0909630
Program Officer: Gregg E. Solomon
Project Start:
Project End:
Budget Start: 2009-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2009
Total Cost: $308,916
Indirect Cost:
Name: Arizona State University
Department:
Type:
DUNS #:
City: Tempe
State: AZ
Country: United States
Zip Code: 85281