The expansion of federally funded state student longitudinal data systems (SLDS) provides a rich source of data with great potential for STEM education research and evaluation. However, access to those data has been hampered by the requirements of the Family Educational Rights and Privacy Act (FERPA). The researchers in this study will examine how two general approaches to statistical disclosure control can enable states to share data while still complying with FERPA standards. The researchers will apply imputation and data masking of risk strata, including cross tabulations of variables, to data from five states that have agreed to authorize the use of their data. They will create protected datasets, test whether those datasets protect against disclosure, and carry out a number of analyses to determine whether the protected datasets yield essentially the same answers as the corresponding analyses of the original data.
Protecting against the loss of privacy is an essential concern as ever more data are collected about all members of society. However, data that are collected but not made available for legitimate research and evaluation lose much of their value. States need better mechanisms to ensure the privacy of the data they collect. At the same time, they need assurance that the data they release to researchers will yield valid answers to the research and evaluation questions posed. This project will build and test the data masking models that form the necessary infrastructure for the effective use of large-scale state longitudinal data systems.