Brown University proposes to create lightweight data science curricula for K-12. Data science can be roughly defined as "programming + statistics" and thus has two points of control: how much programming is needed and how much statistics is used. The term "lightweight" is not used pejoratively; it connotes an approach aligned with the CS for All Initiative's goal of scaling CS education to all schools and all students. This project will limit the mathematics to "naïve" statistics rather than depending on continuous mathematics. For programming, the curriculum will provide support that requires no more sophistication than CS Principles curricula do, and perhaps even less. Finally, it will be light on setup: using Google Sheets as the primary "database" system lifts a large administrative burden from teachers and schools and enables elegant user interfaces that are already familiar to many students and teachers.
The programming support will be embedded in Pyret, a pedagogic programming language, by extending it with direct support for tables and tabular operations. This will allow students to write concise programs using constructs designed to be evocative of SQL. To broaden access for younger students and to limit typing, the PIs will incorporate a dual-mode block-and-text interface into Pyret. To engage more students, the work will include support for curricular personalization, allowing students to pick the topics that interest them and to formulate the questions they want to ask about the corresponding data. The PIs will provide curated datasets, as well as means for teachers to find other similar data, to help students who need suggestions to get started. Finally, building on previous work by the PI and others, the project will include a preliminary investigation of whether providing high-level constructs and letting teachers teach "to the constructs" will naturally lead to better plan composition.