There is considerable interest in data sharing, encouraged by funding agencies that are requiring more detailed data-sharing plans and by greater attention and support from the scientific community for rigorous, open, and reproducible science. With this project, we will build a domain-specific data repository, "LDbase," containing behavioral data from the field of learning disabilities research, which will serve as a powerful resource for our community. We are uniquely qualified to build this repository. Our investigator team combines expertise in learning disabilities, advanced methodology, and research librarian techniques. LDbase will accelerate intellectual discovery by facilitating data reuse and reproducibility, ultimately building an enduring record that represents the richness, diversity, and complexity of the science done by learning disabilities researchers. We will achieve this through three specific aims: (1) Create a data repository representing a vast knowledge database on learning disabilities; (2) Release a powerful open-access combined dataset and provide statistical training for combining datasets using integrative data analysis; and (3) Use integrated data to determine the most valid way to classify a reading disability for each child. LDbase will be seeded by data from six large research sites that will contribute large amounts of behavioral data from tens of thousands of participants, and will then be opened to external data depositors and data users. With these high-quality "big data," we will answer a fundamental question in the field concerning personalized classification of reading disability, demonstrating the applied usefulness of these data.
LDbase will advance the sharing of behavioral data related to learning disabilities as a powerful new tool for answering research questions that could not be answered without it, build a community of researchers invested in data sharing and open science practices, and establish a sustainable infrastructure to support the longevity of the project.
Behavioral data related to learning disabilities are expensive and difficult to collect, typically requiring extensive federal investment to obtain appropriately powered samples. Sharing data from these samples will propel creative new research questions for a small additional investment, and will enhance the learning disabilities field and its engagement with the public. Sharing data is especially important for promoting public health because it drives innovation in science, and here we aim to spur innovative research for children in schools.