The application of high-throughput technologies in biomedical research has become widespread and requires specialized skills to effectively design experiments, analyze large datasets, and integrate new data with existing large datasets. These technologies are increasingly being applied in environmental health sciences to provide comprehensive and timely mechanistic knowledge on how the environment affects human health. With the increased application of these technologies, more researchers need training to conceptually develop, properly design, and implement comprehensive, large-scale big data studies. Accordingly, our proposed program in Population-Scale Genomics Studies of Environmental Stress has the long-term goal of training a network of Big Data to Knowledge (BD2K) practitioners in the application of modern sequencing technologies, computational approaches, and biostatistical methods. The program couples three annual training workshops with networking tools aimed at keeping participants trained, engaged, and connected. Workshops will feature a faculty of prominent researchers who will provide the training necessary to maximize the application of these technologies. Each workshop will feature novel datasets of model organisms that participants create and analyze to link gene-environment interactions with the fitness of individuals. Hands-on training in a number of bioinformatics tools will be provided. Within this inquiry-based framework, faculty will lecture on a diverse set of topics including ecological genomics, experimental design, genome sequencing, and population genetics. Workshops will include a module on responsible conduct of research. The proposed program builds upon our existing Environmental Genomics Course at MDI Biological Laboratory, which was first established in 2010. To our knowledge, it is the only course of its kind in the U.S. that provides a highly interactive, hands-on research experience for researchers interested in studying gene-environment interactions using natural populations. The proposed workshop training is modeled after our 2014 Environmental Genomics Course and forms the foundation on which to build a network of expertly trained BD2K practitioners. The proposed BD2K practitioner network will ensure long-term benefits for program participants, especially as new technologies and analysis methods arise in this rapidly changing field.
The long-term goal of the proposed NIH Big Data to Knowledge (BD2K) training initiative in Environmental Genomics is to increase the number of BD2K practitioners and build a virtual network of big data scientists. We will couple three annual workshops that focus on population-scale genomics studies of environmental stress with networking tools aimed at keeping participants connected, trained and engaged in the application of modern sequencing technologies, computational approaches, and biostatistical methods.