Modern biomedical research increasingly makes use of genome-scale data from next-generation sequencing platforms, including Roche 454, Illumina GA2, and ABI SOLiD. These platforms make it possible for individual labs to quickly and cheaply generate vast amounts of genomic and transcriptomic data from de novo sequencing, resequencing, ChIP-seq, mRNA-seq, and allelotyping experiments. Despite this ability to generate large data sets, biomedical researchers are rarely trained in the computational and statistical techniques necessary to make sense of these data. Thus, many researchers must rely on others, often computational scientists with little biological training, to design and implement appropriate data reduction and data mining techniques. Moreover, most institutions do not have access to the computational resources necessary to run these analyses.
Our specific aims are to help bridge this gap in a short, two-week course by teaching biomedical researchers to (1) run analyses on remote UNIX servers hosted in the Amazon Web Services "cloud"; (2) perform mapping and assembly on large short-read data sets; (3) tackle specific biological problems with existing short-read data; and (4) design computational pipelines capable of addressing their own research questions.
All specific aims will be accompanied by in-depth, hands-on practical training in the relevant techniques. Our experience is that this practical training leads to a substantial improvement in the basic computational sophistication of participants. This short course will help train the current and next generation of independent biomedical researchers in basic computational thinking and procedure, and will teach them how to make use of scalable Internet computing resources for their own research. Our end goal is to increase the efficiency and sophistication with which biomedical researchers make use of novel sequencing technologies.
Many biomedical researchers are not well trained in the computational tools that would help them make use of genomic and other bioinformatics data. We propose a two-week short course for advanced researchers that will help train them to take advantage of sequence data in their research.