The Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA, http://camera.calit2.net/) is a semantically enabled database and distributed computational infrastructure that provides a single system for depositing, locating, analyzing, visualizing, and sharing microbial biology data. With the rapid advance of newer DNA sequencing methods, the so-called Next Generation Sequencing (NGS) technologies such as Illumina HiSeq and MiSeq, it is becoming increasingly difficult for researchers who use sequencing data to meet the computing requirements of large-scale NGS datasets with existing methods. In response to these aspects of the BIG DATA challenge, the CAMERA team is developing new bioinformatics algorithms, high performance computing solutions, visualization interfaces, and data resources that specifically address the challenges of NGS data analysis. Here, the group proposes a crosscutting methodology for analyzing NGS data that marries innovative bioinformatics algorithms and workflows with leading-edge computational methods for managing large-scale distributed computing. The integration of XSEDE resources for BIG DATA analysis will provide the scale and specification necessary to drive the development of this system. This project will be conducted over two years. Year 1 will focus on the refinement of core CAMERA CI (e.g., Panfish) and the continued development of core NGS workflows and algorithms. Specifically, CAMERA CI will be extended to take full advantage of two new NSF XSEDE resources to be commissioned in early 2015 (Wrangler at TACC and Comet at SDSC). Year 2 will focus on the production integration of Wrangler and Comet and the subsequent deployment of the NGS workflows (via CAMERA CI) to the entire CAMERA community.
These new software tools and pipelined processes facilitate the processing and analysis of very large-scale metagenomic data, on the order of tens of GB per sample, and provide comprehensive and unique functions such as 16S analysis[7], taxonomy binning[8], assembly, rRNA finding, clustering, filtering, function and pathway annotation, and visualization. Once made to operate in extensible HPC cloud environments, these next-generation tools enable computation that is orders of magnitude faster, more comprehensive analysis, integrated data output, and novel ways to investigate complex data. With regard to Broader Impact, manual operations are currently necessary to complete an analysis with these tools because of the complexity of the process and the large number of software tools involved. The goal of this project is to develop a series of fully integrated and easy-to-use analysis workflows encapsulating these tools. These new workflows will significantly improve NGS data analysis for researchers who use metagenomics as an investigative tool and who are now impeded by the challenges of managing and analyzing BIG DATA.