Graph analytics is a canonical Big Data problem of significant value to the long tail of science, from the social sciences to genomics. While graph algorithms for big-memory machines abound, they remain inaccessible to the wider community. Developing appropriate abstractions for graph applications on distributed cyber-infrastructure such as Clouds and commodity clusters has been challenging. This work explores a subgraph-centric approach, which offers the potential for an order-of-magnitude performance benefit. It investigates graph algorithms, focused on de novo plant genome sequencing, that use a scalable subgraph-centric graph programming model for Clouds. This is a novel research direction that can profoundly impact next-generation genome sequencing as well as other domains where graph abstractions can be employed. The work catalyzes research into distributed graph analytics by building a critical mass of subgraph-centric algorithms, mitigating the opportunity cost of delayed adoption of the technology and of domain-specific computing abstractions. In the process, it will fundamentally advance scalable graph processing, accelerating and democratizing cyber-infrastructure for Big Data in next-generation sequencing.