Large-scale image data analysis has in recent years become a key bottleneck in natural science research, particularly in the field of neuroscience. Technological advances in automated data acquisition have enabled the collection of terabyte and petabyte-size datasets. Extracting the rich information contained in these datasets manually would require an inordinate amount of human labor; reconstructing the neural connectivity in a complete fruitfly brain or cortical column of a mouse from electron microscopy data, key tasks of interest, would require ten thousand years of human labor using current state-of-the-art manual and semi-automated approaches. Improved automated image analysis tools are likely to be directly useful to the neuroscience community, enabling large-scale dense reconstruction of neural circuits from microscopy data, in which the morphology of every neuronal process is traced and all chemical synaptic connections between cells are identified, thereby mapping the complete "wiring diagram" of the circuit contained in the neural tissue. Such reconstructions have the potential to fundamentally impact the understanding of neural circuits by enabling competing models of brain architecture to finally be rigorously verified or falsified experimentally.

The large size of the datasets, the need for high accuracy to avoid incorrect scientific conclusions being drawn about the data, and the need for well-calibrated confidence measures in order to limit the time that must be spent manually verifying the output of algorithms, are all substantial challenges not well-addressed by existing segmentation methods. The investigators propose to (i) Develop efficient algorithms for convolutional locality-sensitive hashing, a novel generalization of locality-sensitive hashing techniques to the highly applicable setting of dense overlapping patches from a larger data volume. (ii) Develop efficient algorithms for the overlapping patch and convolutional variants of sparse coding designed to scale to very large datasets, filter sizes and numbers of filters. The proposed convolutional locality-sensitive hashing approach will be employed to enable this. (iii) Develop algorithms that leverage (i) and (ii) to segment electron microscopy data, and compare empirically to existing segmentation methods. All of the proposed methods are highly scalable to executions on large compute clusters in order to handle large training and test datasets. Furthermore, since the proposed methods allow explicit representation of the data, they are expected to be better calibrated than parametric methods such as the existing neural network-based methods for segmentation of electron microscopy data that currently achieve the best accuracy.

National Science Foundation (NSF)
Division of Information and Intelligent Systems (IIS)
Standard Grant (Standard)
Application #
Program Officer
Kenneth C. Whang
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
University of California Berkeley
United States
Zip Code