Foraminifera, or "forams", are marine protozoa that live in microscopic shells, most often made of the mineral calcite. Foraminifera shells provide the backbone for much of the work in paleoceanography, the study of past climates using seafloor sediments. Students and lab employees are often required to pick several thousand specimens from ocean sediments for each study. Once the steep learning curve has been overcome, picking becomes a repetitive and low-reward task, which makes it well suited to automation using machine learning and robotics. The project aims to develop an autonomous sorting system for foraminifera that is accessible to the scientific community in terms of both usability and cost. The system will be compatible with existing off-the-shelf microscopes, will use microfluidics (or, alternatively, micromanipulation) to transport samples from a container to their sorted receptacles, and will use machine learning for recognition. The tools and datasets developed by the researchers will be made available to the entire scientific community, and the aim is to keep the fabrication cost under three thousand dollars.
Building on the researchers' prior work, in which they developed a visual identification system for six species of forams using images taken under varying lighting directions, this project will: (1) automate the imaging and sorting process using microfluidics (or, alternatively, another micromanipulation technique to be developed); (2) scale up recognition to thirty-five species of planktonic foraminifera that are widely used by paleoceanographers, by involving multiple laboratories in imaging and using a cloud infrastructure to crowd-source data capture and labeling; (3) expand on existing machine learning techniques to enable robust joint morphological characterization and recognition of forams; and (4) provide a detailed comparison between human and autonomous performance. To train the required models, the researchers will consider a number of techniques, including transfer learning and data augmentation. Deep features learned from other foram datasets acquired with different imaging modalities will be exploited. The dataset obtained in this project will be augmented with synthetic images of forams. To ensure robustness, the researchers will employ penalty terms that enforce topological persistence for segmentation and robustness to image perturbations for recognition.
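As a rough illustration of two of the ideas named above, data augmentation and a penalty on sensitivity to image perturbations, the following minimal Python sketch may help. It is not the researchers' method: the function names, the Gaussian noise model, the noise level, and the toy classifier are all assumptions made purely for illustration, and a real system would apply such a penalty to a trained deep network.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly rotated (by a multiple of 90 degrees) and possibly
    mirrored copy of a 2-D image. Square images keep their shape."""
    k = int(rng.integers(0, 4))          # number of quarter turns
    out = np.rot90(image, k)
    if rng.random() < 0.5:
        out = np.fliplr(out)
    return out.copy()

def perturbation_penalty(predict, images, rng, noise_std=0.05):
    """Mean squared change in predictions when small additive Gaussian noise
    is applied to the inputs (an assumed, simple form of robustness penalty)."""
    noisy = images + rng.normal(0.0, noise_std, size=images.shape)
    clean_out = predict(images)
    noisy_out = predict(noisy)
    return float(np.mean((clean_out - noisy_out) ** 2))

def toy_predict(images):
    """Hypothetical stand-in for a classifier: softmax over two crude
    per-image features (mean and standard deviation of intensity)."""
    feats = np.stack([images.mean(axis=(1, 2)), images.std(axis=(1, 2))], axis=1)
    exp = np.exp(feats - feats.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)
```

In a training loop, a penalty of this kind would typically be added to the classification loss with a weighting factor, so that the model is discouraged from changing its prediction when the input image is slightly perturbed.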
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.