Cells are the basic structural and functional units of all known living organisms. Understanding the structures and spatial localizations of large individual macromolecules inside cells is fundamental to the biological research community. However, such information has been difficult to obtain due to a lack of suitable data acquisition techniques. Recent advances in cryo-electron tomography (cryo-ET) have enabled 3D visualization, at submolecular resolution, of the near-native structures and spatial organizations of large macromolecules and their interactions with other subcellular components in single cells. However, the rapidly increasing amount of diverse cryo-ET data poses major challenges to high-throughput systematic analysis. Automation and computational efficiency have become bottlenecks. This project will focus on improving both the automation and the speed of macromolecule recognition and localization in tomograms using unsupervised deep learning. The proposed methods will have a wide range of applications in life science research that involves cryo-ET. This project will train graduate and undergraduate students in computational biology, bioinformatics, and bioimage analysis, as well as integrate research results into university curricula.

Cryo-ET has emerged as the most powerful technique for the structural recovery, recognition, and localization of macromolecules in situ. To significantly improve the automation and speed of macromolecule recognition and localization in tomograms, this project will develop two key unsupervised deep learning techniques: (1) a novel simultaneous simulator and denoiser to create realistically simulated subtomograms; and (2) a method for clustering macromolecule structures by disentangling structure information from orientations and displacements. To facilitate broad use of these methods, their software implementation will be integrated into the open-source software AITom, so that they are ready to be used by the structural biology community. The methods focus on efficiently constructing initial homogeneous subtomogram clusters and producing initial structures for further refinement. They therefore complement existing structural refinement methods and will significantly facilitate the systematic de novo and in situ analysis of macromolecules in single cells. The results of this research will be provided on the Xu Lab website and GitHub site.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-06-01
Budget End: 2023-05-31
Fiscal Year: 2020
Total Cost: $250,000
Name: Carnegie-Mellon University
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213