In higher-order organisms, long DNA strands are tightly packed in the cell nucleus to accommodate spatial constraints. As different parts of the DNA structure are continuously accessed and operated on by a machinery of functional molecules, the packing is highly dynamic and carefully controlled. During the process of coiling and uncoiling of the DNA and supporting protein structures, important physical interactions between different genomic regions take place. These so-called chromatin interactions regulate cellular functions and aid in responding to external biological signals. Capturing evolving multiway chromatin interaction patterns is therefore of crucial importance for understanding genetic regulatory mechanisms and genomic network modules. This project aims to advance platforms for measuring single-cell chromatin interactions and develop accompanying machine learning algorithms that enable efficient information extraction from the acquired data.
The specific machine learning questions to be addressed include denoising and imputing multiway chromatin measurements, extracting dynamic chromatin community signatures via new hypergraph clustering methods, and determining local and long-range interaction patterns through appropriate generalizations of PageRank methods. Furthermore, special attention will be paid to interpreting the findings within different biological contexts. To maximize the utility of the new algorithmic schemes, all relevant implementations supporting single-cell chromatin interaction data mining and analysis will be made readily available to the public.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.