Protein-DNA interactions play crucial roles in many biological processes, and understanding them supports applications such as structure-based drug design and structure-based transcription factor binding site prediction. Transcription factors, a class of proteins, are considered prime drug targets because mutations in transcription factors and aberrant protein-DNA interactions have been implicated in many diseases, including cancer. The major challenges in modeling protein-DNA complexes are the accurate assessment of complex model quality and the small number of known protein-DNA complex structures. This project aims to better understand protein-DNA interactions from a structural perspective by developing methods for assessing the quality of protein-DNA complex models. A comprehensive database of protein-DNA complex structures will be constructed for analysis, modeling, and assessment, and will serve as a valuable resource for the scientific community. The project will actively recruit and promote the participation of postdocs and students from underrepresented groups through summer programs. Techniques and results of this project will be integrated into curriculum design to foster creative learning and understanding of the relationships among structure modeling, macromolecular interaction, biological function, and biomedical applications.

Computational modeling of protein-DNA complexes, including homology modeling and protein-DNA docking, is a cost-efficient alternative for filling the void in the protein-DNA complex structure landscape. The major challenges are accurately assessing the quality of complex models and assessing the similarity between protein-DNA complexes. While standard protocols have been developed for comparing protein-ligand complexes, they are not suitable for assessing the similarity of protein-DNA complexes because of the unique structural and chemical features of these complexes: DNA has a double-helical structure, and the hydrogen bonds between protein and DNA bases are crucial for protein-DNA binding specificity. New methods are needed for accurate comparison of protein-DNA complexes, which in turn can help develop methods for accurate quality assessment of protein-DNA complex models. In this project, a novel method will be developed for capturing the similarity between protein-DNA complex structures, and new algorithms will be developed for assessing the quality of protein-DNA models. In addition, these newly developed methods will be applied to assess the quality of homology models of protein-DNA complexes and of structure-based transcription factor binding site prediction with these models. The datasets and algorithms developed in this project will be made freely available to the research community. The results of the project can be found at https://guolab.uncc.edu.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 2051491
Program Officer: Jean Gao
Project Start:
Project End:
Budget Start: 2021-05-01
Budget End: 2024-04-30
Support Year:
Fiscal Year: 2020
Total Cost: $673,715
Indirect Cost:
Name: University of North Carolina at Charlotte
Department:
Type:
DUNS #:
City: Charlotte
State: NC
Country: United States
Zip Code: 28223