People are threatened by an unprecedented pandemic: COVID-19, caused by SARS-CoV-2. The virus now threatens not only physical health but also psychological well-being, education, the economy, and every corner of society's infrastructure. So far there is no treatment for this disease, and vaccines and neutralizing antibodies are regarded as among the eventual solutions to the crisis. A critical piece of knowledge supporting vaccine and antibody development is an understanding of the genome of this virus. What is the common sequence shared among SARS-CoV-2 strains across the globe? What are the subtypes? Do the genomic variations overlap with genomic regions that are important for vaccine design? Using state-of-the-art machine learning approaches, this research will identify the shared, representative sequence across SARS-CoV-2 strains and group the strains by major type. The project will connect this information to the important genomic regions identified in the literature that can be used for vaccines, and thereby continuously inform the ongoing effort of vaccine development, antibody selection, and therapeutic development. The study will benefit society through monthly updates to web interfaces that give vaccine developers, the biomedical research community, and the general public convenient access to this information. This project will support the training of a graduate student in bioinformatics and provide outreach opportunities to K-12 students and the public.

This work will be a continuous effort to subtype SARS-CoV-2 strains and update the shared sequences of SARS-CoV-2 on a monthly basis, in order to facilitate vaccine development and antibody design. Specifically, the research will focus on three objectives: 1) identifying and updating the common sequence of SARS-CoV-2 by requiring it either to have the minimal evolutionary distance to all strains or to cover as many sequences as possible; 2) subtyping the SARS-CoV-2 strains into major groups, which will be important for informing treatment, management, and prevention measures; 3) connecting the subtyped and common genomic sequences of SARS-CoV-2 to epitopes identified in the literature. To develop vaccines or neutralizing antibody treatments, it is critical that the major variations are covered and considered. The algorithms and visualization tools will overlay the curated list of potential epitopes on the subtypes and the shared sequence of the virus genomes, and directly support the effort of vaccine and antibody development. This RAPID award is made by the Systematics and Biodiversity Science Cluster in the Division of Environmental Biology, using funds from the Coronavirus Aid, Relief, and Economic Security (CARES) Act.
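
As a rough illustration of the first and third objectives, the sketch below picks a "common sequence" from a set of pre-aligned genomes in two simple ways and then reports which variable sites fall inside regions of interest. It is not the project's method: Hamming distance is swapped in for a true evolutionary distance, the sequences are assumed to be aligned to equal length, and the function names, toy sequences, and region coordinates are all hypothetical.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def medoid_sequence(seqs):
    """Return the observed sequence with the smallest total distance to all strains."""
    return min(seqs, key=lambda s: sum(hamming(s, t) for t in seqs))


def consensus_sequence(seqs):
    """Return the column-wise majority sequence (it may not match any observed strain)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def variable_sites(seqs):
    """Return 0-based alignment positions where the strains disagree."""
    return [i for i, col in enumerate(zip(*seqs)) if len(set(col)) > 1]


def sites_in_regions(sites, regions):
    """Group variable sites by the half-open [start, end) region containing them."""
    return {
        name: [i for i in sites if start <= i < end]
        for name, (start, end) in regions.items()
    }


# Toy, made-up aligned sequences; real genomes are roughly 30,000 bases long.
toy_strains = ["ACGTACGT", "ACGTACGA", "ACGTTCGT", "ACGAACGT"]
# Hypothetical regions of interest, standing in for curated epitope coordinates.
toy_regions = {"region_a": (0, 4), "region_b": (4, 8)}

common = medoid_sequence(toy_strains)
consensus = consensus_sequence(toy_strains)
overlap = sites_in_regions(variable_sites(toy_strains), toy_regions)
```

The medoid is always one of the observed strains, while the majority consensus may be a sequence that no strain carries; which of the two better matches "minimal evolutionary distance" depends on the distance model, which the abstract does not specify.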

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Environmental Biology (DEB)
Type: Standard Grant (Standard)
Application #: 2030541
Program Officer: Katharina Dittmar
Project Start:
Project End:
Budget Start: 2020-05-15
Budget End: 2022-04-30
Support Year:
Fiscal Year: 2020
Total Cost: $199,705
Indirect Cost:
Name: Regents of the University of Michigan - Ann Arbor
Department:
Type:
DUNS #:
City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109