This project will study the core enzymes that drive the production and breakdown of carbohydrates. These enzymes, called carbohydrate-active enzymes (CAZymes), are found in all living organisms, particularly in plants and plant-associated microbes. The complex carbohydrates found in plant cell walls are the most abundant renewable organic material on Earth. If efficient systems existed to convert them into biomaterials and biofuels, they would be attractive targets for bio-manufacturing projects. CAZymes also have important effects in the natural world: (i) the CAZymes produced by microbial plant pathogens break down plant cell walls, leading to devastating crop loss ($5 billion in the United States and Canada each year), and (ii) bacteria in animal guts produce hundreds of CAZymes that digest the carbohydrates in the diet, some of which may have positive, and others toxic, consequences for the host. The research approach combines genomics and bioinformatics: the genome of a green alga will be sequenced, and bioinformatics tools will then be used to analyze the data. This green alga is the common ancestor of all land plants; comparing its genome with those of plants will show how evolution has modified core carbohydrate chemistry to meet changing environmental challenges. Bioengineering of these enzymes may well contribute to the development of a more sustainable and secure bioeconomy (e.g., bioenergy and agricultural industries) in the US, as part of the global genomics market, whose value is expected to reach $20 billion by 2020. Students trained in the course of this project will be poised to become the next generation of scientists, able to apply comparative genome sequence analysis to create new understanding and novel applications.
The educational and outreach objectives of this project are to engage students as active participants in the research activities, including data analysis, and to train undergraduate students and K-12 science teachers in the basics of genome sequencing and comparison methods, including hands-on skills.
In the first Aim, new bioinformatics programs will be developed to allow in-depth CAZyme annotation with predicted biochemical activities. In the second Aim, the genomic context of CAZymes will be studied in microbial genomes and metagenomes from various ecological environments. Overall, four computational tools will be developed, integrated, and delivered as a CAZyme bioinformatics web portal named dbCAN2. These free online tools will facilitate CAZyme research in fields such as genomics, carbohydrate research, bioenergy, plant disease, food security, the human gut microbiome, evolution, and ecology. In the third Aim, this project will sequence and mine the genomes and transcriptomes of algae and early plants for CAZymes. This includes sequencing the genome and transcriptome of the green alga Zygnema circumcarinatum, the immediate ancestor of all land plants, which is critical for understanding the early evolution of carbohydrate-rich cell walls.
The specific education activities include: (i) working with the Office of Student Engagement and Experiential Learning (OSEEL) of Northern Illinois University (NIU) to bring undergraduate students, particularly under-represented minority students, into CAZyme bioinformatics research; (ii) collaborating with the Center for Secondary Science Teacher Education of NIU to integrate DNA sequencing and data analysis topics into the curriculum of the Teacher Licensure Program as well as the professional development programs for K-12 science teachers; and (iii) incorporating Zygnema genome annotation as a new lab component in BIOS308 (Genetics) and BIOS441 (Practical Bioinformatics). Research products of this project will be disseminated at: http://cys.bios.niu.edu/dbCAN2/.