A Data-driven Framework for Brain Transcriptomic Cell Type Definition, Ontology, and Nomenclature

Defining the complete census of neuronal and non-neuronal cell types in the brain is a major priority for the NIH BRAIN Initiative, since cellular complexity is a major barrier to understanding brain function and the mechanistic underpinnings of disease. Single-cell transcriptomics has revolutionized this field by providing the scale and information content needed to characterize complex tissues, and is quickly leading to a brain-wide classification of cell types in mouse, monkey and human. Transcriptomics is also uniquely suited to quantitative comparisons across species, across developmental time, and between brain and other organs, and it is the common denominator with other large-scale efforts to characterize the entire human body in the Human Cell Atlas (HCA) and HuBMAP consortia. The opportunity now exists to create a new quantitative framework for defining cell types in the brain, generating a new data-driven cell type ontology and a nomenclature convention similar in concept to the reference genomes that unify genomics. Importantly, the design principles should be extensible beyond the brain to other organs, so that the schema can be adopted across the other major consortium projects, and should also accommodate other cellular phenotypes important for neurobiological function.

The proposed project aims to bring together a team of experts in single-cell transcriptomics, informatics, ontology development and computational biology, who are also leaders and members of the major cell type consortia, to develop a data-driven framework of brain cell types. First, the project aims to develop standards for quantitative definitions of transcriptomic cell types from the BRAIN Initiative Cell Census Network (BICCN) datasets, and tools for mapping other datasets (other data types or data from other researchers) to this reference. This will create reference data structures for features of transcriptomic cell types and taxonomies that will be deployed through the BICCN portal. Second, the project aims to build on prior work on cell and phenotype ontologies to develop a new, data-driven formal cell ontology for the whole-brain reference. Part of this ontology is a nomenclature convention for systematic naming of cell types that allows homologous cell types to be named similarly across species. Finally, a major goal is to engage the international cell type community in developing and refining these standards and the reference classification to ensure their usefulness and widespread adoption. This will involve active engagement of the community through a working group structure and periodic domain expert workshops with the BICCN, HCA, HuBMAP and INCF consortia. All standards, ontologies and tools will be deployed on the BICCN portal with mechanisms for community feedback and vetting.

Public Health Relevance

Major investments in characterizing cellular diversity using single-cell transcriptomic methods are quickly generating a map of cell types across the whole brain in mouse, monkey and human. This transformative resource now requires standardized quantitative methods for cell type definition, and the development of a formal data-driven cell ontology and nomenclature convention similar in concept to the human genome reference in genomics. The current proposal aims to develop these standards and cell ontologies for the brain, with extensible principles that allow adoption across the larger community and major consortia, including the Human Cell Atlas and HuBMAP.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Multi-Year Funded Research Project Grant (RF1)
Study Section
Special Emphasis Panel (ZMH1)
Program Officer
Yao, Yong
Allen Institute
United States