The University of California at San Diego has received a grant to develop computational tools to support the study of the development and repair of the nervous system at the molecular level. Knowledge of the tissue distribution of essential molecules is necessary for understanding how biological systems function, how they grow, and how they repair themselves following trauma or disease. In this project, a multidisciplinary team of investigators will design, test and implement new tools to analyze data obtained with the recently developed technique of mass spectrometry imaging, applied to the mapping of peptides and proteins in biological tissues. Application of these new methods will yield detailed maps of the temporal and spatial distributions of thousands of individual molecules, along with the capacity to examine patterns of expression and correlations in expression within ensembles of molecules. These new methods will be developed and tested first in simple model organisms, to characterize and compare the molecular components of the embryonic, adult and regenerating nervous system. Later, they will be applied in studies of mammalian nervous system slices in order to answer, among other questions, how stem cells are intercalated into adult nervous systems and how they mature there, during normal or artificial replacement following cell loss due to disease or aging. All computational and bioinformatic tools developed in the course of this project will be made openly available to other scientists. The project will train a group of scientists at multiple levels, from undergraduates to postdoctoral fellows, in this exciting new area of basic and applied research. Additional information may be found at http://genomes.ucsd.edu/leechmaster/.
Brains of animals and humans are complex structures composed of cells (neurons and glial cells) with a multiplicity of shapes, sizes and functional roles. Their enormously varied anatomies, their connectivities through channels and synapses, and their wide-ranging functions are thought to be defined by the specific sets of molecules, particularly peptides and proteins, that are present in each neuron, or released and exchanged for specific intercellular signaling. These peptides and proteins are in turn defined by the sets of genes that each cell expresses. Thus, a deep understanding of how the nervous system functions in normal and pathological circumstances, and how it is assembled during development, requires mapping of the molecules that are present in neurons at different embryonic stages and in adult tissues. This project's main goals have been to develop and implement novel approaches to generate and analyze molecular maps of the nervous systems of simple and more complex animals, in particular maps of peptides and proteins, which might be called the proteomic maps of the brain. The raw data used in this work come from Mass Spectrometry Imaging (MSI), a recently developed and very powerful technique that detects these molecules and measures their mass-to-charge ratios as a function of position in slices of tissue. In essence, the technique extracts molecules at specific locations in the tissue under study, following a raster pattern, and determines their mass-to-charge ratios. The resulting measurements can then be used to create a two-dimensional display of the abundance, at each location, of each molecule with a specific mass-to-charge ratio. Further analysis requires identifying which peptides and proteins are present in the mass spectrometry data (spectra), which in turn requires knowledge of the putative sequences of amino acids, as provided by genomic and transcriptomic databases.
A key aspect of this project was the creation of such databases for one of the species used in this work, and these have been made available to the scientific community as an open resource. The central achievement of this project is a set of computer programs for the analysis of the distributions of the molecules detected. These programs allow the researcher to obtain information about the patterns of expression of peptides and proteins and their relationships to the different regions of the nervous system and of the developing embryo. Pipelines for the prediction of putative peptide or protein identities have been tested and adapted to this type of map data, and the full analysis has been published for several examples. These patterns will serve as a starting point for further analysis of the possible roles of these molecules in the specific functions of the structures in which they were localized. Another feature of this software is the determination of which molecules have the same or related distributions, which may signify that their roles are related. While other researchers have developed segmentation techniques, our methods allow for (a) semi-supervised segmentation, giving the user some control over segment choice, and (b) a principled approach to evaluating the quality of segments. Finally, all of the software developed in this project has been made available as open source to researchers interested in using it in their own investigations and in further developing applications.
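As a rough illustration of two ideas described above, the sketch below builds a two-dimensional abundance map for one mass-to-charge value from raster measurements, then scores how similar two such maps are. It is a minimal sketch and not the project's software: the function names `ion_image` and `pearson`, the input layout (one `(x, y, peaks)` tuple per raster position), the tolerance `tol`, and the toy data are all assumptions made for illustration, and Pearson correlation is used here only as one simple, generic way to compare distributions.

```python
from math import sqrt

def ion_image(pixels, width, height, target_mz, tol=0.5):
    """Sum the intensity of peaks within tol of target_mz at each raster position.

    pixels is a hypothetical layout: a list of (x, y, peaks) tuples, where
    peaks is a list of (mz, intensity) pairs measured at that position.
    """
    image = [[0.0] * width for _ in range(height)]
    for x, y, peaks in pixels:
        for mz, intensity in peaks:
            if abs(mz - target_mz) <= tol:
                image[y][x] += intensity
    return image

def pearson(image_a, image_b):
    """Pearson correlation between two maps of equal size, flattened row by row."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat map carries no spatial pattern to compare
    return cov / sqrt(var_a * var_b)

# Toy 2 x 2 raster (made-up values, not real measurements).
pixels = [
    (0, 0, [(1000.2, 5.0), (1500.1, 10.0)]),
    (1, 0, [(1000.1, 1.0), (1500.0, 2.0)]),
    (0, 1, [(1000.3, 3.0), (2000.0, 7.0)]),
    (1, 1, [(2000.1, 4.0)]),
]

img_1000 = ion_image(pixels, 2, 2, 1000.0)
img_1500 = ion_image(pixels, 2, 2, 1500.0)
img_2000 = ion_image(pixels, 2, 2, 2000.0)

# Maps that rise and fall together score near +1; complementary maps score below 0.
print(pearson(img_1000, img_1500))
print(pearson(img_1500, img_2000))
```

In this toy example the maps for 1000 and 1500 overlap in the top row and correlate positively, while the maps for 1500 and 2000 occupy different rows and correlate negatively. Real data would also need steps such as peak alignment and normalization, which this sketch omits.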