One of the most common ways of studying a vertebrate immune system is to statistically compare populations of antigen receptors (either immunoglobulins or T-cell receptors, TCRs) derived from different tissues under various experimental or clinical conditions (for instance, when monitoring chemotherapy patients). The problem is difficult and does not fit readily into standard statistical frameworks because of (i) the extremely diverse antigen receptor repertoires maintained by the immune system and (ii) technological limitations on data collection. In particular, when applied to antigen receptor studies, traditional statistical methods of species richness and diversity inference, such as those used in ecology, often seriously underreport the true richness and diversity of TCR repertoires. This contributes to the relatively poor understanding of the biological traits of such repertoires, despite great advances of modern molecular technology in TCR data collection. The proposed research project is an interdisciplinary undertaking by a team of researchers with backgrounds in applied mathematics, statistics, bioinformatics, and experimental immunology. The project's goal is to (i) systematically review the existing statistical methods for analyzing antigen receptor data and (ii) propose new, more efficient ones. In broad terms, an antigen receptor dataset may be characterized as a k-way table of n observations, with many cells of low counts and with the total number of cells (the population richness) unknown. To analyze such tables, we propose to develop a comprehensive approach applicable to data obtained from standard biological assays, such as flow cytometry, spectratyping, and DNA sequencing, under hierarchical multinomial and Poisson models for count data. The newly proposed methods will be evaluated against traditional ones using simulations as well as data from cancer studies in TCR-min mice, which have a specially limited TCR repertoire. The statistical methodology derived and deemed most successful will be implemented in public-domain software to be made available through the CRAN and caBIG archives.
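
To make the notion of a traditional richness estimate concrete, the following minimal sketch computes the observed richness, the bias-corrected Chao1 estimate, and the Shannon entropy from a vector of clonotype counts. Chao1 is a classical estimator from ecology and is used here only as an example of the kind of traditional method the abstract refers to; it is not the methodology proposed in this project. The function name `richness_summary` and the toy counts are hypothetical and chosen for illustration.

```python
from collections import Counter
import math


def richness_summary(clone_counts):
    """Summarize a sample of clonotype counts (one count per observed clonotype).

    Hypothetical helper for illustration only; not the project's method.
    """
    counts = [c for c in clone_counts if c > 0]
    n = sum(counts)                      # total number of sequences sampled
    s_obs = len(counts)                  # observed richness (distinct clonotypes)

    # Frequency of frequencies: f1 = singletons, f2 = doubletons.
    freq_of_freq = Counter(counts)
    f1 = freq_of_freq.get(1, 0)
    f2 = freq_of_freq.get(2, 0)

    # Bias-corrected Chao1 lower-bound estimate of total richness.
    chao1 = s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

    # Plug-in Shannon entropy (natural log) of the observed frequencies.
    shannon = -sum((c / n) * math.log(c / n) for c in counts)

    return {"n": n, "observed": s_obs, "chao1": chao1, "shannon": shannon}


# Toy sample: six singletons, two doubletons, and two larger clones.
sample = [1, 1, 1, 1, 1, 1, 2, 2, 3, 10]
summary = richness_summary(sample)
print(summary["observed"], summary["chao1"])  # 10 15.0
```

In this toy sample the many singletons push the Chao1 estimate above the observed richness. With repertoires as diverse and as sparsely sampled as those described above, even such corrected estimates can remain far below the true richness, which is the limitation that motivates the project.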

Public Health Relevance

The proposed research will develop analytical and computational tools for analyzing antigen receptor data. Proper analysis of such data is one of the fundamental issues in studying vertebrate immune responses; the methods developed in this proposal will therefore have broad applications to immunological studies in general. The tools developed in this proposal will help us better understand the nature and various functions of T cells, which in turn will lead to the development of more effective approaches to immunotherapy and cancer treatment.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Li, Jerry
Ohio State University
Biostatistics & Other Math Sci
Schools of Public Health
United States
Hallgren, Justin; Pietrzak, Maciej; Rempala, Grzegorz et al. (2014) Neurodegeneration-associated instability of ribosomal DNA. Biochim Biophys Acta 1842:860-8
Wojciech, Lukasz; Ignatowicz, Alicja; Seweryn, Michal et al. (2014) The same self-peptide selects conventional and regulatory CD4+ T cells with identical antigen receptors. Nat Commun 5:5061
Sadee, Wolfgang; Hartmann, Katherine; Seweryn, Michał et al. (2014) Missing heritability of common diseases and treatments outside the protein-coding exome. Hum Genet 133:1199-215
Greene, Joshua; Birtwistle, Marc R; Ignatowicz, Leszek et al. (2013) Bayesian multivariate Poisson abundance models for T-cell receptor data. J Theor Biol 326:1-10
Cebula, Anna; Seweryn, Michal; Rempala, Grzegorz A et al. (2013) Thymus-derived regulatory T cells contribute to tolerance to commensal microbiota. Nature 497:258-62
Rempala, Grzegorz A; Seweryn, Michal; Ignatowicz, Leszek (2011) Model for comparative analysis of antigen receptor repertoires. J Theor Biol 269:1-15