One of the most common ways of studying a vertebrate immune system is to statistically compare populations of antigen receptors (either immunoglobulins or T-cell receptors, TCRs) derived from different tissues under various experimental or clinical conditions (for instance, when monitoring chemotherapy patients). The problem is difficult and does not fit readily into standard statistical frameworks because of (i) the extremely diverse antigen receptor repertoires maintained by the immune system and (ii) technological limitations on data collection. In particular, when applied to antigen receptor studies, the traditional statistical methods of species richness and diversity inference, such as those used in ecology, often seriously underreport the true richness and diversity of TCR repertoires. This contributes to the relatively poor understanding of such repertoires' biological traits, despite great advances of modern molecular technology in TCR data collection. The proposed research project is an interdisciplinary undertaking by a team of researchers with backgrounds in applied mathematics, statistics, bioinformatics, and experimental immunology. The project's goal is to (i) systematically review the existing statistical methods for analyzing antigen receptor data and (ii) propose new, more efficient ones. In broad terms, an antigen receptor dataset may be characterized as a k-way table of n observations, with many cells of low counts and with the total number of cells (the population richness) unknown. To analyze such tables, we propose to develop a comprehensive approach applicable to data obtained from standard biological assays, such as flow cytometry, spectratyping, and DNA sequencing, under hierarchical multinomial and Poisson models for count data. The newly proposed methods will be evaluated against traditional ones using simulations as well as data from cancer studies in TCR-min mice, which have a specially limited TCR repertoire. The statistical methodology derived and deemed most successful will be implemented in public-domain software to be made available through the CRAN and caBIG archives.
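As an illustration of the kind of traditional richness inference referred to above, the following is a minimal sketch of the bias-corrected Chao1 estimator, a lower-bound richness estimator widely used in ecology. The abstract does not name a specific estimator, so the choice of Chao1, the function name `chao1`, and the example counts are assumptions made purely for illustration. The sketch shows how such estimators rely only on the number of receptor types seen once or twice, which is why they can understate richness when most of a repertoire goes unobserved.

```python
def chao1(counts):
    """Bias-corrected Chao1 lower-bound estimate of population richness.

    `counts` holds the observed count of each distinct receptor type
    (hypothetical input format; zero entries are ignored).
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                       # number of distinct types seen
    f1 = sum(1 for c in observed if c == 1)     # types seen exactly once
    f2 = sum(1 for c in observed if c == 2)     # types seen exactly twice
    # Bias-corrected form, defined even when no type is seen exactly twice.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Illustrative (made-up) clonotype counts: 8 types observed,
# 4 of them seen once and 2 of them seen twice.
example_counts = [5, 3, 2, 2, 1, 1, 1, 1]
estimate = chao1(example_counts)
print(estimate)
```

For these made-up counts the estimate adds 4 * 3 / (2 * 3) = 2 unseen types to the 8 observed ones. The result is only a lower bound on richness, which is consistent with the underreporting concern raised in the abstract.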

Public Health Relevance

The proposed research will develop analytical and computational tools for analyzing antigen receptor data. Proper analysis of such data is one of the fundamental issues in studying vertebrate immune responses; therefore, the methods developed in this proposal will have broad applications to immunological studies in general. The tools developed in this proposal will help us better understand the nature and various functions of T-cells, which will lead to the development of more effective approaches to immunotherapy and cancer treatment.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Li, Jerry
Ohio State University
Biostatistics & Other Math Sci
Schools of Public Health
United States
Lu, Rong; Wang, Danxin; Wang, Min et al. (2018) Estimation of Sobol's Sensitivity Indices under Generalized Linear Models. Commun Stat Theory Methods 47:5163-5195
Pietrzak, Maciej; Rempala, Grzegorz A; Nelson, Peter T et al. (2016) Non-random distribution of methyl-CpG sites and non-CpG methylation in the human rDNA promoter identified by next generation bisulfite sequencing. Gene 585:35-43
Rempała, Grzegorz A; Wesołowski, Jacek (2016) Double asymptotics for the chi-square statistic. Stat Probab Lett 119:317-325
Hartmann, Katherine; Seweryn, Michał; Handelman, Samuel K et al. (2016) Non-linear interactions between candidate genes of myocardial infarction revealed in mRNA expression profiles. BMC Genomics 17:738
Pietrzak, Maciej; Papp, Audrey; Curtis, Amanda et al. (2016) Gene expression profiling of brain samples from patients with Lewy body dementia. Biochem Biophys Res Commun 479:875-880
Szurek, Edyta; Cebula, Anna; Wojciech, Lukasz et al. (2015) Differences in Expression Level of Helios and Neuropilin-1 Do Not Distinguish Thymus-Derived from Extrathymically-Induced CD4+Foxp3+ Regulatory T Cells. PLoS One 10:e0141161
Sturrock, Marc; Hao, Wenrui; Schwartzbaum, Judith et al. (2015) A mathematical model of pre-diagnostic glioma growth. J Theor Biol 380:299-308
Lu, Rong; Smith, Ryan M; Seweryn, Michal et al. (2015) Analyzing allele specific RNA expression using mixture models. BMC Genomics 16:566
Linder, Daniel F; Rempała, Grzegorz A (2015) Bootstrapping least-squares estimates in biochemical reaction networks. J Biol Dyn 9:125-46
Hallgren, Justin; Pietrzak, Maciej; Rempala, Grzegorz et al. (2014) Neurodegeneration-associated instability of ribosomal DNA. Biochim Biophys Acta 1842:860-8

Showing the most recent 10 out of 22 publications