Bioinformatics Data "Cleaning" for Immune Repertoire Sequencing

Johnson, David

Abstract

The Specific Aim of this Phase I proposal is to test the feasibility of a bioinformatics technology for correction of amplification bias in T cell receptor β (TCRβ) repertoire sequencing (REP-SEQ), thus providing the foundation for this technology in clinical diagnostics. The T cell repertoire is the foundation of human adaptive immunity, and deep T cell repertoire sequencing is now commonly used in research settings to quantify immune responses (Robins et al., 2009; Wang et al., 2010; Robins et al., 2012). Clinical immune repertoire sequencing has a large potential market because multiplexing all possible V(D)J combinations into a single assay significantly decreases material and labor costs compared with current diagnostic methods. For example, conventional leukemia minimal residual disease (MRD) work-ups demand laborious customization, cost up to ~$5000 per patient, and have a turnaround time of several weeks. We estimate that for the MRD market alone, our technology would save diagnostics labs ~$140 million in annual costs and would produce more sensitive data in less than half the time of standard MRD work-ups. The technical innovation of the product is to use bioinformatics to "clean" the non-representative amplification that plagues multiplexed repertoire amplification (Robins et al., 2012). First, we will build a large control library of TCRβ plasmid clones. Next, we will build REP-SEQ libraries using the control clones as templates and generate a large training set of data from these libraries using next-generation sequencing (NGS). Finally, we will build a linear model for correction of raw data using this training set, and test the feasibility of the linear model using a second set of TCRβ clones.
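The train-then-correct workflow described above can be illustrated with a minimal sketch. It is not the proposal's actual model, whose form the abstract does not specify. It assumes, purely for illustration, that amplification bias acts as one multiplicative factor per gene segment, so that the model is linear in log space (log observed = log expected + segment bias). The function names `fit_bias` and `correct` and the segment labels `V1` and `V2` are hypothetical.

```python
import math
from collections import defaultdict


def fit_bias(training):
    """Estimate a log-scale bias per segment from control-clone data.

    training: iterable of (segment, expected_fraction, observed_fraction),
    where the expected fraction is known because the control clones were
    mixed at defined proportions. (Assumed input layout, not from the source.)
    """
    log_ratios = defaultdict(list)
    for segment, expected, observed in training:
        if expected > 0 and observed > 0:
            log_ratios[segment].append(math.log(observed / expected))
    # Mean log ratio = least-squares estimate of the additive log-scale bias.
    return {seg: sum(vals) / len(vals) for seg, vals in log_ratios.items()}


def correct(observed_by_segment, bias):
    """Divide out the fitted bias and renormalize so fractions sum to 1."""
    raw = {
        seg: obs / math.exp(bias.get(seg, 0.0))  # unknown segments: no correction
        for seg, obs in observed_by_segment.items()
    }
    total = sum(raw.values())
    return {seg: val / total for seg, val in raw.items()}


# Training library: equal input, but V1 is over-amplified relative to V2.
bias = fit_bias([("V1", 0.5, 0.8), ("V2", 0.5, 0.2)])
# Test library: raw data look like an even mix until the bias is removed.
corrected = correct({"V1": 0.5, "V2": 0.5}, bias)
```

Because the corrected values are renormalized, only the relative bias between segments matters in this sketch. A realistic model would likely include more covariates (for example V and J segment pairs or primer identity), which the abstract leaves unspecified.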
We will require that the bioinformatics method consistently clean biased REP-SEQ measurements such that regression of observed clone counts against expected clone counts achieves an average R² of >0.95 and an average slope of >0.9 (power=0.8, α=0.05), and such that clonotypes present at frequencies as low as 0.01% have an average coefficient of variation (CV) of <10% across hundreds of measurements (power=0.8, α=0.05). Additionally, the technology must be sufficiently sensitive for reliable detection of clonotypes that are present at as low as 1 copy in 1 million, such that the area under the receiver operating characteristic curve (AUC) is greater than 0.8 across hundreds of measurements (α=0.05). The methods that we develop in Phase I will enable us to perform a large 510(k) validation study for FDA approval of a molecular kit for clinical REP-SEQ in Phase II. The final product will be priced at <$1000 per sample and will enable diagnostics labs throughout the US to streamline their operations without having to ship samples to a reference lab.
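The acceptance criteria above (slope, R², CV, and AUC) could be computed roughly as follows. This is a hedged sketch using only the Python standard library. It covers the point estimates only and omits the power and significance analysis mentioned in the text. The function names and the example numbers are illustrative assumptions, not data from the study.

```python
import statistics


def slope_and_r2(expected, observed):
    """Ordinary least-squares fit of observed on expected (with intercept)."""
    mx, my = statistics.fmean(expected), statistics.fmean(observed)
    sxx = sum((x - mx) ** 2 for x in expected)
    syy = sum((y - my) ** 2 for y in observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expected, observed))
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def auc(positive_scores, negative_scores):
    """Rank-based AUC: probability that a true clonotype outscores a
    false one, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def meets_criteria(slope, r2, cv, auc_value):
    """Thresholds taken from the text: slope >0.9, R^2 >0.95, CV <10%, AUC >0.8."""
    return slope > 0.9 and r2 > 0.95 and cv < 0.10 and auc_value > 0.8


# Illustrative numbers only.
slope, r2 = slope_and_r2([100, 200, 300, 400], [98, 205, 290, 410])
cv = coefficient_of_variation([101, 99, 100, 102, 98])
area = auc([0.9, 0.8], [0.1, 0.8])
ok = meets_criteria(slope, r2, cv, area)
```

In practice each metric would be averaged over many replicate libraries, as the text requires, before comparing it with its threshold.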

Public Health Relevance

Diagnostics laboratories often analyze T cells to help characterize disease. We are building a streamlined, cheaper, and more comprehensive system for T cell analysis in clinical laboratories.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #: 1R43CA171469-01A1
Application #: 8453266
Study Section: Special Emphasis Panel (ZRG1-IMST-J (15))
Program Officer: Rahbar, Amir M

Project Start: 2012-09-24
Project End: 2013-04-30
Budget Start: 2012-09-24
Budget End: 2013-04-30
Support Year: 1
Fiscal Year: 2012
Total Cost: $205,650

Johnson, David Scott
Gigagen, Inc., San Francisco, CA, United States
