Analysis of Quantitative High Throughput Screening Data

Shockley, Keith

Abstract

Thousands of chemicals in wide commercial use are present in the environment but have not been tested for adverse effects on humans. Accordingly, there is a need to improve chemical prioritization for in vivo toxicity testing and, ultimately, to find cell-based alternatives for evaluating the large inventory of potentially harmful substances. Quantitative high throughput screening (qHTS) assays are multiple-concentration experiments that play an important role in the efforts of the National Toxicology Program to meet these testing challenges and to advance toxicology from a predominantly observational science to a predominantly predictive one. qHTS can simultaneously assay thousands of chemicals over a wide chemical space at reduced cost per substance.

Previous approaches to making activity calls from qHTS data were developed for pharmaceutical applications that seek to minimize false positives, and they usually relied on heuristics rather than statistical tests. For that reason, we developed a three-stage algorithm to classify substances from qHTS data into statistically supported activity categories relevant to toxicological evaluation, seeking to improve sensitivity while minimizing the Type I error rate (Shockley, 2012).

The first stage of our approach fits a four-parameter Hill equation to find active substances with a robust concentration-response profile within the tested concentration range. The robustness criterion requires that a response profile be statistically significant under both unweighted and weighted non-linear least squares (NLS and WNLS) regression. NLS weights all data points equally and, consequently, may not discriminate between a profile with data along both asymptotes and a profile supported by a single point.
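As a minimal sketch of the first-stage model, the four-parameter Hill equation and the equal-weight least-squares objective can be written as follows. The parameterization (lower and upper asymptotes, AC50, slope) is a common textbook form and is assumed here; the function names `hill` and `rss`, the example concentrations, and the parameter values are illustrative and do not come from the published algorithm.

```python
def hill(conc, lower, upper, ac50, slope):
    """Four-parameter Hill response at a positive concentration `conc`."""
    return lower + (upper - lower) / (1.0 + (ac50 / conc) ** slope)


def rss(concs, responses, params, weights=None):
    """Residual sum of squares for Hill parameters `params`.

    weights=None weights every point equally (the NLS objective);
    passing per-point weights gives a weighted objective instead.
    """
    if weights is None:
        weights = [1.0] * len(concs)
    return sum(
        w * (r - hill(c, *params)) ** 2
        for c, r, w in zip(concs, responses, weights)
    )


# Hypothetical 7-point titration in micromolar, noise-free for illustration.
concs = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
true_params = (0.0, 100.0, 1.0, 1.0)  # lower, upper, AC50, slope (assumed values)
responses = [hill(c, *true_params) for c in concs]

rss_hill = rss(concs, responses, true_params)            # the generating curve
rss_flat = rss(concs, responses, (0.0, 0.0, 1.0, 1.0))   # upper == lower: flat line
print(rss_hill, rss_flat)
```

Setting `upper == lower` collapses the curve to a horizontal line, that is, a profile with no concentration-dependent response. Because every residual enters `rss` with the same weight, a single large response can dominate the objective, which is the limitation of NLS noted above.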
WNLS weights each response point by 1/s^2, where s is the sample standard deviation estimated from all response data within a defined concentration range containing the response point of interest, so that neighboring data points with similar response levels receive more influence than neighboring data points with very different responses.

The second stage finds relatively potent substances with substantial activity at the lowest tested concentration, which are not captured in the first stage. The third and final stage separates statistically significant profiles from responses that lack statistically compelling support (inactives). This framework accommodates large volumes of qHTS data, tolerates missing data, and does not require replicate measurements.

We evaluated this three-stage classification algorithm via extensive simulations (Shockley, 2012), using the area under the receiver operating characteristic curve (AUC) to assess performance. By AUC, our algorithm outperformed overall F-tests comparing the fit of the Hill equation to a horizontal line (no response) when the concentration for half-maximal response (AC50) was less than 0.1 µM. It also outperformed t-test approaches in detecting known actives when the AC50 was greater than 0.001 µM. The three-stage decision strategy yielded good (AUC >= 0.75) to high (AUC >= 0.9) performance for 14-point concentration-response curves when the response was in the detectable region of the simulated assay (>25% of the positive control). Our approach was able to detect relatively potent substances (e.g., AC50 = 0.001 µM) with as few as 4 data points when the tested response was at least 50% of the positive control response.

The three-stage algorithm described above is based on the Hill equation model. However, concentration-response data can be complex, and it may be more informative to find alternative patterns in the data that are not based on fits to sigmoidal curves.
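The WNLS weighting described above, 1/s^2 with s estimated from the responses in a concentration window around each point, can be sketched as follows. The abstract does not specify the window, so the half-width in log10 units (`half_width_log10`) and the floor on s (`min_sd`, which avoids division by zero in flat regions) are assumptions made for this illustration, as are the function name and the example data.

```python
import math
import statistics


def local_variance_weights(concs, responses, half_width_log10=1.5, min_sd=1.0):
    """Return one weight per point, 1/s^2, with s from a local window.

    s is the sample standard deviation of all responses whose log10
    concentration lies within `half_width_log10` of the point of interest.
    Both the window rule and the `min_sd` floor are assumed, not published.
    """
    log_c = [math.log10(c) for c in concs]
    weights = []
    for xi in log_c:
        window = [r for x, r in zip(log_c, responses)
                  if abs(x - xi) <= half_width_log10]
        # stdev needs at least two values; fall back to the floor otherwise.
        s = statistics.stdev(window) if len(window) > 1 else min_sd
        s = max(s, min_sd)
        weights.append(1.0 / s ** 2)
    return weights


# Hypothetical 6-point profile (percent of positive control).
concs = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
responses = [0.0, 1.0, 0.0, 50.0, 99.0, 100.0]

weights = local_variance_weights(concs, responses)
for c, w in zip(concs, weights):
    print(f"{c:>8} uM  weight = {w:.4f}")
```

In this example the points on the flat lower and upper asymptotes sit among neighbors with similar responses and keep a high weight, while points in the steep transition region, whose neighbors differ widely, are down-weighted.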
We are currently developing approaches to find complex patterns in qHTS data based on principles of order-restricted inference. Multiple hypothesis testing is used to compare the significance of all searched patterns and to identify the most appropriate pattern describing a response profile.

It is often unclear how to prioritize chemicals for follow-up studies because of the large uncertainties that accompany parameter estimates derived from nonlinear regression model fits to qHTS data. Therefore, we have also used a weighted entropy score (WES) as a measure of average activity level to rank chemicals in qHTS experiments. WES can be used to rank all chemicals in a tested library without a pre-specified model structure, or to complement existing approaches by ranking returned "hits". The performance of WES has been evaluated using data simulated from a Hill model. WES outperforms rankings based on AC50 (the estimated concentration of half-maximal response) across the full range of conditions that are typical of qHTS studies.
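The AUC statistic used to assess performance in the simulations above can be computed directly from scores for known actives and known inactives. The sketch below uses the rank (Mann-Whitney) formulation, in which AUC is the probability that a randomly chosen active scores higher than a randomly chosen inactive. The scores and function names are hypothetical; only the 0.75 and 0.9 thresholds come from the text.

```python
def roc_auc(scores_active, scores_inactive):
    """AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties count as half a correct pair, so uninformative scores give 0.5.
    """
    wins = 0.0
    for a in scores_active:
        for b in scores_inactive:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_active) * len(scores_inactive))


def performance_label(auc):
    """Map an AUC to the 'good' / 'high' categories quoted in the abstract."""
    if auc >= 0.9:
        return "high"
    if auc >= 0.75:
        return "good"
    return "below good"


# Hypothetical activity scores for simulated actives and inactives.
active_scores = [0.9, 0.8, 0.4]
inactive_scores = [0.5, 0.3, 0.1]

auc = roc_auc(active_scores, inactive_scores)
print(round(auc, 3), performance_label(auc))
```

The pairwise loop is quadratic in the number of substances, which is fine for illustration; a sort-based rank computation would be the usual choice for library-sized data.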

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of Environmental Health Sciences (NIEHS)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIAES102865-04
Application #: 8734169
Support Year: 4
Fiscal Year: 2013
Total Cost: $192,448

Institution

Name: National Institute of Environmental Health Sciences

Related projects


NIH 2019 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / National Institute of Environmental Health Sciences
NIH 2018 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / U.S. National Inst of Environ Hlth Scis
NIH 2017 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / U.S. National Inst of Environ Hlth Scis
NIH 2016 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / U.S. National Inst of Environ Hlth Scis
NIH 2015 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / U.S. National Inst of Environ Hlth Scis
NIH 2014 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / U.S. National Inst of Environ Hlth Scis
NIH 2013 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / National Institute of Environmental Health Sciences	$192,448
NIH 2012 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / National Institute of Environmental Health Sciences	$323,905
NIH 2011 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / National Institute of Environmental Health Sciences	$139,446
NIH 2010 ZIA ES	Analysis of Quantitative High Throughput Screening Data Shockley, Keith / National Institute of Environmental Health Sciences	$80,095

Publications

Shockley, Keith R (2016) Estimating Potency in High-Throughput Screening Experiments by Maximizing the Rate of Change in Weighted Shannon Entropy. Sci Rep 6:27897

Pei, Ying; Peng, Jun; Behl, Mamta et al. (2016) Comparative neurotoxicity screening in human iPSC-derived neural stem cells, neurons and astrocytes. Brain Res 1638:57-73

Shockley, Keith R (2015) Quantitative high-throughput screening data analysis: challenges and recent advances. Drug Discov Today 20:296-300

Chen, Shiuan; Hsieh, Jui-Hua; Huang, Ruili et al. (2015) Cell-Based High-Throughput Screening for Aromatase Inhibitors in the Tox21 10K Library. Toxicol Sci 147:446-57

Ray, Mitas; Shockley, Keith; Kissling, Grace (2014) Minimizing Systematic Errors in Quantitative High Throughput Screening Data Using Standardization, Background Subtraction, and Non-Parametric Regression. J Exp Second Sci 3:

Huang, Ruili; Sakamuru, Srilatha; Martin, Matt T et al. (2014) Profiling of the Tox21 10K compound library for agonists and antagonists of the estrogen receptor alpha signaling pathway. Sci Rep 4:5664

Shockley, Keith R (2014) Using weighted entropy to rank chemicals in quantitative high-throughput screening experiments. J Biomol Screen 19:344-53

Teng, Christina; Goodwin, Bonnie; Shockley, Keith et al. (2013) Bisphenol A affects androgen receptor function via multiple mechanisms. Chem Biol Interact 203:556-64

Shockley, Keith R (2012) A three-stage algorithm to make toxicologically relevant activity calls from quantitative high throughput screening data. Environ Health Perspect 120:1107-15
