Data Science Core ABSTRACT

The objective of the Texas A&M Superfund Research Center is to explore and develop descriptive models and tools that predict the hazardous outcomes of chemical exposure during environmental emergencies, and to produce solutions that mitigate the negative effects on human health. The ultimate goal of the Center is to strengthen decision-making for planning and control during emergency environmental contamination events. The Data Science Core is an essential component of the Center and will contribute to these goals by supporting the work of four Research Projects. The projects will produce high-dimensional data that require comprehensive analysis and expertise in state-of-the-art data science methodologies to translate raw experimental data into actionable insights and predictive models.

Directed by Dr. Christodoulos A. Floudas, in collaboration with Co-investigator Dr. Fred A. Wright, the Data Science Core will provide methods and services to Center researchers under three specific aims: (i) sharing expertise and providing support via advanced methodologies in data science and statistics; (ii) developing high-performance, novel methods for simultaneous regression or classification with dimensionality reduction and data integration; and (iii) constructing and maintaining a computational platform that will enable collaboration across the Center and facilitate dissemination of knowledge to the wider community and key stakeholders.

Project 1 will characterize exposure pathways of contaminated sediments that are vulnerable to movement and re-deposition due to storm activity; the Data Science Core will provide services for experimental design, hypothesis testing, and regression for contaminated sediment binding experiments. Project 2 will study the mitigation of adverse health effects of chemicals through broad-acting sorption materials; the Data Science Core will use predictive modeling of sorption activity, via advanced regression with simultaneous dimensionality reduction using nonlinear kernels, to guide experimental design and material property identification. Project 3 will investigate the inter-tissue and inter-individual variability in response to complex environmental mixtures; the Data Science Core will apply composite classification and clustering strategies to characterize chemical mixtures. Project 4 will develop single-cell, high-throughput platforms to quantify the endocrine disruptor potential of environmental contaminants and mixtures; the Data Science Core will aid in predicting the activity of multiple endocrine receptors through the construction and reduction of predictive models.

Furthermore, the Data Science Core will maximize productivity within the Center by establishing an environment for data sharing and collaboration via a computational platform service. The platform will also disseminate the results of the Center, including access to the final high-performance predictive models and tools, through interactive interfaces suitable for use by the scientific community.

Public Health Relevance

Data Science Core PROJECT NARRATIVE

The Data Science Core of the Texas A&M Superfund Research Center serves as the basis for translating the raw experimental data produced by the Research Projects into knowledge useful to the community, through data collection, integration, quality control, analysis, and model generation. The Core will use state-of-the-art methods in data science, optimization, and machine learning; develop and apply novel dimensionality reduction techniques; and establish a computational platform for collaboration within the Center and for data dissemination to Center stakeholders.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Environmental Health Sciences (NIEHS)
Type: Hazardous Substances Basic Research Grants Program (NIEHS) (P42)
Project #: 1P42ES027704-01
Application #: 9257876
Study Section: Special Emphasis Panel (ZES1)
Project Start:
Project End:
Budget Start: 2017-09-01
Budget End: 2018-03-31
Support Year: 1
Fiscal Year: 2017
Total Cost:
Indirect Cost:
Name: Texas A&M University
Department:
Type:
DUNS #: 020271826
City: College Station
State: TX
Country: United States
Zip Code: 77845

Wignall, Jessica A; Muratov, Eugene; Sedykh, Alexander et al. (2018) Conditional Toxicity Value (CTV) Predictor: An In Silico Approach for Generating Quantitative Risk Estimates for Chemicals. Environ Health Perspect 126:057008
Avraamidou, Styliani; Beykal, Burcu; Pistikopoulos, Ioannis P E et al. (2018) A hierarchical Food-Energy-Water Nexus (FEW-N) decision-making approach for Land Use Optimization. Int Symp Process Syst Eng 44:1885-1890
Avraamidou, Styliani; Milhorn, Aaron; Sarwar, Owais et al. (2018) Towards a Quantitative Food-Energy-Water Nexus Metric to Facilitate Decision Making in Process Systems: A Case Study on a Dairy Production Plant. ESCAPE 43:391-396
Guyton, Kathryn Z; Rusyn, Ivan; Chiu, Weihsueh A et al. (2018) Re: 'Application of the key characteristics of carcinogens in cancer hazard evaluation': response to Goodman, Lynch and Rhomberg. Carcinogenesis :
Rusyn, Ivan; Kleeberger, Steven R; McAllister, Kimberly A et al. (2018) Introduction to mammalian genome special issue: the combined role of genetics and environment relevant to human disease outcomes. Mamm Genome 29:1-4
Alves, Vinicius M; Borba, Joyce; Capuzzi, Stephen J et al. (2018) Oy Vey! A comment on "Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships (RASAR) Outperforming Animal Test Reproducibility". Toxicol Sci :
Orton, Daniel J; Tfaily, Malak M; Moore, Ronald J et al. (2018) A Customizable Flow Injection System for Automated, High Throughput, and Time Sensitive Ion Mobility Spectrometry and Mass Spectrometry Measurements. Anal Chem 90:737-744
Thiagarajan, Manasvini; Newman, Galen; Van Zandt, Shannon (2018) The Projected Impact of a Neighborhood-scaled Green Infrastructure Retrofit. Sustainability 10:
Onel, Melis; Kieslich, Chris A; Guzman, Yannis A et al. (2018) Simultaneous Fault Detection and Identification in Continuous Processes via nonlinear Support Vector Machine based Feature Selection. Int Symp Process Syst Eng 44:2077-2082
Low, Yen S; Alves, Vinicius M; Fourches, Denis et al. (2018) Chemistry-Wide Association Studies (CWAS): A Novel Framework for Identifying and Interpreting Structure-Activity Relationships. J Chem Inf Model 58:2203-2213

Showing the most recent 10 out of 56 publications