The unprecedented progress in technologies for generating genomic data has led to an imbalance in which the analysis of these data is now becoming the bottleneck. Common methods in the statistician's toolbox often falter in the face of these datasets, which are massive not only in the number of data points but also in the dimension of the parameters to be estimated. Each of the four projects will face these challenges. It will be the responsibility of Core C to collaborate with project researchers in developing novel computational methods and tools that scale well. As an example, Project 1 will rely heavily on Markov chain Monte Carlo (MCMC) and high-dimensional regression. Fitting the parameters of these statistical models entails a massive number of iterations, so the development of innovative approaches, such as data-parallel algorithms for Graphics Processing Units, will be a critical activity of the core. Other projects involve deploying extensive simulations that explore a constellation of model parameterizations, assumptions about disease effects, false discovery rates, etc. To this end, we will streamline such processes with reusable code that can be easily tailored to specific simulation experiments.
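As an illustration of what reusable simulation code over a constellation of parameter settings might look like, the following is a minimal Python sketch. It is not the Core's actual software: the names `run_grid` and `toy_power_experiment`, the grid values, and the choice of a simple z-test are all assumptions made for this example.

```python
import itertools
import random
import statistics


def run_grid(simulate, grid, n_reps=200, seed=0):
    """Run simulate(rng, **params) n_reps times for every combination in grid.

    `grid` maps a parameter name to the list of values to explore.
    Returns one summary row (a dict) per parameter combination.
    """
    names = list(grid)
    results = []
    combos = itertools.product(*(grid[name] for name in names))
    for i, values in enumerate(combos):
        params = dict(zip(names, values))
        # One reproducible random stream per parameter setting.
        rng = random.Random(seed + i)
        outcomes = [simulate(rng, **params) for _ in range(n_reps)]
        results.append({**params, "mean_outcome": statistics.fmean(outcomes)})
    return results


def toy_power_experiment(rng, effect, n):
    """Hypothetical experiment: 1.0 if a two-sided z-test rejects at the 0.05 level."""
    sample = [rng.gauss(effect, 1.0) for _ in range(n)]
    z = statistics.fmean(sample) * n ** 0.5  # variance is known to be 1
    return 1.0 if abs(z) > 1.96 else 0.0


# Assumed grid for illustration: three effect sizes crossed with two sample sizes.
grid = {"effect": [0.0, 0.2, 0.5], "n": [50, 200]}
summary = run_grid(toy_power_experiment, grid, n_reps=200, seed=1)
for row in summary:
    print(row)
```

Tailoring the harness to a different experiment only requires passing another `simulate` function and another `grid`; the looping, seeding, and summarizing logic stays unchanged.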

Public Health Relevance

The High Performance Computing and Simulations Core (Core C) will create simulation pipelines and high-performance software libraries, and will assist project investigators with their implementations. The Core will also develop new user-friendly web applications that let users quickly deploy and test new simulations.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Program Projects (P01)
Project #: 5P01CA196569-02
Application #: 9359367
Study Section: Special Emphasis Panel (ZCA1)
Program Officer: Rotunno, Melissa
Project Start:
Project End:
Budget Start: 2017-07-01
Budget End: 2018-06-30
Support Year: 2
Fiscal Year: 2017
Total Cost:
Indirect Cost:
Name: University of Southern California
Department:
Type:
DUNS #: 072933393
City: Los Angeles
State: CA
Country: United States
Zip Code: 90033
Moss, Lilit C; Gauderman, William J; Lewinger, Juan Pablo et al. (2018) Using Bayes model averaging to leverage both gene main effects and G × E interactions to identify genomic regions in genome-wide association studies. Genet Epidemiol :
Ryser, Marc D; Min, Byung-Hoon; Siegmund, Kimberly D et al. (2018) Spatial mutation patterns as markers of early colorectal tumor cell mobility. Proc Natl Acad Sci U S A 115:5774-5779
Liu, Jie; Liang, Gangning; Siegmund, Kimberly D et al. (2018) Data integration by multi-tuning parameter elastic net regression. BMC Bioinformatics 19:369
Ritz, Beate R; Chatterjee, Nilanjan; Garcia-Closas, Montserrat et al. (2017) Lessons Learned From Past Gene-Environment Interaction Successes. Am J Epidemiol 186:778-786
Gauderman, W James; Mukherjee, Bhramar; Aschard, Hugues et al. (2017) Update on the State of the Science for Analytical Methods for Gene-Environment Interactions. Am J Epidemiol 186:762-770
Thomas, Duncan C (2017) Estimating the Effect of Targeted Screening Strategies: An Application to Colonoscopy and Colorectal Cancer. Epidemiology 28:470-478
Rao, D C; Sung, Yun J; Winkler, Thomas W et al. (2017) Multiancestry Study of Gene-Lifestyle Interactions for Cardiovascular Traits in 610 475 Individuals From 124 Cohorts: Design and Rationale. Circ Cardiovasc Genet 10:
The Gene Ontology Consortium (2017) Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res 45:D331-D338
Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya et al. (2017) PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res 45:D183-D189
Gref, Anna; Merid, Simon K; Gruzieva, Olena et al. (2017) Genome-Wide Interaction Analysis of Air Pollution Exposure and Childhood Asthma with Functional Follow-up. Am J Respir Crit Care Med 195:1373-1383

Showing the most recent 10 out of 28 publications