The unprecedented progress in technologies for generating genomic data has created an imbalance in which the analysis of these data is now the bottleneck. Common methods in the statistician's toolbox often falter in the face of these datasets, which are massive not only in the number of data points but also in the dimension of the parameters to be estimated. Each of the four projects will face these challenges. Core C will be responsible for collaborating with project researchers to develop novel computational methods and tools that scale well. As an example, Project 1 will rely heavily on Markov chain Monte Carlo (MCMC) and high-dimensional regression. Fitting parameters with these statistical models entails a massive number of iterations, so developing innovative approaches, such as data-parallel algorithms for Graphics Processing Units, will be a critical activity of the core. Other projects involve deploying extensive simulations that explore a constellation of model parameterizations, assumptions about disease effects, false discovery rates, etc. To this end, we will streamline such processes with reusable code that can be easily tailored to specific simulation experiments.

Public Health Relevance

The High Performance Computing and Simulations Core (Core C) will create simulation pipelines and high-performance software libraries and will assist project investigators with implementations. The Core will also develop user-friendly web applications that let users quickly deploy and test new simulations.

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Research Program Projects (P01)
Project #
5P01CA196569-02
Application #
9359367
Study Section
Special Emphasis Panel (ZCA1)
Program Officer
Rotunno, Melissa
Project Start
Project End
Budget Start
2017-07-01
Budget End
2018-06-30
Support Year
2
Fiscal Year
2017
Total Cost
Indirect Cost
Name
University of Southern California
Department
Type
DUNS #
072933393
City
Los Angeles
State
CA
Country
United States
Zip Code
90033
Ryser, Marc D; Min, Byung-Hoon; Siegmund, Kimberly D et al. (2018) Spatial mutation patterns as markers of early colorectal tumor cell mobility. Proc Natl Acad Sci U S A 115:5774-5779
Liu, Jie; Liang, Gangning; Siegmund, Kimberly D et al. (2018) Data integration by multi-tuning parameter elastic net regression. BMC Bioinformatics 19:369
Moss, Lilit C; Gauderman, William J; Lewinger, Juan Pablo et al. (2018) Using Bayes model averaging to leverage both gene main effects and G × E interactions to identify genomic regions in genome-wide association studies. Genet Epidemiol :
McAllister, Kimberly; Mechanic, Leah E; Amos, Christopher et al. (2017) Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases. Am J Epidemiol 186:753-761
Raskin, Leon; Guo, Yan; Du, Liping et al. (2017) Targeted sequencing of established and candidate colorectal cancer genes in the Colon Cancer Family Registry Cohort. Oncotarget 8:93450-93463
Ritchie, Marylyn D; Davis, Joe R; Aschard, Hugues et al. (2017) Incorporation of Biological Knowledge Into the Study of Gene-Environment Interactions. Am J Epidemiol 186:771-777
Patel, Chirag J; Kerr, Jacqueline; Thomas, Duncan C et al. (2017) Opportunities and Challenges for Environmental Exposure Assessment in Population-Based Studies. Cancer Epidemiol Biomarkers Prev 26:1370-1380
Thomas, Paul D (2017) The Gene Ontology and the Meaning of Biological Function. Methods Mol Biol 1446:15-24
Manrai, Arjun K; Cui, Yuxia; Bushel, Pierre R et al. (2017) Informatics and Data Analytics to Support Exposome-Based Discovery for Public Health. Annu Rev Public Health 38:279-294
Marconett, Crystal N; Zhou, Beiyun; Sunohara, Mitsuhiro et al. (2017) Cross-Species Transcriptome Profiling Identifies New Alveolar Epithelial Type I Cell-Specific Genes. Am J Respir Cell Mol Biol 56:310-321

Showing the most recent 10 out of 28 publications