The University of Pennsylvania has trained computational genomicists for the past 20 years with support from the NHGRI T32 program, training 52 predoctoral and 13 postdoctoral trainees, the majority of whom have gone on to careers in research and development. Here we propose to continue our Computational Genomic Training program with eight predoctoral and two postdoctoral trainees, focused on the theme of Data Science and Machine/Statistical Learning methods as applied to genomics data. Our training program centers on a rigorous curriculum built from courses in multiple graduate groups. In addition, our training includes 13-16 hours of Responsible Conduct of Research (RCR) and Scientific Rigor and Reproducibility (SRR) training, an Individual Development Plan, and the use of electronic notebooks and code repositories. Research training is enhanced by a dual mentorship model whenever possible. Our program is supported by a greater genomics training program consisting of three NHGRI T32 programs in Computational Genomics (this program), Genomic Medicine, and ELSI, as well as an NHGRI R25 Diversity Action Plan (DAP). In particular, the DAP program recruits underrepresented minority (URM) undergraduate and postbaccalaureate trainees, focusing on the Greater Philadelphia Area, whose many institutions serve the urban URM population. This vertical regional integration will allow us to develop a regional pipeline of URM students with undergraduate research experience in genomics. Our program consists of 31 trainers, of whom 14 are female scientists and 2 are URM scientists. Twenty-one of our 31 trainers have an active computational genomics research program. The expertise of the trainers spans disease genomics, genomic technologies, multidimensional statistics, algorithms, data sciences, and machine learning.
Our training environment is enhanced by key facilities, including large biobanks, a high-throughput genomics core, a high-performance computing core, and a unique immersive data visualization facility. Penn hosts more than 60 NIH training programs, with strong institutional administrative support for managing them, including an Office of Biomedical Postdoctoral Programs, an Office of Diversity and Inclusion, and combined Biomedical Graduate Studies, among others. The success of our training program will help train the next generation of the genomics workforce in the skills and knowledge necessary to apply state-of-the-art computational techniques to genomics and to develop new techniques for novel genomic data.

Public Health Relevance

This project will create a training program in which graduate students, post-graduate students, and MD/PhD students acquire the skills to apply Data Science and Artificial Intelligence techniques to the genomics of human diseases. The success of the program will contribute to training the next generation of the biomedical workforce in the application of state-of-the-art computation to genomics data.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Institutional National Research Service Award (T32)
Project #: 2T32HG000046-21
Application #: 9936694
Study Section: Special Emphasis Panel (ZHG1)
Program Officer: Gatlin, Tina L
Project Start: 1999-07-16
Project End: 2025-04-30
Budget Start: 2020-05-01
Budget End: 2021-04-30
Support Year: 21
Fiscal Year: 2020
Total Cost:
Indirect Cost:
Name: University of Pennsylvania
Department: Biology
Type: Schools of Arts and Sciences
DUNS #: 042250712
City: Philadelphia
State: PA
Country: United States
Zip Code: 19104
Zheng, Qi; Bartow-McKenney, Casey; Meisel, Jacquelyn S et al. (2018) HmmUFOtu: An HMM and phylogenetic placement based ultra-fast taxonomic assignment and OTU picking tool for microbiome amplicon sequencing studies. Genome Biol 19:82
SanMiguel, Adam J; Meisel, Jacquelyn S; Horwinski, Joseph et al. (2018) Antiseptic Agents Elicit Short-Term, Personalized, and Body Site-Specific Shifts in Resident Skin Bacterial Communities. J Invest Dermatol 138:2234-2243
Shields, Emily J; Sheng, Lihong; Weiner, Amber K et al. (2018) High-Quality Genome Assemblies Reveal Long Non-coding RNAs Expressed in Ant Brains. Cell Rep 23:3078-3090
Way, Gregory P; Greene, Casey S (2018) Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders. Pac Symp Biocomput 23:80-91
Meisel, Jacquelyn S; Sfyroera, Georgia; Bartow-McKenney, Casey et al. (2018) Commensal microbiota modulate gene expression in the skin. Microbiome 6:20
Way, Gregory P; Sanchez-Vega, Francisco; La, Konnor et al. (2018) Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas. Cell Rep 23:172-180.e3
Knijnenburg, Theo A; Wang, Linghua; Zimmermann, Michael T et al. (2018) Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas. Cell Rep 23:239-254.e6
DuBois, Steven G; Mody, Rajen; Naranjo, Arlene et al. (2017) MIBG avidity correlates with clinical features, tumor biology, and outcomes in neuroblastoma: A report from the Children's Oncology Group. Pediatr Blood Cancer 64:
Mellis, Ian A; Gupte, Rohit; Raj, Arjun et al. (2017) Visualizing adenosine-to-inosine RNA editing in single mammalian cells. Nat Methods 14:801-804
Beagan, Jonathan A; Duong, Michael T; Titus, Katelyn R et al. (2017) YY1 and CTCF orchestrate a 3D chromatin looping switch during early neural lineage commitment. Genome Res 27:1139-1152

Showing the most recent 10 out of 95 publications