Copy number variants (CNVs) are common and can be measured on a genomic scale using high-throughput genotyping platforms. However, copy number estimates from existing algorithms are often inaccurate and imprecise. During the mentored phase of this Award, first-generation algorithms for the locus-level estimation of copy number and hidden Markov models to identify regions of copy number gain and loss were developed. During the R00 phase, these algorithms will be improved through generalizations that broaden the scope of the problems they address and will be finalized as open source software. Specific methodologic areas that will be targeted during the R00 phase of this Award include improved estimates of uncertainty that distinguish between outliers and CNVs, and statistical models that 'borrow strength' across loci and across samples. In parallel with the development of these methods, we will explore new approaches for CNV-phenotype inference that accommodate features of the study design (e.g., unrelated subjects versus case-parent trios) and biological characteristics of the disease. For instance, we expect that extensions of segmentation methods and principal component analysis will benefit the study of cancer genomes.

Activities during the K99 phase of this Award will be helpful for achieving these goals. First, I successfully identified a tenure-track faculty position at the rank of Assistant Professor. Faculty in the Department of Oncology are supportive of the development of these methods. Second, I developed new collaborations during the K99 phase that will spur the development of statistical methods in the above areas. Finally, I participated in workshops and statistical conferences and completed coursework in subject-relevant areas that will be helpful as I transition to the independent phase. I look forward to competing for R01 funding opportunities as new technologies and applications for genomic research emerge.

Public Health Relevance

Approximately 10% of the genome is thought to be variable in copy number. Statistical approaches to measure this variation on a genomic scale and assess its contribution to physical traits are needed.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Transition Award (R00)
Project #
5R00HG005015-04
Application #
8258323
Study Section
Special Emphasis Panel (NSS)
Program Officer
Brooks, Lisa
Project Start
2010-06-11
Project End
2013-09-30
Budget Start
2012-04-01
Budget End
2013-09-30
Support Year
4
Fiscal Year
2012
Total Cost
$242,663
Indirect Cost
$94,698
Name
Johns Hopkins University
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
001910777
City
Baltimore
State
MD
Country
United States
Zip Code
21218
Younkin, Samuel G; Scharpf, Robert B; Schwender, Holger et al. (2015) A genome-wide study of inherited deletions identified two regions associated with nonsyndromic isolated oral clefts. Birth Defects Res A Clin Mol Teratol 103:276-83
Scharpf, Robert B; Mireles, Lynn; Yang, Qiong et al. (2014) Copy number polymorphisms near SLC2A9 are associated with serum uric acid concentrations. BMC Genet 15:81
Scharpf, Robert B; Beaty, Terri H; Schwender, Holger et al. (2012) Fast detection of de novo copy number variants from SNP arrays for case-parent trios. BMC Bioinformatics 13:330
Scharpf, Robert B; Irizarry, Rafael A; Ritchie, Matthew E et al. (2011) Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw 40:1-32
Halper-Stromberg, Eitan; Frelin, Laurence; Ruczinski, Ingo et al. (2011) Performance assessment of copy number microarray platforms using a spike-in experiment. Bioinformatics 27:1052-60
Scharpf, Robert B; Ruczinski, Ingo; Carvalho, Benilton et al. (2011) A multilevel model to address batch effects in copy number estimation using SNP arrays. Biostatistics 12:33-50
Leek, Jeffrey T; Scharpf, Robert B; Bravo, Héctor Corrada et al. (2010) Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet 11:733-9