Two forms of genetic variation are common and can be measured on a genomic scale using recent high-throughput genotyping platforms: single nucleotide polymorphisms (SNPs) and copy number variants (CNVs). Whereas high-throughput genotyping algorithms are highly accurate, copy number estimates remain imprecise, and tools for estimating copy number and inferring regions of CNV are still under development. My immediate scientific goals are to provide first-generation algorithms for each of the following tiers of estimation problems: (i) by locus: estimate the raw copy number at each locus on the array and quantify its uncertainty; (ii) by sample: infer regions of CNV; and (iii) between samples: assess the contribution of CNV to disease susceptibility. My long-term goal is to establish an interdisciplinary research lab in biostatistics and human genetics that supports creative computational and statistical solutions to problems in high-throughput genomic data analysis. This Award will provide the training and skills needed to transition to independent research through formal coursework in statistical genetics and computational biology; leadership opportunities in structured career development activities, such as the GWAs@JohnsHopkins working group; new collaborations with multiple research institutes; and presentations at national conferences, including epidemiological (American Heart Association), methodological (Joint Statistical Meetings), and topical (e.g., a copy number variant workshop) meetings. A scientific advisory panel of internationally recognized experts will oversee my research. New technologies and applications for genomic research developed during the course of this Award will create new opportunities for biostatistical research, as well as R01 funding opportunities that I will actively pursue.
Younkin, Samuel G; Scharpf, Robert B; Schwender, Holger et al. (2015) A genome-wide study of inherited deletions identified two regions associated with nonsyndromic isolated oral clefts. Birth Defects Res A Clin Mol Teratol 103:276-83
Scharpf, Robert B; Mireles, Lynn; Yang, Qiong et al. (2014) Copy number polymorphisms near SLC2A9 are associated with serum uric acid concentrations. BMC Genet 15:81
Scharpf, Robert B; Beaty, Terri H; Schwender, Holger et al. (2012) Fast detection of de novo copy number variants from SNP arrays for case-parent trios. BMC Bioinformatics 13:330
Scharpf, Robert B; Ruczinski, Ingo; Carvalho, Benilton et al. (2011) A multilevel model to address batch effects in copy number estimation using SNP arrays. Biostatistics 12:33-50
Scharpf, Robert B; Irizarry, Rafael A; Ritchie, Matthew E et al. (2011) Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw 40:1-32
Halper-Stromberg, Eitan; Frelin, Laurence; Ruczinski, Ingo et al. (2011) Performance assessment of copy number microarray platforms using a spike-in experiment. Bioinformatics 27:1052-60
Leek, Jeffrey T; Scharpf, Robert B; Bravo, Héctor Corrada et al. (2010) Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet 11:733-9