Recent studies published by our group and others have shown that large-scale DNA copy number variants (CNVs), invisible at the cytogenetic level, are a ubiquitous characteristic of the human genome. Our findings indicated that, on average, two individuals differ by a dozen CNVs involving 3 Mb, or approximately 0.1% of the genome. This is comparable to the 0.1% of genetic difference that is due to single nucleotide polymorphisms (SNPs). However, in contrast to nucleotide sequence variants such as SNPs, structural variation in the genome has not been well characterized. Much remains to be learned about the genomic locations, frequency, and stability of these structural variants and their importance in human evolution and genetic disease. To enable further research in this area, it is necessary to expand the current knowledge of copy number variation by characterizing a large sample of individuals and constructing a database of validated CNVs. A comprehensive catalog of CNVs will facilitate large-scale studies of (1) the association of CNVs with disease risk, (2) the effects of CNVs on response to drug treatment, and (3) the role of structural variation in human evolution. We propose to build a data resource on genome copy number variation in 270 individuals from the International HapMap Project using a powerful high-resolution CNV discovery method, Representational Oligonucleotide Microarray Analysis (ROMA). We will perform ROMA scans using a 380,000-probe array that provides a resolution of 8 kb. In addition, we will integrate our data with CNV information obtained using other CNV discovery methods. We will select a set of 600 common CNVs (minor allele frequency >= 1%) for fine-scale characterization, and the boundaries of these common CNVs will be defined at higher resolution using a tiling-path oligonucleotide array with a resolution of one probe every 5 bp.
For a further subset of deletions and duplications, we will characterize the CNV junctions at the sequence level. Lastly, to integrate CNVs into the context of the SNP-based HapMap, we will identify SNP markers that are in linkage disequilibrium with CNVs. All information on copy number variation will be made available through dbSNP, and raw microarray data will be made available from

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Biotechnology Resource Grants (P41)
Study Section
Special Emphasis Panel (ZHG1-HGR-M (J1))
Program Officer
Brooks, Lisa
University of California San Diego
Schools of Medicine
La Jolla
United States
Malhotra, Dheeraj; Sebat, Jonathan (2012) CNVs: harbingers of a rare variant revolution in psychiatric genetics. Cell 148:1223-41
Lin, Chang-Yun; Lo, Yungtai; Ye, Kenny Q (2012) Genotype copy number variations using Gaussian mixture models: theory and algorithms. Stat Appl Genet Mol Biol 11:5
Malhotra, Dheeraj; McCarthy, Shane; Michaelson, Jacob J et al. (2011) High frequencies of de novo CNVs in bipolar disorder and schizophrenia. Neuron 72:951-63
Vacic, Vladimir; McCarthy, Shane; Malhotra, Dheeraj et al. (2011) Duplications of the neuropeptide receptor gene VIPR2 confer significant risk for schizophrenia. Nature 471:499-503
Nord, Alex S; Roeb, Wendy; Dickel, Diane E et al. (2011) Reduced transcript expression of genes affected by inherited and de novo CNVs in autism. Eur J Hum Genet 19:727-31
1000 Genomes Project Consortium; Abecasis, Gonçalo R; Altshuler, David et al. (2010) A map of human genome variation from population-scale sequencing. Nature 467:1061-73
Wang, Tao; Lin, Chang-Yun; Rohan, Thomas E et al. (2010) Resequencing of pooled DNA for detecting disease associations with rare variants. Genet Epidemiol 34:492-501
Kim, Wonkuk; Gordon, Derek; Sebat, Jonathan et al. (2008) Computing power and sample size for case-control association studies with copy number polymorphism: application of mixture-based likelihood ratio test. PLoS One 3:e3475
Kusenda, M; Sebat, J (2008) The role of rare structural variants in the genetics of autism spectrum disorders. Cytogenet Genome Res 123:36-43