We propose to develop novel statistical methods and software tools for disease association testing with rare variants, with particular application to autism. Although genome-wide association studies have led to the discovery of many common variants reproducibly associated with various complex traits, these variants have small effect sizes and together explain only a small fraction of the total estimated trait heritability. Recent advances in next-generation sequencing technologies allow, for the first time, an objective assessment of the importance of rare variants in complex diseases. Over the past few years it has become clear from numerous empirical studies that rare variants are an important contributor to disease risk. This is especially compelling for psychiatric diseases, such as schizophrenia and autism, where common disease susceptibility variants have been more difficult to identify. Traditional association testing strategies that have worked well for common variants have low power for the analysis of rare variants, mostly due to the large number of such variants in any genetic region and their low frequency counts in datasets of realistic sizes. Therefore, the development of powerful methods for rare variant analysis is greatly needed in order to efficiently extract information from the many sequencing datasets currently being generated. In this application we propose novel methods for both population- and family-based designs to identify rare genetic variants that influence the risk of complex diseases, with particular application to autism. In particular, we focus on methods development in the following areas: family-based testing strategies for rare variants, unified testing strategies to efficiently combine family-based and population-based studies, and refinement strategies to identify causal rare variants once an overall association at a gene or region level has been established.
We will implement the new methods in a comprehensive software package to be made available to the scientific community. Furthermore, we will apply these methods to whole-exome data from 1000 autism cases, 1000 matched controls, and 500 autism trios. We believe the proposed research is timely and has the potential to be of great public health importance through direct application to autism, and more broadly to other complex diseases.

Public Health Relevance

Autism and other psychiatric diseases are major public health problems. The proposed statistical methodology, with direct application to autism, will help identify genetic variants that influence autism risk, with important implications for public health.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Study Section
Behavioral Genetics and Epidemiology Study Section (BGES)
Program Officer
Addington, Anjene M
Columbia University (N.Y.)
Biostatistics & Other Math Sci
Schools of Public Health
New York
United States
Ionita-Laza, Iuliana; Xu, Bin; Makarov, Vlad et al. (2014) Scan statistic-based analysis of exome sequencing data identifies FAN1 at 15q13.3 as a susceptibility gene for schizophrenia and autism. Proc Natl Acad Sci U S A 111:343-8
Ionita-Laza, Iuliana; Lee, Seunggeun; Makarov, Vladimir et al. (2013) Family-based association tests for sequence data, and comparisons with population-based association tests. Eur J Hum Genet 21:1158-62
De, Gourab; Yip, Wai-Ki; Ionita-Laza, Iuliana et al. (2013) Rare variant analysis for family-based design. PLoS One 8:e48495
Krebs, Catharine E; Karkheiran, Siamak; Powell, James C et al. (2013) The Sac1 domain of SYNJ1 identified mutated in a family with early-onset progressive Parkinsonism with generalized seizures. Hum Mutat 34:1200-7