Software for the analysis of large-scale genotyping and sequencing studies

Purcell, Shaun

Abstract

The objective of this project is to develop software for the analysis of data from large-scale genotyping and sequencing genetic studies, building on the existing software package PLINK. PLINK is a software tool for manipulating and analyzing whole-genome SNP datasets; it has been actively developed over the past four years and has a wide base of users.
The specific aims are to significantly upgrade core capacities, the interface, auxiliary resources and user support:
Core capacities: significantly adapt and upgrade data-storage capacities to handle a) datasets an order of magnitude larger than can fit into memory and b) a more generic, unified representation of different types of genetic variation data and meta-information.
Interface: extend the existing interface to provide a) a looser coupling between data storage and analysis components, via multiple interfaces in external languages, including standard bioinformatics tools such as R and Perl, and b) features designed to facilitate reproducible research and parallel processing.
Auxiliary resources: package standard existing resources, including the functional annotation of variants, reference genome sequences and gene assemblies, pathways and ontologies, in a manner that allows seamless integration between genomic resources and user data.
Support: create a high-quality collection of resources to support users, via online documentation and tutorials, including user-generated wiki pages, e-mail support and an annual training course.
Particular attention will be paid to ensuring interoperability with other major software, file formats and resources generated by the broader genetics community.

Public Health Relevance

This project will develop software for the analysis of large datasets from modern genetic studies. New high-throughput genotyping and sequencing technologies are capable of producing vast amounts of data, but there is a need for analytic tools that biomedical researchers can use. These studies have the potential to uncover genetic determinants for a large number of diseases and traits, which can be relevant for the prediction of risk and can give insight into novel targets for treatments.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG005827-01
Application #: 7934359
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brooks, Lisa

Project Start: 2010-09-27
Project End: 2013-06-30
Budget Start: 2010-09-27
Budget End: 2011-06-30
Support Year: 1
Fiscal Year: 2010
Total Cost: $319,350

Institution

Name: Massachusetts General Hospital
DUNS #: 073130411

City: Boston
State: MA
Country: United States
Zip Code: 02199

Related projects


NIH 2016 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Brigham and Women's Hospital	$370,000
NIH 2015 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Icahn School of Medicine at Mount Sinai
NIH 2014 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Icahn School of Medicine at Mount Sinai	$370,000
NIH 2012 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Icahn School of Medicine at Mount Sinai	$335,610
NIH 2011 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Massachusetts General Hospital	$95,568
NIH 2011 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Icahn School of Medicine at Mount Sinai	$244,092
NIH 2010 R01 HG	Software for the analysis of large-scale genotyping and sequencing studies Purcell, Shaun M. / Massachusetts General Hospital	$319,350

Publications

Grinde, Kelsey E; Qi, Qibin; Thornton, Timothy A et al. (2018) Generalizing polygenic risk scores from Europeans to Hispanics/Latinos. Genet Epidemiol :

Ruderfer, Douglas M; Charney, Alexander W; Readhead, Ben et al. (2016) Polygenic overlap between schizophrenia risk and antipsychotic response: a genomic medicine approach. Lancet Psychiatry 3:350-7

Ruderfer, Douglas M; Hamamsy, Tymor; Lek, Monkol et al. (2016) Patterns of genic intolerance of rare copy number variation in 59,898 human exomes. Nat Genet 48:1107-11

Rees, E; Kirov, G; Walters, J T et al. (2015) Analysis of exome sequence in 604 trios for recessive genotypes in schizophrenia. Transl Psychiatry 5:e607

Sham, Pak C; Purcell, Shaun M (2014) Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet 15:335-46

Purcell, Shaun M; Moran, Jennifer L; Fromer, Menachem et al. (2014) A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506:185-90

Fromer, Menachem; Pocklington, Andrew J; Kavanagh, David H et al. (2014) De novo mutations in schizophrenia implicate synaptic networks. Nature 506:179-84

Fromer, Menachem; Purcell, Shaun M (2014) Using XHMM Software to Detect Copy Number Variation in Whole-Exome Sequencing Data. Curr Protoc Hum Genet 81:7.23.1-21

Fromer, Menachem; Moran, Jennifer L; Chambert, Kimberly et al. (2012) Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet 91:597-607

Ruderfer, D M; Kirov, G; Chambert, K et al. (2011) A family-based study of common polygenic variation and risk of schizophrenia. Mol Psychiatry 16:887-8
