The objective of this project is to develop software for the analysis of data from large- scale genotyping and sequencing genetic studies, building on the existing software package PLINK. PLINK, a software tool to manipulate and analyze whole-genome SNP datasets that has been actively developed over the past four years and has a wide base of users.
The specific aims are to significantly upgrade core capacities, the interface, auxiliary resources and user-support: Core capacities: significantly adapt and upgrade data-storage capacities to handle a) order-of-magnitude larger datasets than can fit into memory and b) a more generic, unified representation of different types of genetic variation data and meta-information. Interface: extend the existing interface to provide a) a looser coupling between data storage and analysis components, via multiple interfaces in external languages, including standard bioinformatics tools such as R and Perl, and b) features designed to facilitate reproducible research and parallel processing. Auxiliary resources: package standard existing resources, including the functional annotation of variants, reference genome sequences and gene assemblies, pathways and ontologies, in a manner that allows seamless integration between genomic resources and user data. Support: create high-quality collection resources to support users, via online documentation and tutorials, including user-generated wiki pages, e-mail support and an annual training course. Particular attention will be paid to ensure interoperability with other major software, file-formats and resources that are generated by the broader genetics community.
This Project is to develop software for the analysis of large datasets from modern genetic studies. New high-throughput genotyping and sequencing technologies are capable of producing vast amounts of data, but there is a need for analytic tools that biomedical researchers can use. These studies have the potential to uncover genetic determinants for a large number of diseases and traits, which can be relevant for prediction of risk, and give insight into novel targets for treatments.
|Ruderfer, Douglas M; Hamamsy, Tymor; Lek, Monkol et al. (2016) Patterns of genic intolerance of rare copy number variation in 59,898 human exomes. Nat Genet 48:1107-11|
|Ruderfer, Douglas M; Charney, Alexander W; Readhead, Ben et al. (2016) Polygenic overlap between schizophrenia risk and antipsychotic response: a genomic medicine approach. Lancet Psychiatry 3:350-7|
|Rees, E; Kirov, G; Walters, J T et al. (2015) Analysis of exome sequence in 604 trios for recessive genotypes in schizophrenia. Transl Psychiatry 5:e607|
|Fromer, Menachem; Purcell, Shaun M (2014) Using XHMM Software to Detect Copy Number Variation in Whole-Exome Sequencing Data. Curr Protoc Hum Genet 81:7.23.1-21|
|Purcell, Shaun M; Moran, Jennifer L; Fromer, Menachem et al. (2014) A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506:185-90|
|Fromer, Menachem; Pocklington, Andrew J; Kavanagh, David H et al. (2014) De novo mutations in schizophrenia implicate synaptic networks. Nature 506:179-84|
|Sham, Pak C; Purcell, Shaun M (2014) Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet 15:335-46|
|Fromer, Menachem; Moran, Jennifer L; Chambert, Kimberly et al. (2012) Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet 91:597-607|
|Ruderfer, D M; Kirov, G; Chambert, K et al. (2011) A family-based study of common polygenic variation and risk of schizophrenia. Mol Psychiatry 16:887-8|