The goal of the Integrated Biostatistics and Bioinformatics Analysis Core (IBBAC; pronounced "eye-back") is to provide data analyses and access to state-of-the-field analysis tools, expertise, and leadership for the integrated or combined analysis of data arising from the Clinical Phenotype: Recruitment and Clinical Assessment (B) and Clinical Phenotype: Treatment Response (C) Cores as well as the four proposed projects. The IBBAC will take advantage of both existing analysis methods and tools for high-dimensional data types (e.g., partial least squares, support vector machines, and cluster analysis techniques) and novel methods and extensions of existing approaches for analyzing the data generated by the UCSD ACE researchers. The ultimate aim of this research is to identify a unique set of clinical, subclinical (e.g., imaging-based phenotypes), and genomic endpoints (or "fingerprints") that are correlated with Autism Spectrum Disorder (ASD) and/or Developmental Delay (DD) and are distinct from features found in typically developing children. The biological meaning of the "fingerprints" of ASD and DD emerging from these analyses will be a major consideration in assessing their validity, i.e., their consistency with the main motivating hypothesis of the center, which is that early postnatal brain overgrowth is the hallmark of ASD/DD pathogenesis. The need for novel multivariate data analysis methods in neuropsychiatric and behavioral genetics research of the type proposed has grown considerably with the introduction of data-intensive technologies such as large-scale genotyping assays and gene expression microarrays.
In addition, information-intensive phenotyping assays that could be used to complement genomic technologies, such as imaging technologies, multiplex behavioral assessments and elaborate psychometric exams, and large-scale endophenotype and/or cognitive assessment strategies, have been introduced, creating a further need for appropriate multivariate analysis methods. Although there is considerable research on mathematical models of multiparameter biological processes (e.g., gene transcription) and on data mining and pattern discovery strategies for genomic technologies, there has been less research on, and less actual implementation of, hypothesis-oriented multivariate data analysis methodologies that consider the information produced by genomic and multiplex phenotyping technologies either in isolation or in combination. The proposed IBBAC activity will address the development, deployment, and interpretation of novel multivariate analysis methods appropriate for drawing meaningful inferences from the high-dimensional genomic and phenotypic data generated as part of the proposed UCSD ACE research. Some of the proposed data analysis methodologies build on and extend a few fundamental multivariate techniques (e.g., the analysis of similarity and distance, multivariate regression, and variance component models).