TGen and ASU scientists are working on a variety of research projects, funded through the NIH and other sources, that develop and examine molecular profiles of human diseases and the fundamental pathways involved in disease states. The focus is to discern simple or complex sets of biomarkers useful for disease diagnosis and prognosis, as well as to develop molecular classifications for directing optimal therapeutic choice and identifying new targets. The molecular profile datasets being analyzed cover Alzheimer's Disease, Autism, Diabetes, Coronary Heart Disease, Malignant Gliomas, Melanoma, Pancreatic Cancer, Prostate Cancer, Colon Cancer, Multiple Myeloma, and Breast Cancer. TGen and ASU scientists examine molecular profiles from several computational perspectives, using mathematical models of varying complexity, all of which attempt to identify genes and gene networks that play crucial roles in the molecular pathology. Each perspective involves some aspect of gene-gene interactions and their regulatory networks, creating a combinatorial problem whose computational solutions are limited by the computer's processing power, memory size, and I/O bandwidth.
Both TGen and ASU have active groups of computational biologists, bioinformaticians, biostatisticians, scientific programmers, and engineers who work closely with biomedical and clinical scientists to develop computational and statistical tools that address complex biomedical questions. Examples of the tools developed at TGen and ASU include the following: pooling-based analysis for genome-wide association studies; SNP linkage and coverage analysis; strong feature selection algorithm; inference of gene regulatory networks with prior knowledge; Markov chain-based simulation of gene regulatory networks; selection of cellular context based on microarray and clinical data; clustering of large microarray data; robust error estimation of classification and feature selection algorithms; permutation tests for significance analysis; and visualization of tandem array blocks among two or more whole genomes. In addition to the tools developed in-house, TGen and ASU scientists use a variety of open-source computational tools, such as PLINK, Haploview, STRUCTURE, MUMmer, mpiBLAST, and NAMD. The scientists also use a variety of commercial software tools, such as GeneGo MetaCore, Ingenuity, GeneSpring, Varia, Sequencher, and Mutation Surveyor. The data types being analyzed include gene expression arrays, SNPs, CGH, siRNA, sequences, and clinical features.
Many of the in-house and open-source computational tools require scalable parallel processing power and large amounts of memory to examine the enormous complexity of the solution space. Conventional uniprocessor or vector processing machines do not provide the computational performance needed to solve many of the complex research problems that TGen and ASU scientists face within a practical wall-clock time. As high-throughput measurement instruments allow scientists to collect increasingly large amounts of fine-grained data, a scalable parallel supercomputer system is required for biomedical scientists to explore large volumes of complex data and perform more realistic systems analyses. A scalable parallel supercomputer system with a 64-bit memory architecture and high-speed I/O allows scientists to employ optimal analytical approaches, whereas on a conventional computer system suboptimal approaches are employed to avoid the prohibitively long analysis times that optimal mathematical models and computational algorithms would require. For many computational problems, reading large input data files from, and writing large amounts of processed results to, a disk storage subsystem is a bottleneck in the overall analysis time.
A 64-bit parallel cluster computing architecture with high-bandwidth I/O and a high-speed storage subsystem will allow efficient development and use of computational models and algorithms that can take full advantage of parallel computing power. The success of TGen and ASU scientists to date has come at the expense of time. However, individuals affected by disease do not have the luxury of time. The requested parallel cluster-computing instrument will optimize TGen and ASU researchers' ability to meet their data analysis and systems modeling needs efficiently, fostering timely and effective biomedical discovery for improved human health.

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Biomedical Research Support Shared Instrumentation Grants (S10)
Project #: 1S10RR025056-01
Application #: 7497855
Study Section: Special Emphasis Panel (ZRG1-BST-G (30))
Program Officer: Tingle, Marjorie
Project Start: 2008-07-01
Project End: 2009-06-30
Budget Start: 2008-07-01
Budget End: 2009-06-30
Support Year: 1
Fiscal Year: 2008
Total Cost: $1,996,810
Indirect Cost:
Name: Translational Genomics Research Institute
Department:
Type:
DUNS #: 118069611
City: Phoenix
State: AZ
Country: United States
Zip Code: 85004
Liang, Winnie S; Aldrich, Jessica; Nasser, Sara et al. (2014) Simultaneous characterization of somatic events and HPV-18 integration in a metastatic cervical carcinoma patient using DNA and RNA sequencing. Int J Gynecol Cancer 24:329-38
Burgos, Kasandra Lovette; Javaherian, Ashkan; Bomprezzi, Roberto et al. (2013) Identification of extracellular miRNA in human cerebrospinal fluid by next-generation sequencing. RNA 19:712-22
Mooney, Marie; Bond, Jeffrey; Monks, Noel et al. (2013) Comparative RNA-Seq and microarray analysis of gene expression changes in B-cell lymphomas of Canis familiaris. PLoS One 8:e61088
Liang, Winnie S; Craig, David W; Carpten, John et al. (2012) Genome-wide characterization of pancreatic adenocarcinoma patients using next generation sequencing. PLoS One 7:e43192
Yousefi, Mohammadmahdi R; Dougherty, Edward R (2012) Performance reproducibility index for classification. Bioinformatics 28:2824-33
Weiss, Glen J; Liang, Winnie S; Izatt, Tyler et al. (2012) Paired tumor and normal whole genome sequencing of metastatic olfactory neuroblastoma. PLoS One 7:e37029
Holley, Tara; Lenkiewicz, Elizabeth; Evers, Lisa et al. (2012) Deep clonal profiling of formalin fixed paraffin embedded clinical samples. PLoS One 7:e50586
Egan, Jan B; Shi, Chang-Xin; Tembe, Waibhav et al. (2012) Whole-genome sequencing of multiple myeloma from diagnosis to plasma cell leukemia reveals genomic initiating events, evolution, and clonal tides. Blood 120:1060-6
Sima, Chao; Braga-Neto, Ulisses M; Dougherty, Edward R (2011) High-dimensional bolstered error estimation. Bioinformatics 27:3056-64
Yousefi, Mohammadmahdi R; Hua, Jianping; Dougherty, Edward R (2011) Multiple-rule bias in the comparison of classification rules. Bioinformatics 27:1675-83

Showing the most recent 10 out of 13 publications