TGen and ASU scientists are working on a variety of research projects, funded through the NIH and other sources, that develop and examine molecular profiles of human diseases and of the fundamental pathways involved in disease states. The focus is to discern simple or complex sets of biomarkers useful for disease diagnosis and prognosis, as well as to develop molecular classifications for directing optimal therapeutic choice and identifying new targets. The molecular profile datasets being analyzed cover Alzheimer's disease, autism, diabetes, coronary heart disease, malignant gliomas, melanoma, pancreatic cancer, prostate cancer, colon cancer, multiple myeloma, and breast cancer. TGen and ASU scientists examine molecular profiles from several computational perspectives, using mathematical models of varying complexity, all of which attempt to identify genes and gene networks that play crucial roles in the molecular pathology. Each perspective involves some aspect of gene-gene interactions and their regulatory networks, creating a combinatorial problem whose computational solutions are limited by the computer's processing power, memory size, and I/O bandwidth. Both TGen and ASU have active groups of computational biologists, bioinformaticians, biostatisticians, scientific programmers, and engineers who work closely with biomedical and clinical scientists to develop computational and statistical tools that address complex biomedical questions. Examples of the tools developed at TGen and ASU include the following: pooling-based analysis for genome-wide association studies; SNP linkage and coverage analysis; a strong feature selection algorithm; inference of gene regulatory networks with prior knowledge; Markov chain-based simulation of gene regulatory networks; selection of cellular context based on microarray and clinical data; clustering of large microarray data; robust
error estimation of classification and feature selection algorithms; permutation tests for significance analysis; and visualization of tandem array blocks among two or more whole genomes. In addition to the tools developed in-house, TGen and ASU scientists use a variety of open-source computational tools, such as PLINK, Haploview, STRUCTURE, MUMmer, mpiBLAST, and NAMD. The scientists also use a variety of commercial software tools, such as GeneGo MetaCore, Ingenuity, GeneSpring, Varia, Sequencer, and Mutation Surveyor. The data types being analyzed include gene expression arrays, SNPs, CGH, siRNA, sequences, and clinical features. Many of the in-house and open-source computational tools require scalable parallel processing power and a large amount of processor memory to examine the enormous complexity of the solution space. Conventional uniprocessor or vector processing machines do not provide the computational performance needed to solve many of the complex research problems that TGen and ASU scientists face within a practical wall-clock time. As high-throughput measurement instruments allow scientists to collect increasingly large amounts of fine-grained data, a scalable parallel supercomputer system is required for biomedical scientists to explore large volumes of complex data and perform more realistic systems analyses. A scalable parallel supercomputer system with a 64-bit memory architecture and high-speed I/O allows scientists to employ optimal analytical approaches, whereas on a conventional computer system suboptimal approaches are employed to avoid the prohibitively long analysis time that optimal mathematical models and computational algorithms would require. For many computational problems, reading large input data files from, and writing large amounts of processed results to, a disk storage subsystem is a bottleneck in the overall analysis time.
A 64-bit parallel cluster computing architecture with high-bandwidth I/O and a high-speed storage subsystem will allow efficient development and use of computational models and algorithms that can take full advantage of parallel computing power. The success of TGen and ASU scientists to date has come at the sacrifice of time. However, individuals affected by disease do not have the luxury of time. The requested parallel cluster-computing instrument will optimize TGen and ASU researchers' ability to meet their data analysis and systems modeling needs efficiently, fostering timely and effective biomedical discovery for improved human health.