This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

A Study of the Interoperability between TeraGrid and CNGrid by Experimenting Real-size Bioinformatics Applications

1. Introduction

TeraGrid is known as the world's largest grid infrastructure for supporting open scientific research in the US. The current deployment of TeraGrid provides extremely large computing and storage capacity for multi-disciplinary scientific and education communities [1]. The China National Grid (CNGrid) [2], which served as a nation-scale testbed for grid technology research and development in China in its first phase (2002-2005), aims to become a production environment for supporting open e-Science activities in its second phase (2007-2010). It is sponsored by the Chinese government under the Hi-tech 863 program, with awards of $13 million for the first phase and $36 million for the second phase. The current CNGrid environment is built around the interconnection of eight nationwide supercomputing centers, with a capacity of 20 Tflop/s and 200 TB of storage. Interoperation of these two grid testbeds benefits bilateral collaborative scientific applications through a larger pool of shared resources, uniform access interfaces for disparate underlying software stacks, and potentially better quality of service.

2. Objective

The main purpose of this project is to study interoperability issues and solutions between TeraGrid and CNGrid by experimenting with real-size bioinformatics applications that run across these two testbeds. Experience gained from this project will serve as a basis for the CNGrid Interoperability Activity, one of the subtasks of the next-phase CNGrid software [3].
Furthermore, this project could also contribute to the future interoperation of mainstream grid projects worldwide, an effort known as GIN [4] at OGF. The deliverables will reside on two levels: a) at the application level, we will explore what benefits typical grid applications could gain on international testbeds; b) at the middleware level, we will give proof-of-concept solutions to technical issues regarding the interconnection of common services, including authentication and authorization, data management, job submission, and resource discovery and monitoring, among others.

3. Tasks

Two bioinformatics applications, RAPTOR [5] and TreePack [6], will first be taken as benchmark applications for evaluating the interoperability of the two testbeds. RAPTOR is one of the best protein structure prediction programs. TreePack (previously called SCATD) is a side-chain prediction program based on tree decomposition of a protein backbone structure. Both can run on an SMP machine, a cluster system, and a massively parallel processing platform. RAPTOR will be made publicly available to bio-scientists via a web portal that serves as a front end to the grid testbeds. After gaining enough experience from the initial experiments, the PI and Co-PI of this project will work together with CNGrid application partners to conduct large-scale bioinformatics research activities.

TeraGrid employs CTSS as its middleware, which consists of Globus Toolkit (v2 and v4), Condor, and other utilities. CNGrid has its own software stack, named GOS [3], which adopts a service-oriented approach compliant with many open standards, including WS-I Basic Profile, WS-Security, and SAML. CNGrid software focuses on VO-level management services that could take TeraGrid sites as members of its VOs.
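The idea of a uniform access interface over the two software stacks can be illustrated with a minimal Python sketch. It is not the project's gateway design: the class names, fields, and the `plan` helper are all hypothetical, and no real CTSS, Globus, Condor, or GOS call is made. Each adapter only builds a plain description of a submission so that the same job request can be expressed for either testbed.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRequest:
    """Middleware-neutral description of one application run (hypothetical)."""
    application: str   # e.g. "RAPTOR" or "TreePack"
    input_file: str    # path to the input data (illustrative)
    cpus: int = 1


class GridBackend(ABC):
    """Hypothetical common interface; one adapter per testbed."""

    testbed: str
    stack: str

    @abstractmethod
    def credential(self) -> str:
        """Name the kind of credential this testbed expects."""

    def describe(self, job: JobRequest) -> dict:
        """Build a plain description of the submission; nothing is sent."""
        return {
            "testbed": self.testbed,
            "stack": self.stack,
            "credential": self.credential(),
            "application": job.application,
            "input_file": job.input_file,
            "cpus": job.cpus,
        }


class TeraGridBackend(GridBackend):
    testbed, stack = "TeraGrid", "CTSS"

    def credential(self) -> str:
        return "TeraGrid account"


class CNGridBackend(GridBackend):
    testbed, stack = "CNGrid", "GOS"

    def credential(self) -> str:
        return "CNGrid certificate"


def plan(job: JobRequest, backends: list) -> list:
    """Express one job request for every backend, in the order given."""
    return [backend.describe(job) for backend in backends]


if __name__ == "__main__":
    job = JobRequest("RAPTOR", "target.fasta", cpus=8)
    for entry in plan(job, [TeraGridBackend(), CNGridBackend()]):
        print(entry)
```

Keeping the job description separate from the per-testbed adapter means an application-level portal would only need to know `JobRequest`, while credential and stack differences stay inside each adapter.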
Concretely, this project can be divided into the following tasks, with an expected schedule:

(1) Deploy the applications on selected TeraGrid and CNGrid sites and investigate how they are executed on both software stacks (CTSS and CNGrid GOS). (M1)
(2) Develop an application-level gateway (e.g. via a CNGrid portal or a desktop application) that allows CNGrid nodes to access TeraGrid bioinformatics resources, based on CNGrid software client libraries but using a TeraGrid account. (M2)
(3) Have the CNGrid CAs join international PKI federations such as PMA. Explore an approach for authentication and authorization with respect to CNGrid certificates on TeraGrid sites. Successfully submit bioinformatics jobs to TeraGrid sites, and monitor them, with CNGrid certificates. (M3)
(4) Do the reverse of steps 2 and 3, to allow access to CNGrid resources from TeraGrid. (M4)
(5) Conduct a series of benchmark tests on the performance and overhead of applications run across sites from both testbeds; identify and consolidate requirements, benefits, and issues for international grid testbeds. (M5-M6)

4. Justification for Resource Requests

5,000 SUs are needed for executing parallel bioinformatics experimental applications, along with 10 GB of storage for applications together with genome databases and 10 GB of scratch space for test data.

5. References

[1] TeraGrid, www.teragrid.org/
[2] CNGrid, www.cngrid.org/
[3] X Xie, N Xiao, Z Xu, L Zha, W Li, H Yu. CNGrid Software 2: Service Oriented Approach to Grid Computing. Proceedings of the UK e-Science All Hands Meeting, 2005. allhands.org.uk
[4] GIN, http://forge.gridforum.org/sf/go/projects.gin/wiki
[5] Jinbo Xu, Ying Xu, Dongsup Kim, Ming Li. RAPTOR: Optimal Protein Threading by Linear Programming. Journal of Bioinformatics and Computational Biology (inaugural issue), April 2003.
[6] Jinbo Xu. Rapid Protein Side-Chain Packing via Tree Decomposition. RECOMB 2005.
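The overhead measurement in task (5) could be summarized as a relative overhead: the extra mean wall-clock time of a cross-site run, divided by the mean time of a single-site run. The Python sketch below assumes that definition, which the proposal does not specify. The function name is hypothetical and the timings are placeholders invented for illustration, not measured results.

```python
from statistics import mean


def relative_overhead(single_site_s, cross_site_s):
    """Return (mean cross-site time - mean single-site time) / mean single-site time."""
    if not single_site_s or not cross_site_s:
        raise ValueError("both timing samples must be non-empty")
    baseline = mean(single_site_s)
    if baseline <= 0:
        raise ValueError("baseline time must be positive")
    return (mean(cross_site_s) - baseline) / baseline


if __name__ == "__main__":
    # Placeholder wall-clock times in seconds; NOT measured data.
    single = [100.0, 102.0, 98.0]
    cross = [110.0, 112.0, 108.0]
    print(f"relative overhead: {relative_overhead(single, cross):.1%}")
```

Averaging over repeated runs before taking the ratio keeps a single slow run from dominating the figure; a real benchmark would also report the spread.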

Agency
National Institutes of Health (NIH)
Institute
National Center for Research Resources (NCRR)
Type
Biotechnology Resource Grants (P41)
Project #
5P41RR006009-17
Application #
7601478
Study Section
Special Emphasis Panel (ZRG1-BCMB-Q (40))
Project Start
2007-08-01
Project End
2008-07-31
Budget Start
2007-08-01
Budget End
2008-07-31
Support Year
17
Fiscal Year
2007
Total Cost
$299
Indirect Cost
Name
Carnegie-Mellon University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
052184116
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213
Simakov, Nikolay A; Kurnikova, Maria G (2018) Membrane Position Dependency of the pKa and Conductivity of the Protein Ion Channel. J Membr Biol 251:393-404
Yonkunas, Michael; Buddhadev, Maiti; Flores Canales, Jose C et al. (2017) Configurational Preference of the Glutamate Receptor Ligand Binding Domain Dimers. Biophys J 112:2291-2300
Hwang, Wonmuk; Lang, Matthew J; Karplus, Martin (2017) Kinesin motility is driven by subdomain dynamics. Elife 6:
Earley, Lauriel F; Powers, John M; Adachi, Kei et al. (2017) Adeno-associated Virus (AAV) Assembly-Activating Protein Is Not an Essential Requirement for Capsid Assembly of AAV Serotypes 4, 5, and 11. J Virol 91:
Subramanian, Sandeep; Chaparala, Srilakshmi; Avali, Viji et al. (2016) A pilot study on the prevalence of DNA palindromes in breast cancer genomes. BMC Med Genomics 9:73
Ramakrishnan, N; Tourdot, Richard W; Radhakrishnan, Ravi (2016) Thermodynamic free energy methods to investigate shape transitions in bilayer membranes. Int J Adv Eng Sci Appl Math 8:88-100
Zhang, Yimeng; Li, Xiong; Samonds, Jason M et al. (2016) Relating functional connectivity in V1 neural circuits and 3D natural scenes using Boltzmann machines. Vision Res 120:121-31
Lee, Wei-Chung Allen; Bonin, Vincent; Reed, Michael et al. (2016) Anatomy and function of an excitatory network in the visual cortex. Nature 532:370-4
Murty, Vishnu P; Calabro, Finnegan; Luna, Beatriz (2016) The role of experience in adolescent cognitive development: Integration of executive, memory, and mesolimbic systems. Neurosci Biobehav Rev 70:46-58
Jurkowitz, Marianne S; Patel, Aalapi; Wu, Lai-Chu et al. (2015) The YhhN protein of Legionella pneumophila is a Lysoplasmalogenase. Biochim Biophys Acta 1848:742-51

Showing the most recent 10 out of 292 publications