The LoBoS high-performance computing system continues to be developed as a resource for scientists within the Laboratory of Computational Biology and their collaborators. As in previous years, improvements to the LoBoS system are driven primarily by continuous improvements in the price-performance ratio of commercial off-the-shelf (COTS) computer hardware. In FY-2013, 12 new nodes were procured; these will be added to the cluster in FY-2014. Furthermore, substantial improvements in the power and cooling capacity of the Laboratory of Computational Biology's two machine rooms are being made to accommodate new equipment being procured by the groups of Dr. Jose Faraldo-Gomez and Dr. Lucy Forrest. These groups have procured 256 nodes to be dedicated to their computational needs. In addition, two new nodes with powerful general-purpose graphics processing unit (GP-GPU) capabilities based on Nvidia's "Kepler" architecture were procured. As more molecular dynamics and simulation codes, including CHARMM, develop advanced features to take advantage of GP-GPU coprocessors, the deployment of these nodes and the continued presence of existing GPU hardware will allow lab staff to keep up to date with the latest computing developments. The new Kepler nodes will contain two GP-GPU coprocessors each, allowing for extensive testing and production runs using a multi-GPU setup. Developments in the CHARMM molecular simulation package are also ongoing. In the last year, substantial developments have been made to the replica exchange code, most notably the development of a FAST keyword for replica exchange and new code combining Hamiltonian replica exchange with temperature and SGLD replica exchange. The fast replica exchange code substantially reduces the parallel communication needed for simple temperature-based calculations (which are the ones most commonly used), leading to a noticeable speed-up, especially when exchange attempts are repeated or frequent.
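For readers unfamiliar with temperature replica exchange, the following minimal Python sketch shows the standard Metropolis acceptance test for swapping two replicas. It illustrates the general technique only and is not the CHARMM implementation; the function names, the example energies, and the choice of kcal/mol units are assumptions made for illustration. In this textbook form the test needs only each replica's potential energy and temperature, which is consistent with temperature-only exchange requiring little communication.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping replicas i and j.

    e_i, e_j are potential energies (kcal/mol) of the configurations
    currently held at temperatures t_i and t_j (K).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(e_i, e_j, t_i, t_j, rng=random):
    """Return True if the swap is accepted (hypothetical helper)."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)


# The colder replica holds the higher-energy configuration: always accepted.
p_downhill = exchange_probability(-100.0, -110.0, 300.0, 310.0)
# The colder replica already holds the lower-energy configuration.
p_uphill = exchange_probability(-110.0, -100.0, 300.0, 310.0)
print(p_downhill, p_uphill)
```

Only two scalars per replica enter the test, so a swap decision can be made without moving coordinates between processes; how the FAST keyword exploits this is not described in this report.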
Finally, a development summit was held in the lab in August 2013 that led to a substantial amount of code clean-up. Development is also active on CHARMMing, the web interface to CHARMM. In FY-2013, a new version was released with substantial front-end and back-end improvements. The visualization system has been entirely revamped, with users now having the option of using JSMol or GLMol for structure visualization. Additionally, users may now graphically build custom ligands or import them from various databases. Automatic parameter generation tools using CGenFF and ParamChem, developed by the MacKerell group, have been fully integrated. Support for setting up multiscale modeling using MSCALE has also been added to CHARMMing. This new functionality has been integrated with the new graphical atom selection developed for QM/MM, allowing seamless graphical setup of both additive and subtractive QM/MM calculations using CHARMMing. We have developed a web-based tool for SAR and QSAR modeling to add to the services provided by charmming.org. It is an implementation of Random Forests, one of the most recent advances in modern machine learning algorithms. The tool allows users to create their own models based on submitted training SD files, which combine structures with activity information (either categorical or numerical), to track the model generation process, and to run the created models on new data to predict activity. The whole process is presented in a straightforward, user-friendly manner, with each step prompting the user for the next action, so that even a first-time visitor to the web service can tell which stage of the process he or she is at. The SAR/QSAR tool also automatically verifies new models using well-known machine learning techniques such as cross-validation and y-randomization, so users can immediately see whether the created model is able to produce valid predictions.
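The idea behind y-randomization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CHARMMing code: a toy linear scorer stands in for the Random Forest, the data are synthetic, and all function names are hypothetical. The model is refit on shuffled activity labels; if the AUC on shuffled labels stays near 0.5 while the AUC on the real labels is high, the real model is unlikely to be a chance correlation.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of actives and inactives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_linear_scorer(X, y):
    """Toy stand-in for a real learner: weights are the difference of class means."""
    n_feat = len(X[0])

    def mean(rows):
        return [sum(r[k] for r in rows) / len(rows) for k in range(n_feat)]

    active = mean([x for x, label in zip(X, y) if label == 1])
    inactive = mean([x for x, label in zip(X, y) if label == 0])
    w = [a - b for a, b in zip(active, inactive)]
    return lambda x: sum(wk * xk for wk, xk in zip(w, x))


def training_auc(X, y):
    """Fit on (X, y) and report the AUC on the same data."""
    model = fit_linear_scorer(X, y)
    return auc([model(x) for x in X], y)


def y_randomization(X, y, n_rounds=20, seed=0):
    """Mean training AUC after refitting on shuffled activity labels."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)
        total += training_auc(X, shuffled)
    return total / n_rounds


# Synthetic data: descriptor 0 carries the signal, descriptor 1 is pure noise.
data_rng = random.Random(42)
y = [1] * 100 + [0] * 100
X = [[data_rng.gauss(1.0 if label else -1.0, 1.0), data_rng.gauss(0.0, 1.0)]
     for label in y]

real_auc = training_auc(X, y)
shuffled_auc = y_randomization(X, y)
print(f"training AUC: {real_auc:.2f}, y-randomized AUC: {shuffled_auc:.2f}")
```

A large gap between the two numbers is the signal a user would look for; a y-randomized AUC close to the real one would suggest the model is fitting noise.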
This verification is an important and often missed step in QSAR modeling. For categorical modeling, the user is presented with AUC measurements for the training set and for the y-randomized set, and an average AUC for 5-fold cross-validation. The output of a prediction consists of a prediction score, active/inactive labels, and the recommended threshold. The threshold is automatically recommended based on the balance between recall and precision. Recall and precision for the training set are also displayed. For numerical modeling the process is similar, except that R² is used instead of AUC and recall and precision do not apply. It is also easy to validate the model on an external validation set: if the tool finds an activity score already present in the prediction file, it automatically computes precision and recall, or R². The new SAR/QSAR tool can be used as a stand-alone utility or as a supporting filter for the docking procedure.

Multipole-Multipole Interactions

Current coarse-grained (CG) modeling techniques for both proteins and lipids have difficulty accurately modeling the electrostatic interactions of biomolecules. One source of error in CG models is that the partial charges present in all-atom models are lost during the coarse-graining process, leaving many electrically neutral CG beads behind. The resulting potentials are isotropic in nature and have difficulty reproducing atomistic properties. The most dramatic example of this phenomenon comes from the widely used MARTINI CG water model, which often freezes at biological temperatures. To incorporate some of the electrostatic detail from all-atom models, we proposed to augment the popular MARTINI model with dipole and quadrupole information. Toward this end, we have developed an efficient, arbitrary-order multipole implementation in CHARMM (work with Andrew Simmonett, VT) that combines the quasi-internal reference frame with a spherical harmonic expansion in a novel way.
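To make concrete why dipole information restores orientation dependence that neutral, isotropic beads lack, here is a minimal Python sketch of the textbook interaction energy between two point dipoles. It is not the CHARMM implementation (which, as described above, uses spherical harmonics in a quasi-internal frame); the function name and the reduced units, with the Coulomb prefactor set to 1, are assumptions for illustration.

```python
import math


def dipole_dipole_energy(mu1, mu2, r_vec):
    """Energy of two point dipoles separated by r_vec.

    Uses U = mu1.mu2 / r^3 - 3 (mu1.r)(mu2.r) / r^5 in reduced units
    where 1/(4*pi*eps0) = 1 (an assumption made for this sketch).
    """
    r = math.sqrt(sum(c * c for c in r_vec))
    dot12 = sum(a * b for a, b in zip(mu1, mu2))
    dot1r = sum(a * c for a, c in zip(mu1, r_vec))
    dot2r = sum(b * c for b, c in zip(mu2, r_vec))
    return dot12 / r**3 - 3.0 * dot1r * dot2r / r**5


z = (0.0, 0.0, 1.0)
minus_z = (0.0, 0.0, -1.0)

# Same pair of unit dipoles, same distance, three relative arrangements.
head_to_tail = dipole_dipole_energy(z, z, (0.0, 0.0, 1.0))             # -2.0
side_by_side_parallel = dipole_dipole_energy(z, z, (1.0, 0.0, 0.0))    # 1.0
side_by_side_anti = dipole_dipole_energy(z, minus_z, (1.0, 0.0, 0.0))  # -1.0
print(head_to_tail, side_by_side_parallel, side_by_side_anti)
```

The energy changes sign with orientation at fixed separation, which is exactly the kind of anisotropy a purely distance-dependent bead potential cannot represent.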
We have also extended the particle mesh Ewald capabilities of CHARMM to allow for the calculation of multipole potentials and gradients in reciprocal space. This will make our multipole implementation practical for real-world use. The resulting work will not only allow us to pursue our CG model, but will also allow other multipole-based models and force fields to be used by the greater CHARMM community. We have recently begun work on implementing the SSDQO water model (Toshiko Ichiye, Georgetown) in the CHARMM package. The current SSDQO implementation is done through MSCALE, which is highly inefficient and does not parallelize well. Our implementation will supersede this work. It might also supersede the less general CHARMM package MTP (Markus Meuwly, University of Basel). Discussions have begun on how to incorporate its functionality into our more efficient and fully featured framework. In the coming year we will extend our fixed multipole code to also compute polarizable dipole contributions to the potential and gradients. This added capability will allow the use of the highly accurate AMOEBA force field (Jay Ponder, Washington University) in CHARMM. Furthermore, our implementation will be much faster than that in the TINKER software package, and slightly faster than the AMBER implementation. Having the AMOEBA force field in CHARMM will give users a variety of options when choosing polarizable force fields.
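As a rough illustration of what a polarizable dipole contribution involves, the Python sketch below solves for the mutually induced dipoles of two identical polarizable sites placed along the direction of a uniform external field. It is a deliberately simplified, undamped textbook model in reduced units; it is not AMOEBA and not the planned CHARMM code, and the function name and parameters are hypothetical.

```python
def induced_dipoles_on_axis(alpha, r, field, tol=1e-10, max_iter=100):
    """Self-consistent induced dipoles for two sites on the field axis.

    Each site responds to the external field plus the on-axis field of
    the other dipole, 2*mu/r^3 (reduced units, no damping; both are
    assumptions of this sketch). Converges when 2*alpha/r^3 < 1.
    """
    mu1 = mu2 = 0.0
    for _ in range(max_iter):
        new1 = alpha * (field + 2.0 * mu2 / r**3)
        new2 = alpha * (field + 2.0 * mu1 / r**3)
        converged = abs(new1 - mu1) < tol and abs(new2 - mu2) < tol
        mu1, mu2 = new1, new2
        if converged:
            return mu1, mu2
    raise RuntimeError("induced dipoles did not converge")


alpha, r, field = 1.0, 2.0, 1.0
mu1, mu2 = induced_dipoles_on_axis(alpha, r, field)

# Closed-form solution for this symmetric case, used as a check.
exact = alpha * field / (1.0 - 2.0 * alpha / r**3)
print(mu1, mu2, exact)
```

Because each induced dipole depends on all the others, the dipoles must be solved for self-consistently at every step, which is the main added cost of a polarizable model relative to fixed multipoles.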