Getting started guide

Quick and dirty installation

This section gives short instructions for a typical installation of rDock.

For the full documentation of the rDock software package and its methods, please go to the Full Documentation webpage.

You can also check the following information:

  • Getting Started: installation and validation instructions for first-time users.

  • Validation Sets: instructions and examples for re-running the validation sets we have carried out.

  • Calculating ROC Curves: tutorial for generating ROC Curves and other statistics after running rDock docking jobs.

Installation in 3 steps

We have been able to compile rDock on the following Linux systems:

  • CentOS 5.5 64 bits

  • openSUSE 11.3 32 and 64 bits

  • openSUSE 12.3 32 and 64 bits

  • openSUSE 13.1 32 and 64 bits

  • Ubuntu 12.04 32 and 64 bits

Step 1

First of all, you will need to install several packages before compiling and running rDock:

  • gcc and g++ compilers version > 3.3

  • make

Note

For Ubuntu users:

If you are trying to use rDock on Ubuntu, please note that the csh shell is not included in a default installation. If errors arise even with all the dependencies listed above installed, we recommend installing csh (sudo apt-get install csh).

Afterwards, download the compressed source code file, or check it out with SVN, from the Downloads section.

Step 2

Then, run the following commands:

tar -xvzf rDock_2013.1_src.tar.gz
cd rDock_2013.1_src/build/

and, for 32-bit computers:

make linux-g++

for 64-bit computers:

make linux-g++-64
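If you are unsure which of the two targets applies to your machine, the choice can be sketched from the output of uname -m. This is only an illustration: the function name rdock_make_target is hypothetical and not part of rDock, and the list of architecture strings is an assumption that may not cover every system.

```shell
# Map a machine architecture string (as printed by `uname -m`) to one of the
# two make targets named in this guide. The function name is hypothetical.
rdock_make_target() {
    case "$1" in
        x86_64|amd64) echo "linux-g++-64" ;;
        i386|i486|i586|i686) echo "linux-g++" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Example: print (not run) the suggested build command for this machine.
if target=$(rdock_make_target "$(uname -m)"); then
    echo "make $target"
fi
```

The sketch only prints the command, so you can review it before running make yourself.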

Step 3

After compiling successfully, run the following command to test that your compiled version works and produces correct results:

make test

If the test has succeeded, you are done. Enjoy using rDock!

Otherwise, please check your dependencies and all the previous commands, or go to the Support section to ask for help.

As a concluding remark, don’t forget to set the necessary environment variables for running rDock from the command line (for example, in the bash shell):

export RBT_ROOT=/path/to/rDock/installation/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$RBT_ROOT/lib
export PATH=$PATH:$RBT_ROOT/bin
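A quick way to catch a mistyped installation path is to confirm that the bin and lib directories referenced by these exports exist. The following is a minimal sketch; check_rdock_env is a hypothetical helper, not an rDock command, and it only inspects directories without running any rDock program.

```shell
# Report whether the rDock root contains the bin/ and lib/ directories that
# the PATH and LD_LIBRARY_PATH exports point to. Hypothetical helper.
check_rdock_env() {
    root=${1:-${RBT_ROOT:-}}
    if [ -z "$root" ]; then
        echo "RBT_ROOT is not set" >&2
        return 1
    fi
    for sub in bin lib; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing directory: $root/$sub" >&2
            return 1
        fi
    done
    echo "rDock environment OK: $root"
}
```

Called without arguments, the function falls back to the current value of RBT_ROOT; passing a path checks that path instead.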

Overview

rDock is a fast and versatile open-source docking program that can be used against proteins and nucleic acids. It is designed for High Throughput Virtual Screening (HTVS) campaigns and Binding Mode prediction studies.

The rDock program (formerly known as RiboDock [RiboDock2004]) was developed from 1998 to 2006 by the software team at RiboTargets (subsequently Vernalis (R&D) Ltd.). In 2006, the software was licensed to the University of York for maintenance and distribution. In 2012, Vernalis and the University of York agreed to release the program as open-source software. This version is licensed under GNU LGPL version 3.0, with support from the University of Barcelona (rdock.sourceforge.net).

The major components of the platform now include fast intermolecular scoring functions (van der Waals, polar, desolvation) validated against protein and RNA targets, a Genetic Algorithm-based stochastic search engine, a wide variety of external Structure-Based Drug Discovery (SBDD) derived restraint terms (tethered template, pharmacophore, NOE distance restraints), and novel Genetic Programming-based post-docking filtering. A variety of scripts are provided to perform automated validation experiments and to launch virtual screening campaigns.

This introductory guide is aimed at new users of rDock. It describes the minimal set of steps required to build rDock from the source code distribution, and to run one of the automated validation experiments provided in the test suite distribution. The instructions assume that you are comfortable with simple Linux command-line administration tasks, and with building Linux applications from makefiles. Once you are familiar with these steps, you should proceed to the User and Reference Guide for more detailed documentation on the usage of rDock.

Prerequisites

Compilers. rDock is supplied as source code, which means that you will have to compile the binary files (run-time libraries and executable programs) before you use them. rDock has been developed largely on Linux, most recently with the GNU g++ compiler (tested under openSUSE 11.3). The code will almost certainly compile and run under other Linux distributions with little or no modification. For the moment, it has been tested on the latest Ubuntu and openSUSE releases (as of November 2013) for both 32-bit and 64-bit system architectures, and compilation was possible without any code modification. However, no other distributions or compilers have been tested extensively to date.

For full production use, you would typically compile rDock on a separate build machine and run the docking calculations on a cluster of compute machines. However, for the purposes of getting started, these instructions assume that you will be compiling rDock and running the initial validation experiments on the same machine.

Required packages. Make sure you have the following packages installed on your machine before you continue. The versions listed are appropriate for openSUSE 11.3; other versions may be required for your particular Linux distribution.

Table 1 Required packages for building and running rDock

Package   Description        Required at    Version
gcc       GNU C compiler     Compile-time   >=3.3.4
g++       GNU C++ compiler   Compile-time   >=3.3.4
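To compare an installed compiler against the minimum version in Table 1, a dotted-version comparison can be sketched as follows. The helper version_ge is hypothetical and not part of rDock; it assumes purely numeric version fields, and the example relies on g++ -dumpversion, which on recent compilers may print only the major version.

```shell
# Return success if dotted version $1 is greater than or equal to $2.
# Hypothetical helper: compares numeric fields left to right, treating
# missing fields as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example: check the installed g++ (if any) against the minimum in Table 1.
if command -v g++ >/dev/null 2>&1; then
    installed=$(g++ -dumpversion)
    if version_ge "$installed" 3.3.4; then
        echo "g++ $installed meets the >=3.3.4 requirement"
    else
        echo "g++ $installed is older than 3.3.4"
    fi
fi
```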

Unpacking the distribution files

The rDock source files and test suite files are provided as independent gzipped tar (.tar.gz) distributions. Depending on your requirements, the two distributions can be unpacked to entirely separate locations, or can be unpacked under the same location. In this example they are unpacked under the same location.

Table 2 rDock distribution files

File                          Description
rDock_[CODELINE]_src.tar.gz   rDock source distribution
[TEST]_rDock_TestSet.tar.gz   Test suite data files and scripts

where [CODELINE] and [TEST] vary depending on the release and test set. [CODELINE] represents the major version string (for example, 2013.1) and [TEST] represents the given dataset (ASTEX, RNA or DUD).

Procedure: Example unpacking procedure

Create a new directory for building rDock.

$ mkdir ~/dev

The directory you created is referred to as [BUILDDIR] in the subsequent steps.

Copy or download the distribution files to [BUILDDIR].

$ cp ~/mydownloads/rDock_2013.1_src.tar.gz ~/dev/

Extract the distributions.

$ cd ~/dev/
$ tar -xvzf rDock_2013.1_src.tar.gz

The distributions contain files with relative path names, and you should find the following subdirectories created under rDock_[CODELINE]_src. Note that the ./rDock_2013.1_src subdirectory may have a different name depending on the major version string (see above).

$ find . -type d
.
./fw
./src
./src/daylight
./src/lib
./src/exe
./src/GP
./build
./build/test
./build/test/RBT_HOME
./build/tmakelib
./build/tmakelib/linux-pathCC-64
./build/tmakelib/linux-g++-64
./build/tmakelib/linux-g++
./build/tmakelib/unix
./data
./data/filters
./data/sf
./data/pmf
./data/pmf/smoothed
./data/scripts
./lib
./import
./import/tnt
./import/tnt/include
./import/simplex
./import/simplex/src
./import/simplex/include
./docs
./docs/images
./docs/newDocs
./include
./include/GP
./bin

Make a note of the following locations for later use.

The rDock root directory is [BUILDDIR]/rDock_[CODELINE]_src and will be referred to as [RBT_ROOT] in later instructions. In this example, [RBT_ROOT] is ~/dev/rDock_2013.1_src/.

Building rDock

rDock is written in C++ (with a small amount of C code from Numerical Recipes) and makes heavy use of the C++ Standard Template Library (STL). The majority of the source code is compiled into a single shared library (libRbt.so). The executable programs themselves are relatively light-weight command-line applications linked with libRbt.so.

The tmake build system (from Trolltech) is used to generate makefiles automatically for a particular build target (i.e. a combination of operating system and compiler). The source distribution comes with tmake templates defining the compiler options and flags for the standard Linux build targets (linux-g++ and linux-g++-64). The build targets have been tested under openSUSE 11.3 (2.6.34.10-0.2 kernel) with GNU g++ (versions 3.3.4, 4.5.0 and 4.7.2).

Table 3 Standard tmake build targets provided

Target Name    Architecture   Compiler   Compiler flags (release build)
linux-g++      32-bit         g++        -m32 -O3 -ffast-math
linux-g++-64   64-bit         g++        -m64 -O3 -ffast-math

Customising the tmake template for a build target. If none of the tmake templates are suitable for your machine, or if you wish to customise the compiler options, you should first customise one of the existing templates. The tmake template files are stored under [RBT_ROOT]/build/tmakelib/. Locate and edit the tmake.conf file for the build target you wish to customise. For example, to customise the linux-g++ build target, edit [RBT_ROOT]/build/tmakelib/linux-g++/tmake.conf and localise the values to suit your compiler.

Procedure: rDock build procedure

To build rDock, first go to the [RBT_ROOT]/build/ directory.

$ cd [RBT_ROOT]/build

Compile

Make one of the build targets listed below: linux-g++ for a 32-bit build or linux-g++-64 for a 64-bit build.

$ make linux-g++
$ make linux-g++-64

Test

Run the rDock unit tests to check build integrity. If no failed tests are reported you should be all set.

$ make test

Cleanup (optional)

To remove all intermediate build files from [RBT_ROOT]/build/, leaving just the final executables (in [RBT_ROOT]/bin/) and shared libraries (in [RBT_ROOT]/lib/):

$ make clean

To remove the final executables and shared libraries as well, returning to a source-only distribution:

$ make distclean

Validation experiments

In this section (on the rDock webpage) you will find instructions on how to reproduce our validation experiments using different test sets. Three different sets were analyzed for three different purposes:

  • ASTEX set for binding mode prediction in proteins.

  • RNA set for assessing RNA-ligand docking.

  • DUD set for database enrichment.

Binding Mode Prediction in Proteins

First, please go to the SourceForge download page to download a compressed file with the necessary data.

After downloading the file ASTEX_rDock_TestSet.tar.gz, uncompress the file with the following command, which will create a folder called ASTEX_rDock_TestSet:

tar -xvzf ASTEX_rDock_TestSet.tar.gz
cd ASTEX_rDock_TestSet/

The instructions below are for one of the systems (1sj0); to run any of the other systems, replace the PDB code with the desired one. Make sure that the environment variables needed for running rDock are set, then run the following commands to enter the folder and run rDock with the same settings that we used:

cd 1sj0/

#first create the cavity using rbcavity
rbcavity -r 1sj0_rdock.prm -was > 1sj0_cavity.log

#then use rbdock to run docking
rbdock -r 1sj0_rdock.prm -p dock.prm -n 100 -i 1sj0_ligand.sd \
-o 1sj0_docking_out > 1sj0_docking_out.log

#sdsort for sorting the results according to their score
sdsort -n -f'SCORE' 1sj0_docking_out.sd > 1sj0_docking_out_sorted.sd

#calculate rmsd from the output comparing with the crystal structure of the ligand
sdrmsd 1sj0_ligand.sd 1sj0_docking_out_sorted.sd
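Since the same four commands are repeated for every system with only the code changed, the substitution can be sketched as a small dry-run helper. The function print_dock_cmds is hypothetical and not part of rDock; it only prints the commands, following the file-name pattern shown for 1sj0. The ligand file name is passed separately because it is an assumption that every system follows the same naming.

```shell
# Print (not run) the four commands of the protocol for one system.
# Hypothetical helper: $1 is the system code, $2 is the ligand file.
print_dock_cmds() {
    code=$1
    ligand=$2
    echo "rbcavity -r ${code}_rdock.prm -was > ${code}_cavity.log"
    echo "rbdock -r ${code}_rdock.prm -p dock.prm -n 100 -i ${ligand} -o ${code}_docking_out > ${code}_docking_out.log"
    echo "sdsort -n -f'SCORE' ${code}_docking_out.sd > ${code}_docking_out_sorted.sd"
    echo "sdrmsd ${ligand} ${code}_docking_out_sorted.sd"
}

# Example: show the commands for the system used in this guide.
print_dock_cmds 1sj0 1sj0_ligand.sd
```

Printing the commands first makes it easy to check the file names before running anything inside the corresponding system folder.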

Binding Mode Prediction in RNA

As in the section above, here you will find a brief tutorial on how to run rDock with the RNA test set used in the validation. Again, please go to the SourceForge download page to download a compressed file with the necessary data.

After downloading the file RNA_rDock_TestSet.tar.gz, uncompress the file with the following command, which will create a folder called RNA_rDock_TestSet:

tar -xvzf RNA_rDock_TestSet.tar.gz
cd RNA_rDock_TestSet/

The instructions below are for one of the systems (1nem); to run any of the other systems, replace the PDB code with the desired one. Make sure that the environment variables needed for running rDock are set (if you have run the previous set, they should already be correctly defined), then run the following commands to enter the folder and run rDock with the same settings that we used:

cd 1nem/

#first create the cavity using rbcavity
rbcavity -r 1nem_rdock.prm -was > 1nem_cavity.log

#then use rbdock to run docking
rbdock -r 1nem_rdock.prm -p dock.prm -n 100 -i 1nem_lig.sd \
-o 1nem_docking_out > 1nem_docking_out.log

#sdsort for sorting the results according to their score
sdsort -n -f'SCORE' 1nem_docking_out.sd > 1nem_docking_out_sorted.sd

#calculate rmsd from the output comparing with the crystal structure of the ligand
sdrmsd 1nem_lig.sd 1nem_docking_out_sorted.sd

Database Enrichment (actives vs decoys - for HTVS)

In this section you will find a brief tutorial on how to run rDock with the DUD test set used in the validation and how to perform different analyses of the results. As in the sections above, please go to the SourceForge download page to download a compressed file with the necessary data.

After downloading the file DUD_rDock_TestSet.tar.gz, uncompress the file with the following command, which will create a folder called DUD_rDock_TestSet:

tar -xvzf DUD_rDock_TestSet.tar.gz
cd DUD_rDock_TestSet/

The instructions below are for one of the systems (hivpr); to run any of the other systems, replace the DUD system code with the desired one. Make sure that the environment variables needed for running rDock are set (if you have run the previous sets, they should already be correctly defined), then run the following commands to enter the folder and run rDock with the same settings that we used:

cd hivpr/

#first create the cavity using rbcavity
rbcavity -r hivpr_rdock.prm -was > hivpr_cavity.log

Because the number of ligands to dock is very high, we suggest using a distributed computing environment, such as SGE or Condor, and configuring rDock to run on multiple CPUs. That is, split the input ligand file into as many parts as desired (this is easy with the sdsplit tool) and run an independent rDock docking job for each split input file. For the purposes of this example, however, the instructions below run the whole set of actives and decoys in one docking job:

#uncompress ligand file
gunzip hivpr_ligprep.sdf.gz

#use rbdock to run docking
rbdock -r hivpr_rdock.prm -p dock.prm -n 100 -i hivpr_ligprep.sdf \
-o hivpr_docking_out > hivpr_docking_out.log
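The splitting step mentioned above can be illustrated without relying on sdsplit's options. The sketch below is an assumption-laden stand-in written in awk: split_sd is a hypothetical helper, not an rDock tool, and it assumes that each record in the SD file ends with a line containing only $$$$.

```shell
# Split an SD file into chunks of at most N records each.
# Hypothetical stand-in for sdsplit: $1 = input file, $2 = records per
# chunk, $3 = output prefix (chunks are written as <prefix>1.sd, <prefix>2.sd, ...).
split_sd() {
    input=$1
    per_file=$2
    prefix=$3
    awk -v n="$per_file" -v prefix="$prefix" '
        BEGIN { chunk = 1; count = 0; out = prefix chunk ".sd" }
        {
            print > out
            # A line holding only "$$$$" closes the current record.
            if ($0 == "$$$$") {
                count++
                if (count == n) {
                    close(out)
                    chunk++
                    count = 0
                    out = prefix chunk ".sd"
                }
            }
        }
    ' "$input"
}

# Example (commented out; the chunk size 500 is an arbitrary illustration):
# split_sd hivpr_ligprep.sdf 500 hivpr_part_
```

Each resulting chunk could then be passed to rbdock through the -i option with its own -o output root, using the same remaining options as in the single-job command above, and submitted as a separate job.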

#sdsort with -n and -s flags will sort internally each ligand by increasing score and
#sdfilter will get only the first entry of each ligand.
sdsort -n -s -fSCORE hivpr_docking_out.sd | sdfilter -f'$_COUNT == 1' > hivpr_1poseperlig.sd

#sdreport will print all the scores of the output in a tabular format and,
#with command awk, we will format the results
sdreport -t hivpr_1poseperlig.sd | awk '{print $2,$3,$4,$5,$6,$7}' > dataforR_uq.txt

At this point, you should have a file called hivpr_docking_out.sd with all docking poses written by rDock (100 * input ligands), a file called hivpr_1poseperlig.sd with the best-scoring docking pose for each ligand, and a file called dataforR_uq.txt that will be used for calculating ROC curves using R. The next step is to calculate ROC curves and other statistics. To do so, please visit the section How to calculate ROC curves and jump to the subsection “R Commands for generating ROC Curves”.
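Before moving on to the ROC analysis, the expected number of poses (100 * input ligands) can be checked by counting records in the SD files. This is a minimal sketch: count_sd_records and check_pose_count are hypothetical helpers, not rDock tools, and they assume that every record ends with a line starting with $$$$.

```shell
# Count the records in an SD file by counting "$$$$" delimiter lines.
count_sd_records() {
    grep -c '^\$\$\$\$' "$1" || true
}

# Compare the number of output poses with (input ligands * runs per ligand).
# $1 = input ligand file, $2 = docking output file, $3 = runs per ligand.
check_pose_count() {
    n_in=$(count_sd_records "$1")
    n_out=$(count_sd_records "$2")
    expected=$((n_in * $3))
    if [ "$n_out" -eq "$expected" ]; then
        echo "OK: $n_out poses"
    else
        echo "expected $expected poses, found $n_out" >&2
        return 1
    fi
}

# Example (commented out), using the file names from this guide:
# check_pose_count hivpr_ligprep.sdf hivpr_docking_out.sd 100
```

A mismatch does not say which ligands are incomplete; it only signals that the output is worth inspecting before the analysis.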