Replica-Exchange Molecular Dynamics

    Replica-Exchange Molecular Dynamics (REMD) is a technique used to enhance sampling relative to a standard molecular dynamics simulations by allowing systems of similar potential energies to sample conformations at different temperatures. By doing so, energy barriers on the potential energy surface might be overcome, allowing for the exploration of new conformational space.

    The setup of a REMD simulation is actually quite straightforward. The following describes the steps that lead to have a REMD simulation running on a given system. The "success" of the simulation will depend entirely on the problem being addressed and the criterion used to judge it. Although REMD simulations are helping in increasing sampling they do not provide the ultimate answer. This should be kept in mind.

    Once the peptide and its surroundings have been determined, one needs to determine the range of temperature space to be sampled, the number of processors to use, and the length of time to simulate. The system, number of replicas, range of temperature space and the distribution of temperatures determine the average exchange probability between each replica. These should be constant across the replicas. To achieve this, given the expected increase in system potential energy with increasing temperature in the absence of known bottlenecks in the free energy space, an exponential distribution of replica temperatures should be used.

    There is good evidence to suggest that the period between exchange attempts for peptide/protein systems should not be under 1 ps. This will determine the average length of time a replica will spend exploring conformation space between exchanges, and this is more important than the actual exchange probability. In the GROMACS implementation, an exchange attempt of a particular pair will only happen at every second attempt, as the odd and even pairs attempt exchange on alternating occasions. See the GROMACS Manual.

    With all versions of GROMACS prior to 4.0, only one processor per replica is allowed. REMD with any version of GROMACS requires mdrun compiled with MPI (i.e. not threading), and that the number of processors be a multiple of the number of replicas. 

    General Steps

    1. Define the system eg: peptide + solvent (implicit or explicit).  Implicit solvent is only supported in GROMACS versions >= 4.5.
    2. Depending on the number of processors available and the range of temperature to sample (they are actually extremely dependent on each other), choose a distribution of the temperatures. Use an exponential distribution: Ti=T0*ek*i where k and T0 can be tuned to obtain reasonable temperature intervals to allow exchanges. The exponential allows an increase of temperature interval as the temperature increases. This is necessary because the distribution of the total energy increases with the temperature and thus the exchange rate increases. Keep the exchange rate constant with the temperature.
    3. Once you have the temperature distribution, equilibrate the systems at the N temperatures separately (using a separate .mdp file for each). Then construct a series of N run input files (.tpr) from the equilibrated structures, again using different .mdp files to generate the different .tpr files. If have N temperatures should have N .tpr files, named prefix_0.tpr, ... prefix_N-1.tpr.
    4. Then, run short REMD simulations to obtain an estimate of the exchange rate (can get a good estimate within ~100 ps) and modify the temperatures if it does not correspond to the desired exchange. Typically 0.2 to 0.3 is good. What is actually more important than the value of the exchange rate itself is the time that a replica will spend at a given temperature. This is given by the exchange rate x frequency of exchange. eg: an exchange rate of 0.2 with exchange trials every 2 ps gives an average time at a temperature of 10 ps. One point to consider is also the way exchanged. Try to exchange all pairs at each trial or only one pair chosen randomly. In the latter case should also take this into account, the residence time of replica has then to be multiplied by N-1, the number of pairs exchanged.
    5. The initial conformations at each temperature may be identical or different. This a choice and will depend on the reason for performing the REMD.

    A web-server that implements the step of choosing temperatures for T-REMD can be found here. It works based on known energy distributions for solvated proteins. You input the number of protein atoms, the number of water atoms, the temperature range and the desired exchange probability and then the algorithm generates temperatures.

    Execution Steps

    1. make a set of .mdp files, each specifying a different temperature being used. Number the output .tpr files according to index, from 0 to whatever (i.e., if have 10 different temperatures have prefix_0.tpr, prefix_1.tpr ... prefix_9.tpr). Prior to GROMACS 4.0, only a single processor may be used per replica, so either omit the -np flag to grompp or use -np 1. For GROMACS 4.0, grompp does not require the -np flag.
    2. the interval for replica exchange is given by the mdrun flag -replex. The number of cores must be a multiple of the number of replicas (given with -multi, which must equal the number of .tpr files i.e., 10 for the above general example using prefix_0.tpr through prefix_9.tpr). As above, that multiple must be 1 for GROMACS prior to version 4.0.  Nomenclature for the input files is critical to getting mdrun to work.
    3. For a command line example, see below. Number of steps is the number of steps required to give you the exchange attempt period desired. Other command line options may be required, e.g. -o prefix.
    GROMACS 3.x: mpirun -np 10 mdrun -s prefix_.tpr -np 10 -multi 10 -replex (number of steps) (followed by output options)
    GROMACS 4.x: mpirun mdrun -s prefix_.tpr -multi 10 -replex (number of steps) (followed by output options)

    Understanding REMD-related Output

    mdrun writes important information to your md.log file, you can extract it like this:

    % grep Repl md0.log gives:
    Initializing Replica Exchange
    Repl  There are 6 replicas:
    Repl      0     1     2     3     4     5
    Repl  T 300.0 350.0 410.0 480.0 560.0 650.0
    Repl  exchange interval: 1000
    Repl  random seed: 525106
    Repl  below: x=exchange, pr=probability
    Replica exchange at step 1000 time 2
    Repl 0 <-> 1  dE =  2.492e+00
    Repl ex  0    1    2    3    4    5
    Repl pr   .08       .00       .13
    Replica exchange at step 2000 time 4
    Repl ex  0    1    2    3 x  4    5
    Repl pr        .00       1.0

    This is indeed a very short explanation, but it should hopefully provide enough information to comprehend the results.



    GROMACS writes REMD trajectories that are continuous with respect to ensemble, but not with respect to simulation time. If you would like the latter, you can use the script scripts/ that will read your md0.log file (you can concatenate several if necessary) and produce a few output files. One of these is a .xvg file (replica_ndx.xvg) that can be passed to trjcat, along with the original trajectory files, in order to produce continuous trajectories. The other file (replica_temp.xvg) contains the temperatures for each replica, starting at the original temperature. So if your replica of interest starts at, say, 300 K, you can follow its trajectory through temperature space. It would be interesting to add some functionality to make histograms of temperature distributions for each replica, which according to most authors, should be flat. The demuxed trajectories have been used in an application (implemented in g_kinetics - not available in GROMACS 3.3.1 or earlier) to obtain protein folding kinetics from REMD trajectories Phys. Rev. Lett. 96, 238102 (2006).

    Page last modified 22:29, 2 Mar 2012 by mabraham