GROMACS: Fast, Free and Flexible MD
 
 
 
Is there any smart way to continue a run that crashed?
Friday, 09 September 2005
Yes, if the crash had nothing to do with the algorithms, i.e. it was caused by a system crash, a full disk, or a kill by the queuing system. Otherwise you will have to use grompp and change the options. To really continue a simulation as if nothing had happened, you need coordinates and velocities in full precision (i.e. .trr format). .xtc trajectories are stored in reduced precision (only 3 decimal places) and contain no velocity information at all. Feed the .trr trajectory and your original .tpr file to tpbconv to obtain a new .tpr file. Be sure to specify the second-to-last frame of your .trr file, since the very last frame is likely to have been corrupted by the crash. With the .tpr file that tpbconv produces you can restart your simulation.
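As a rough sketch of the restart step, the commands could look like the following. The file names (topol.tpr, traj.trr, restart.tpr, part2.trr, part2.edr) and the restart time of 990 ps are placeholders, not values from a real run. The -s, -f, -time and -o options are the ones listed by tpbconv -h in the 3.x releases, so check the help output of your own version. The script only prints the commands so that you can inspect them before running anything.

```shell
#!/bin/sh
# Placeholder names and times: replace them with those of your own run.
tpr=topol.tpr        # original run input file
trr=traj.trr         # full-precision trajectory written before the crash
restart_time=990     # time (ps) of the second-to-last frame in $trr (assumed value)

# Build a new run input file that starts from the chosen frame.
tpbconv_cmd="tpbconv -s $tpr -f $trr -time $restart_time -o restart.tpr"

# Continue the run, writing the new part to its own files.
mdrun_cmd="mdrun -s restart.tpr -o part2.trr -e part2.edr"

# Dry run: show the commands instead of executing them.
printf '%s\n' "$tpbconv_cmd" "$mdrun_cmd"
```

Running gmxcheck -f traj.trr should report the frames and time step of the trajectory, which is one way to work out the time of the second-to-last frame.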

After the continuation run has finished, your simulation will be split over several files, which you will probably want to combine. This can be done as follows (the same command works for .xtc files):

  trjcat -o whole.trr part1.trr part2.trr part3.trr  

The energy files can be concatenated in a similar manner:

  eneconv -o whole.edr part1.edr part2.edr part3.edr  

Since tpbconv sets the time in the continuation runs, the files are automatically sorted and overlapping frames are removed. If you have a mix of runs continued with tpbconv and grompp, you might have to set the times yourself (see the manual pages for details).
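If the start times do need to be set by hand, one option is the -settime flag, which trjcat and eneconv both list in their help output and which prompts for a start time for each input file. The sketch below assumes a release in which input files are passed through -f (check trjcat -h for your version) and reuses the file names from the examples above. It prints the commands rather than running them, because -settime is interactive.

```shell
#!/bin/sh
# File names follow the examples above; the -f input style is an assumption
# about the installed version.

# -settime makes each tool ask for the start time of every input file.
trjcat_cmd="trjcat -f part1.trr part2.trr part3.trr -o whole.trr -settime"
eneconv_cmd="eneconv -f part1.edr part2.edr part3.edr -o whole.edr -settime"

# Dry run: show the commands instead of executing them.
printf '%s\n' "$trjcat_cmd" "$eneconv_cmd"
```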

It is of course possible to start a simulation from the coordinates in your .xtc file, but in that case new velocities have to be generated, which results in a 'kink' in the simulation. To prevent this, write coordinates and velocities to a .trr file during your simulations by setting nstxout and nstvout in your .mdp file. You don't need these frames very often (every 10 ps or so), but remember that when mdrun crashes, everything calculated after the last frame in the .trr file will have to be recalculated for a proper continuation.
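To translate an output interval in ps into values for nstxout and nstvout, divide it by the time step. The following sketch assumes a time step of 2 fs (dt = 0.002 in the .mdp file) and the 10 ps interval mentioned above; both numbers are illustrative. It works in femtoseconds so that plain integer shell arithmetic is enough.

```shell
#!/bin/sh
dt_fs=2            # time step in fs (assumed; dt = 0.002 ps in the .mdp file)
interval_ps=10     # desired spacing between .trr frames in ps

# Number of MD steps between two full-precision frames.
nst=$(( interval_ps * 1000 / dt_fs ))

# Lines to put in the .mdp file.
mdp_lines="nstxout = $nst
nstvout = $nst"
printf '%s\n' "$mdp_lines"
```

With these assumed values the script prints nstxout = 5000 and nstvout = 5000, so at most 10 ps of simulation would have to be recalculated after a crash.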
 