Gromacs

Programmer's Guide

    Introduction

    After debugging my way through the GROMACS sources for a while in order to add a specific effect, I thought I could save others some trouble and write this summary of how GROMACS, or more specifically mdrun, works. So if you want to know which data structure inside mdrun holds which data at runtime, how to add new parameters to the mdp input file, or maybe just want to get a general feeling for how mdrun propagates the system from one state to the next, then this is the right place.

    Disclaimer

    This summary reflects the state of GROMACS 4.0.2 and does not claim to be complete. The information was gathered while adding code to the md code path without QM, running on a single core and with MPI-based domain decomposition. While this should be the most commonly used code path, there might be other paths which expect additional constraints on the data. Your mileage may vary.

    Notes

    If you are not looking for help on a specific data structure but want to see how mdrun works in general, then skip ahead to the section "How does mdrun work?" below.

    If you haven't been working with automake-based projects before then you might appreciate a few hints to get you started quickly. I have tried most of the IDEs listed here and quite a few others on current Ubuntu (8.10). None of them is actually perfect and most of them lack quite a few basic features which I'm used to (mostly the features are claimed to be supported but they only work in special cases). The only IDE I found that handles automake projects gracefully (so you can compile distinct binaries instead of the whole project without much pain and also debug them) is KDevelop (3.5.3). You can import the GROMACS sources into a project and handle compilation and debugging via the automake manager. Sadly, the auto-completion of struct members doesn't work for the slightly archaic way the structs are defined in GROMACS. This is supposed to be fixed in KDevelop 4, which is still in beta.

    I added code examples where appropriate, but you'll probably want to look at the GROMACS source code while using this. Where possible I added comments to the individual struct members and a general comment on what's going on below the struct definition. If there is no helpful comment next to a member or below the struct then I didn't know any better. Please add new or improved comments where possible.

    Debugging

    A few hints for debugging mdrun (the GROMACS kernel).

    • GROMACS can be forced to use non-assembly routines for the force calculations (easier to debug) by defining "NOASSEMBLYLOOPS=1" as an environment variable (e.g. put "export NOASSEMBLYLOOPS=1" without the quotes in your /etc/profile)
    • Parallel simulations can be debugged easily with gdb (the GNU debugger) by making a script file and a debug command file. The debug script DEBUG looks something like:

    #!/bin/csh -f
    xterm -e gdb -x gdb_cmds /home/hessb/gmx/obj/g_x86_64/src/kernel/mdrun
    


    The gdb_cmds file can contain any GDB debug commands. One would usually want a breakpoint in the fatal error routine. If you want to debug in do_force, the command file could look like this:

    break _gmx_error
    break do_force
    run -v
    


    To debug 4 processes, simply run the script in parallel, which will pop up 4 xterms with gdb output:

    mpirun -np 4 ./DEBUG
    


    • When you do not have access to gdb, parallel simulations can be debugged by adding something like the following to main():

     //NEW (gethostname(), getpid() and sleep() require #include <unistd.h>)
     void wait_debugger()
     {
        int i = 0;
        char hostname[256];
        gethostname(hostname, sizeof(hostname));
        printf("PID %d on %s ready for attach\n", getpid(), hostname);
        fflush(stdout);
        while (0 == i) // break here and set i = 1 from the debugger
            sleep(5);
     }
     //END NEW
    ...
    int main(int argc,char *argv[])
    {
     //NEW
       static bool bWaitForDebugger = FALSE; //default: don't wait for a debugger
     //END_NEW
    ...
     //list of command line arguments to be parsed
     static t_pargs pa[] = {
       { "-pd",      FALSE, etBOOL,{&bPartDec},
         "Use particle decompostion" },
    ...
     //NEW: new command line argument
        { "-mpidebug", FALSE, etBOOL, {&bWaitForDebugger},
          "Wait for a debugger to attach to each MPI process" }
     //END_NEW
      }; 
    ...
     parse_common_args(&argc,argv,PCA_Flags,
       NFILE,fnm,asize(pa),pa,asize(desc),desc,0,NULL);
    //NEW (must be after parse_common_args())
      if(bWaitForDebugger)
        wait_debugger(); //called if -mpidebug is given on the command line
    //END_NEW
    



    The code modifications indicated above cause all processes spawned by mpirun to enter an infinite loop after printing their process id and hostname if the new command line option -mpidebug is given (it defaults to off). So you can start mdrun using MPI on your local system like this:

     mpirun -np 2 $GROMACS_PATH/mdrun -s input.tpr -c coords.gro -o out.trr -mpidebug
    

    This will spawn 2 processes; more are possible (even if you're on a single-core machine).

    Afterwards you attach one debugger to each process id printed to stdout (the text might be buried somewhere in the standard GROMACS output). Then you break at the while loop ("break here" comment in the code above) and set i = 1 using the debugger (for each node/process id!). Now you can set additional breakpoints and start debugging. There are nicer ways to do this, but for simple stuff it is fine.

    • There is a set of tests provided with GROMACS. If you intend to submit your changes you should really make sure you have tested them both with MPI support and with the test set, so the likelihood of breaking GROMACS is minimized.

    Data Structures

    See Data Structures for an introduction to the data structures used inside the simulation kernel and related applications.

    How does mdrun work?

    The program mdrun is the main program of GROMACS for molecular dynamics simulations and other purposes (energy minimization, normal mode analysis, NMR refinement, etc).

    What follows is a list of functions approximately in the order they are called by mdrun with a few descriptions on what they do and other notes. The source code (mainly md.c and mdrun.c) for mdrun can be found under the directory src/kernel.

    main()

    (mdrun.c) The main() function prepares a communication record cr, parses command line arguments, and opens new log files. Then it passes this information to mdrunner().

    • cr = init_par() : initialize the communication record cr.
    • parse_common_args(): parse program parameters.
    • init_multisystem(cr) : modify the communication record cr for multiple-system simulations (e.g., replica exchange).
    • prepare other parameters, and call mdrunner().

    mdrunner()

    (md.c)

    The mdrunner() function first prepares an input record, a topology record, a force record, as well as runtime information, such as local topology and system configuration, then calls an integrator such as do_md() to perform simulation.

    This calls one of do_md(), do_steep(), do_cg(), do_nm(), do_lbfgs() or do_tpi(), selected from a table of function pointers ({ do_md, do_steep, do_cg, do_md, do_md, do_nm, do_lbfgs, do_tpi, do_tpi, do_md }) that is indexed by the integrator type defined in the mdp file (several integrator types share the same function). From here on we will assume that do_md was selected!
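
    As a rough sketch (this is not the literal GROMACS code; the real integrator functions take the log file, communication record, input record, topology, state and more as arguments), the dispatch looks like this:

     /* simplified sketch of the integrator dispatch in mdrunner() */
     typedef void integrator_t(void);

     void do_md(void); void do_steep(void); void do_cg(void);
     void do_nm(void); void do_lbfgs(void); void do_tpi(void);

     static integrator_t *integrator[] =
       { do_md, do_steep, do_cg, do_md, do_md, do_nm, do_lbfgs, do_tpi, do_tpi, do_md };

     /* ... inside mdrunner(): ir->eI is the integrator type read from the tpr file */
     integrator[ir->eI]();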

    do_md()

    (md.c)

    • dd_partition_system(): distributes the atoms among the clients (if MPI is enabled) and creates an atom index
    • atoms2md(): translates the atom type information from gmx_moltype_t into the runtime structures (t_mdatoms)
    • update_mdatoms(): interpolates the masses of perturbed atoms to the current lambda value; the charges are handled in the PME code, and currently I don't know where the LJ parameters are interpolated

     while(step<nsteps)
     {
      dd_partition_system(): redistribute the atoms at every neighbor-search step, calls atoms2md() with an index
      do_force(): calculate forces (bonded and nonbonded) acting on the (home) atoms (no position update here)
      write_traj(): write the current coordinates to the trajectory file (gathers the coordinates beforehand if MPI-enabled)
      update(): tcoupl (temperature coupling), constraints, second part of leap frog, update the X coords
      step++;
     }
    



    Additional Important Functions

    The following functions did not really fit the "flow chart" style function listing for mdrun. Still they are too important to be left out.

    do_force()

    (sim_util.c)

    • dd_move_x(): synchronize X coords with neighbor clients (if MPI-enabled)
    • do_ns(): neighbor searching
    • do_force_lowlevel(): calc bonded and nonbonded forces acting on the (home) atoms
    • dd_move_f(): synchronize forces with neighbor clients (if MPI-enabled)
    • calc_virial(): calc the virial tensor for the home atoms based on the local atom coordinates and forces
    • applies periodic boundary conditions if domain decomposition is not enabled

    do_force_lowlevel()

    (force.c)

    • do_walls(): calc forces and virial for walls
    • do_nonbonded(): calc forces and virial from nonbonded interactions
    • calc_bonds(): calc bonded interactions
    • gmx_pme_do(): particle mesh Ewald (or a similar function for a different electrostatics type)

    do_nonbonded()

    (nonbonded.c)

    • uses nb_kernel_list[nblists->nlist_sr[i].il_code] (function pointers to kernels, normally optimized for the architecture) to calc the forces
    • the nblists are constructed during neighbor search
    • GROMACS can be forced to use non-assembly routines for these calculations (easier to debug) by defining "NOASSEMBLYLOOPS=1" as an environment variable (e.g. put "export NOASSEMBLYLOOPS=1" without the quotes in your /etc/profile)

    atoms2md()

    (mdatoms.c)

    • copies the attributes of the (home) atoms from gmx_moltype_t to the runtime data structure t_mdatoms
    • if an index is supplied (by the user or by domain decomposition) then the index contains the global atom indices. So if an atom has offset i in the global topology (global index) then with an index the atom ends up at the position j where index[j]==i (see index, local index; a sketch follows below)
    • only the attributes in t_atom are copied here; other interaction parameters like Lennard-Jones from gmx_ffparams_t are handled in mk_nbfp()
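
    The following is a conceptual sketch of this mapping (not the literal atoms2md() code; the real function walks the molecule blocks of the topology, here "global_atom()" just stands in for that lookup and "nindex" for the length of the index):

     /* local atom j is a copy of global atom index[j] */
     for (j = 0; j < nindex; j++)
     {
         int i = index[j];                      /* global atom number of local atom j */
         md->massT[j]   = global_atom(i)->m;    /* mass */
         md->chargeA[j] = global_atom(i)->q;    /* charge of state A */
         md->typeA[j]   = global_atom(i)->type; /* atom type of state A */
     }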

    mk_nbfp()

    (force.c)

    • copies the Lennard-Jones or the Buckingham (BHAM) parameters from gmx_ffparams_t into a symmetric matrix of size atnr²
    • the interaction parameters for types a and b can be found via matrix[b*atnr+a]; the code uses the C6(), C12() and other macros for this (see the sketch below)
    • these parameters are used at runtime in the [assembly] kernels
    • the parameters are stored in the t_forcerec struct defined in mdrunner()
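
    A minimal sketch of the lookup, assuming the C6()/C12() macros take the parameter matrix, the number of atom types and the two type indices (check the macro definitions in the headers for the exact form); ti and tj are the atom types of the pair:

     /* fetch the Lennard-Jones parameters for the type pair (ti,tj) from the
        matrix built by mk_nbfp(); fr is the t_forcerec, fr->ntype corresponds
        to atnr above */
     real c6  = C6 (fr->nbfp, fr->ntype, ti, tj);
     real c12 = C12(fr->nbfp, fr->ntype, ti, tj);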

    dd_partition_system()

    (domdec.c)

    • only called if MPI and domain decomposition are enabled
    • applies periodic boundary conditions
    • divides the system into spatial cells and distributes the atoms among them
    • creates the cell-specific index and calls atoms2md() on each client
    • only neighboring cells communicate with each other (MPI)
    • the domain decomposition data structures are in "cr->dd"
    • to find an atom from its global index see the section "local index" below

    write_traj()

    (stat.c)

    • first gathers all coordinate info from clients to master if MPI is enabled
    • then dumps current coords into trajectory file

    MPI

    For details on the MPI API see [1] and [2]. Generally, the domain decomposition and the related communication happen inside domdec.c, domdec_con.c and domdec_network.c (this is where the MPI calls are made).

    communication

    Most of the communication going on requires all nodes which are part of a communication to perform the same (collective) communication operations in the same order. For example, all clients enter the write_traj() function after the same number of steps even though they don't actually write trajectories (only the master does). Within this function the clients prepare the data (atom coordinates) the master expects (what the master expects has been communicated beforehand) and call MPI_Gather to send this data. The master enters write_traj(), sets up buffers to receive the different coordinate sets from the clients and then calls MPI_Gather to receive the client data. So there is no dedicated communication protocol when sending or receiving data that tells the receiver what kind of data it is actually receiving. This information is implicit in that all nodes (master and clients) call the same collective MPI functions in the same order and frequency.
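
    An illustrative sketch of this pattern (not the actual GROMACS code; the buffer names, counts and displacement arrays are made up, and the real code uses its own wrappers and communicators):

     /* every rank makes the same collective call; the receive arguments are
        only significant on the root (master) rank */
     if (MASTER(cr))
     {
         MPI_Gatherv(x_local, 3*homenr, MPI_DOUBLE,
                     x_global, recvcounts, displs, MPI_DOUBLE,
                     0, MPI_COMM_WORLD);
     }
     else
     {
         MPI_Gatherv(x_local, 3*homenr, MPI_DOUBLE,
                     NULL, NULL, NULL, MPI_DOUBLE,
                     0, MPI_COMM_WORLD);
     }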

    The client nodes (which own one of the domain decomposition cells per node) can communicate only with the master and neighboring nodes. So if data from a node is required that is more than one cell away then this fails!

    The inter-client communication takes place mostly using non-blocking MPI_Irecv/MPI_Isend pairs or combined MPI_Sendrecv calls. The synchronization of the master with the clients happens using gather and scatter.
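
    As an illustration, a neighbor exchange could look roughly like this (a sketch, not the literal domdec_network.c code; rank_forward, rank_backward and the buffers are made-up names and the communicator is simplified):

     /* exchange coordinates with the two neighbors along one decomposition
        dimension in a single combined send/receive */
     MPI_Sendrecv(buf_send, n_send*sizeof(rvec), MPI_BYTE, rank_forward,  0,
                  buf_recv, n_recv*sizeof(rvec), MPI_BYTE, rank_backward, 0,
                  MPI_COMM_WORLD, MPI_STATUS_IGNORE);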

    Only the initial setup communication in mdrunner() uses blocking mpi calls.

    local index

    The local atom indices are constructed by the master in dd_partition_system() (the atoms are distributed among the available nodes based on a spatial subdivision of the simulation box). Only the master retains a global state, which is updated from the local states on each node before the current atom positions are written to the trajectory file using gather. Each node calculates forces and positions for its home atoms and communicates these results with the neighboring cells (dd_move_x() and dd_move_f() calls in do_force()). Each node constructs its local state from the index supplied by the master in atoms2md(). In order to find the local coordinates of an atom given its global index (i.e. its global offset into X), the atom must be looked up in the global-to-local atom lookup table:


     int cell_id     = cr->dd->ga2la[global_index].cell;
     int local_index = cr->dd->ga2la[global_index].a;
     if(0 == cell_id)
     {
      // cell 0 means the atom is a home atom of this node
      rvec* atom_position = &local_state->x[local_index];
     }
     else
     {
      // atom is NOT a home atom of this node
      // if the atom is on a neighbor node then its position can be acquired using special communication
     }
    



    Each node has access to a copy of the topology information, so that the global atom index can be found for each local atom.
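
    For the reverse direction (local to global) there is, as far as I know, the dd->gatindex array (a sketch assuming that array; check domdec.c if it has moved):

     /* look up the global atom number of local atom local_index */
     int global_index = cr->dd->gatindex[local_index];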

    home

    The home atoms on a node are those atoms that the node is supposed to handle. On a single-core run these are all atoms. With MPI and domain decomposition each node only handles a few atoms (or only the Ewald summation or similar). Which atoms are home atoms of a specific node is determined by the master in dd_partition_system(), and the actual runtime atom data is composed in the t_mdatoms structure in atoms2md(). So if you want to add another force term acting on the simulation atoms you must iterate over the local atoms like this (do_walls() in wall.c, part of mdlib, might be a good place to see how a simple additional force is handled):


     for(i=md->start; i<md->start+md->homenr; i++) //iterate over home atoms
     {
      rvec* atom_pos = &state->x[i];   //get atom position
      real  mass     = md->massT[i];   //get atom mass (massT is already interpolated for perturbed atoms)
      real  q        = md->chargeA[i]; //charge of state A
      int   type     = md->typeA[i];   //atom type of state A
      //note: free energy perturbation (the B state charges/types) is not handled here
      f[i] += F(atom_pos, q, mass, ...); //pseudocode: add the force, dependent on position and other stuff
      //f is the force array defined in mdrunner()
      //F() is a user defined function calculating the force
     }
    



    The virial contribution of forces in the force array is handled automatically in calc_virial().

    If at all possible, don't modify atom positions/forces in places where the original code doesn't do it. Quite likely you will break some assumptions about charge group centers of mass or the domain decomposition, which will cause weird crashes or, worse, subtle numerical hiccups which are a nightmare to find.

    gather

    All clients communicate their current local state to the master using MPI_Gather.

    scatter

    The master distributes data, like the current domain decomposition layout, to all clients using MPI_Scatter. The amount of data transmitted to each individual client has been communicated beforehand (setup_dd_communication() in domdec.c).

    How do I add parameters to grompp (mdp file)

    In order to add new input parameters, grompp, mdrun and a few tools must be modified:

    • increment file version: static const int tpx_version in tpxio.c
    • add the new variable to inputrec in inputrec.h (e.g. int new_option;).
    • add another section to the end of do_inputrec() in tpxio.c:

    ...
     if(file_version >= NEW_VERSION) //NEW_VERSION is the number you set tpx_version to
     {
       //here add the code to read/write the new data
       do_int(ir->new_option); //do_int is for ints, do_real for reals etc.
       //the do_xxx functions take care of both reading from and writing to the tpr file
     }
    



    • add code to parse new value from the mdp file to get_ir() in readir.c:

     ITYPE ("feature_name",   ir->new_option,  0);//ITYPE for ints, RTYPE for floats, etc; last param is default value
    


    Note: There is also EETYPE to translate strings like "yes", "no", ... into an enum value; check the code inside tpxio.c and readir.c to find out how to handle these. The same goes for ndo_int() etc.

    • add input rec dump code to pr_inputrec() in txtdump.c
    • add input rec compare code to cmp_inputrec() in tpbcmp.c
    • add checks to verify the input rec data to check_ir() in readir.c (see the sketch below)
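
    A minimal sketch of such a check, assuming the sprintf(err_buf, ...)/CHECK() pattern used by the existing checks in check_ir() (adapt it to whatever helpers your readir.c actually provides):

     /* hypothetical range check for the new option inside check_ir() */
     sprintf(err_buf, "feature_name (%d) must be >= 0", ir->new_option);
     CHECK(ir->new_option < 0);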

    After recompiling AND rerunning both grompp and mdrun (make sure you also recompile the libraries these applications need; the makefiles don't handle these dependencies very gracefully) the new option should be available inside mdrun in the inputrec.

    After having implemented the new option you might want to come back here and add documentation to this wiki as well as the Gromacs documentation that comes with the code.

    • add documentation for your new option to share/html/online/mdp_opt.html