Gromacs

Data Structures

     

    Here you are introduced to the to the data structures and their connections as they are used inside mdrun and grompp. You might also want to take a look at the Programmer's Guide.

    t_inputrec

    (inputrec.h)

     int  eI;              /* Integration method 				*/
     int  nsteps;		/* number of steps to be taken			*/
    

    ...


    Basically just a big struct containing all the possible input parameters from the mdp files except the group options (like FREEZE, ENERGY, ...) which are in gmx_groups_t. See here if you want to add another option. The options are documented in more detail in the official Gromacs documentation. The inputrec is read from the tpr file by the master and scattered to all clients in mdrunner().

     

    gmx_ffparams_t

    (idef.h)

     int        ntypes; //interactions: atnr² lj + additional ones
     int        atnr; //number of atom types
     t_functype *functype; //array of ntypes describing the interaction type
     t_iparams  *iparams; //matrix of size atnr² with interaction params 
     real       fudgeQQ; //??
    

    All interactions between atoms not related to charge are described in this struct (so e.g. Lennard Jones, SETTLE, ...; see enum in idef.h). All inter two atom interactions are described in form of a symmetric atnr² sized matrix. To find the type of interaction (so is it LJ, BWHAM,...) between to atom types A and B (both integer) the functype must be looked up and the iparams must be accessed depending on the functype:


    //get type of atom i of molec type j
    int A=mtop->moltype[j].atoms.atom[i].type;
    //get type of atom l of molec type k
    int B=mtop->moltype[k].atoms.atom[l].type;
    t_functype interaction_type=mtop->ffparams.functype[B*atnr+A];
    switch(interaction_type)
    {
     case F_LJ:
      float c6=mtop->ffparams.iparams[B*atnr+A].lj.c6;
      float c12=mtop->ffparams.iparams[B*atnr+A].lj.c12;
      break;
     case F_MORSE:
      ....
    }
    

     


    Interactions that take more than two atoms make use of the t_ilist as defined in idef.h. These types of interactions are stored in the functype array after the two atom interactions so in the range functype[atnr*atnr .. ntypes]. The interaction types are indices into the t_ilist, where the atom indices required for the interaction are stored (except for two atom interactions, which don't use the t_ilist). mk_nbfp() translates the two body interaction parameters from this struct into run time data structures used by the kernels and update().

    gmx_moltype_t

    (topology.h)

     char          **name;         //molecule type name
     t_atoms	atoms;		//atom information
     t_ilist       ilist[F_NRE];   //interactions with more than two atoms
     t_block       cgs;            //charge groups
     t_blocka      excls;          //self exclusions
    

    Describes the complete molecule type except nonbonded two atom interactions (see gmx_ffparams_t). The charge groups define sets of atoms which are treated as a single particle (the center of mass of the atoms inside one group) for various spatial calculations (including domain decomposition, pbc, ...). Often each atom is an individual charge group. See t_block on how to define charge groups. This structure is used in several places to define groups of atoms. The exclusions probably define which intra molecular non bonded interactions are not calculated. All atom indices used inside gmx_moltype_t are relative to 0. So charge group and exclusion atom index 0 refers to the first atom of of an instance of this molecule type.

    gmx_molblock_t

    (topology.h)

     int           type;           //index into mtop->moltype
     int           nmol;           //number of molecule instances of this type
     int           natoms_mol;     //number of atoms
     int           nposres_xA;     //??
     rvec          *posres_xA;     //??
     int           nposres_xB;     //??
     rvec          *posres_xB;     //??
    

    gmx_moltype_t defines a molecule type (like a SPC water molecule) but not how many instances of it are present in the simulation. The molecule block structure describes how many instances of which molecule type are present and in which order the molecules are sorted (so e.g. how many SPC waters are there and what other kind of molecules are defined before and after those). Therefore each gmx_molblock_t can be used to describe nmol instances of molecules with the same molecule type (indicated by the structure member type as e.g., mtop->moltype[mtop->molblock[0].type]). The array of molecule blocks in gmx_mtop_t defines the order the molecules or atoms are sorted, and also defines the global atom index. For example, suppose we have five instances of molecules in molecule block 0 (the first block), mtop->molblock[0].nmol is equal to five; mtop->moltype[mtop->molblock[0].type] gives the molecule type of molecules in the block; mtop->moltype[mtop->molblock[0].type].atoms.nr gives the number of atoms in each molecule; the global index of the last atom in the block is mtop->moltype[mtop->molblock[0].type].atoms.nr * 5 - 1. The global atom index is used to find the global and local atom coordinates.

    t_atoms

    (atoms.h)

     int           nr;             //atom count in this molecule type
     t_atom	*atom;		//atom type information
    				//The following entries will not 	
    				//allways be used (nres==0)	 	
     char		***atomname;	//Array of pointers to atom name	
     char		***atomtype;	//Array of pointers to atom types	
     char		***atomtypeB;	//Array of pointers to B atom types	
     int		nres;		//Nr of residue names			
     char		***resname; 	//Array of pointers to residue names 	
     t_pdbinfo     *pdbinfo;       //PDB Information, such as aniso. Bfac 
    


    Lots of arrays of length nr. The atom array contains the important atom type informations like mass, charge, etc. The order of the atoms helps to define the global atom index.

    t_atom

    (atoms.h)

     real 		m,q;		//Mass and charge			
     real 		mB,qB;		//Mass and charge for Free Energy calc 
     unsigned short type;		//Atom type				
     unsigned short typeB;		//Atom type for Free Energy calc	
     int           ptype;		//??
     int 		resnr;		//Residue number (index into t_atoms::resname)			
     int           atomnumber;     //??
     unsigned char chain;          //??
    

    All parameters with "B" at the end of the name are used to interpolate between the original parameter (like "m") and the perturbed parameter (like "mB") using the current lambda value when doing free energy calculations. The type information is used to look up two body interactions in gmx_ffparams_t. The information contained in this struct is converted into the run time data structures in atoms2md().

    t_atomtypes

    (atoms.h)

     int           nr;             //number of atomtypes
     real         *radius;         //GBSA radius for each atomtype
     real         *vol;            //GBSA efective volume for each atomtype
     real         *surftens;       //implicit solvent surftens for each atomtype
     int          *atomnumber;     //???
    

    Atom type data for implicit generalized born solvent approximation.

    t_block

    (block.h)

     int nr; //number of blocks/groups	
     atom_id *index; //list of atom indices
     int nalloc_index; //length of the index list (normally nr+1)
    

    This structure defines groups of consecutive atoms. index is a list of atom index pairs. The first index of such a pair defines the starting atom index. The second index defines the end atom and the start of the next group. For example charge groups are defined like this.

    The following code iterates over the charge groups of molecule number i and checks whether any atom of a charge group is above a surface parallel to the x-y plane at z=1.0 (the variable surface_z):

    float surface_z=1.0f;
    
    t_block* cgs=&mtop->moltype[ mtop->molblock[i].type].cgs;
    
    t_atoms* matoms=&mtop->moltype[ mtop->molblock[i].type].atoms;
    
    for(k=0;k<mtop->molblock[i].nmol;k++)
    {
      for(l=0;l<cgs->nr;l++)//go through charge groups
      {			
        bool include_cg=FALSE;						
        for(j=cgs->index[l];j<cgs->index[l+1];j++)//go through the atoms in the charge group
        {
          int ind=mtop->mols.index[i]+k*matoms->nr+j;//global index
          if(state->x[ind][2]>surface_z)//global state
          {
            include_cg=TRUE;//one of the atoms in the current charge group is above the surface						
            break;
          }
        }
        if(include_cg)
        {
          //do something with the charge group...
          //e.g. include it in image charge calculations to simulate metal
        }
      }
    }
    

    This structure is reused many times to define different group types in a similar way.

    t_mdatoms

    (mdatom.h)

     real          tmassA,tmassB,tmass;  //unperturbed and perturbed and interpolated mass of the (partial) sytem
     int           nr;  //number of (local) atoms
     int           nalloc;  //allocation size of the arrays
     int           nenergrp;  //energy group count
     bool          bVCMgrps;  //??
     int           nPerturbed;  //number of perturbed atoms
     int           nMassPerturbed;  //number of mass perturbed atoms
     int           nChargePerturbed;  //number of charge perturbed atoms
     bool          bOrires;  //??
     real          *massA,*massB,*massT,*invmass; //arrays containing mass of each atom
     real          *chargeA,*chargeB; //arrays with (perturbed) charge of each atom
     bool          *bPerturbed;  // any perturbation going on?
     int           *typeA,*typeB;  //(perturbed) type of each atom
     unsigned short        *ptype; //??
     unsigned short        *cTC,*cENER,*cACC,*cFREEZE,*cVCM; //group info for each atom
     unsigned short        *cU1,*cU2,*cORF; //??
     bool          *bQM; //are we doing QM?
     int           start; // home atoms
     int           homenr; // home atoms
     real          lambda; // current interpolation value for free energy perturbation
    

    This contains the atom data used at run time (together with the forces and positions etc. in the state). The data is filled in at atoms2md() called from mdrunner() using the atom data from the topology. This also defines the home atoms. These are the atoms the node is supposed to process. If you want to add an additional force acting on the simulation atoms then see here.

    t_state

    (state.h)

    typedef struct
    {
     int           natoms;
     int           ngtc;
     int           nrng;
     int           nrngi;
     int           flags;  /* Flags telling which entries are present      */
     real          lambda; /* the free energy switching parameter          */
     matrix 	    box;    /* box vector coordinates                      	*/
     matrix     	box_rel; /* Relitaive box vectors to preserve shape    	*/
     matrix 	    boxv;   /* box velocitites for Parrinello-Rahman pcoupl */
     matrix        pres_prev; /* Pressure of the previous step for pcoupl  */
     real          *nosehoover_xi;  /* for Nose-Hoover tcoupl (ngtc)       */
     double        *therm_integral; /* for N-H/V-rescale tcoupl (ngtc)     */
     int           nalloc; /* Allocation size for x, v and sd_x when !=NULL*/
     rvec          *x;     /* the coordinates (natoms)                     */
     rvec          *v;     /* the velocities (natoms)                      */
     rvec          *sd_X;  /* random part of the x update for stoch. dyn.  */
     rvec          *cg_p;  /* p vector for conjugate gradient minimization */
    
     unsigned int  *ld_rng;  /* RNG random state                           */
     int           *ld_rngi; /* RNG index                                  */
    
     history_t     hist;   /* Time history for restraints                  */
    
     ekinstate_t   ekinstate; /* The state of the kinetic energy data      */
    
     energyhistory_t  enerhist; /* Energy history for statistics           */
    
     int           ddp_count; /* The DD partitioning count for this state  */
     int           ddp_count_cg_gl; /* The DD part. count for index_gl     */
     int           ncg_gl; /* The number of local charge groups            */
     int           *cg_gl; /* The global cg number of the local cgs        */
     int           cg_gl_nalloc; /* Allocation size of cg_gl;              */
    } t_state;
    

    Contains almost all of the non-static state of the system including position, velocity and charge group data for the home atoms. For further information on the component structures, see include/types/state.h.

    Under domain decomposition, t_state is used for both the local state and the global state in md.c. The latter is maintained ONLY by the master process (i.e. TRUE == DDMASTER(cr->dd)). Each node calculates the forces and position for its home atoms using the local state. Later the master gathers the locally updated data from all the clients and constructs an up to date global state before writing out the trajectory.

    For single-processor or particle-decomposition contexts, state == local_state, i.e. both have pointers to the same memory.

    Please fill in additional information

    t_forcerec

    (forcerec.h)

     bool bDomDec; // Are we using domain decomposition?
    
     int  ePBC; // What kind of PBC are we using?
     bool bMolPBC; //? Can molecules be infinite (e.g nanotubes)
     int  rc_scaling;
     rvec posres_com; //? centres of mass for position restraints
     rvec posres_comB;
    
     real rlist,rlistlong; // Radii of neighbourlists
     
     real zsquare,temp;
     real epsilon_r,epsilon_rf,epsfac;  // Dielectric constants, multiplication factor for charges (f in section 2.2 of the manual)
     
     real kappa,k_rf,c_rf; // Reaction field constants
    
     double qsum[2]; // Charge sum for topology A/B ([0]/[1]) for Ewald corrections
    
     /* The shift of the shift or user potentials */
     real enershiftsix;
     real enershifttwelve;
     /* Integrated differces for energy and virial with cut-off functions */
     real enerdiffsix;
     real enerdifftwelve;
     real virdiffsix;
     real virdifftwelve;
     /* Constant for long range dispersion correction (average dispersion)
      * for topology A/B ([0]/[1]) */
     real avcsix[2];
     /* Constant for long range repulsion term. Relative difference of about 
      * 0.1 percent with 0.8 nm cutoffs. But hey, it's cheap anyway...
      */
     real avctwelve[2];
     
     /* Fudge factors */
     real fudgeQQ;
    
     bool bcoultab; // Do we have a coulomb table?
     bool bvdwtab; // Do we have a VDW table?
     /* The normal tables are in the nblists struct(s) below */
     t_forcetable tab14; /* for 1-4 interactions only */
    
     /* PPPM & Shifting stuff */
     real rcoulomb_switch,rcoulomb;
     real *phi;
    
     /* VdW stuff */
     real rvdw_switch,rvdw;
     real bham_b_max;
    
     /* Free energy ? */
     int  efep;
     real sc_alpha;
     int  sc_power;
     real sc_sigma6;
     bool bSepDVDL;
    
     /* NS Stuff */
     int  eeltype; // The type of the electrostatic interaction for looking up kernel functions
     int  vdwtype; // The type of the VDW interaction for looking up kernel functions
     int  cg0,hcg;
     /* solvent_opt contains the enum for the most common solvent
      * in the system, which will be optimized.
      * It can be set to esolNO to disable all water optimization */
     int  solvent_opt;
     int  nWatMol;
     bool bGrid;
     int  *cginfo_global;
     int  *cginfo;
     rvec *cg_cm;
     int  cg_nalloc;
     rvec *shift_vec;
    
     /* The neighborlists including tables */
     int  nnblists;
     int  *gid2nblists;
     t_nblists *nblists;
    
     /* The wall tables (if used) */
     int  nwall;
     t_forcetable **wall_tab;
    
     /* This mask array of length nn determines whether or not this bit of the
      * neighbourlists should be computed. Usually all these are true of course,
      * but not when shells are used. During minimisation all the forces that 
      * include shells are done, then after minimsation is converged the remaining
      * forces are computed.
      */
     /* bool *bMask; */
    
     /* Twin Range stuff. */
     bool bTwinRange;
     int  nlr;
     int  f_twin_n;
     int  f_twin_nalloc;
     rvec *f_twin;
     rvec *fshift_twin;
    
     /* Forces that should not enter into the virial summation:
      * PPPM/PME/Ewald/posres
      */
     bool bF_NoVirSum;
     int  f_novirsum_n;
     int  f_novirsum_nalloc;
     rvec *f_novirsum;
    
     /* Long-range forces and virial for PPPM/PME/Ewald */
     gmx_pme_t pmedata;
     tensor    vir_el_recip;
    
     /* PME/Ewald stuff */
     bool bEwald;
     real ewaldcoeff; // beta from section 4.9.1 of the manual
    
     /* Virial Stuff */
     rvec *fshift;
     rvec vir_diag_posres;
     real vir_wall_zz;
    
     /* Non bonded Parameter lists */
     int  ntype; /* Number of atom types */
     bool bBHAM;
     real *nbfp;
    
     int *egp_flags; // Energy group pair flags
     real fc_stepsize; // xmdrun flexible constraints
    
     /* Generalized born stuff */
     /* VdW radius for each atomtype (dim is thus ntype) */
     real *atype_radius;
     /* Effective radius (derived from effective volume) for each type */
     real *atype_vol;
     /* Implicit solvent - surface tension for each atomtype */
     real *atype_surftens;
    
     /* If > 0 signals Test Particle Insertion,
      * the value is the number of atoms of the molecule to insert
      * Only the energy difference due to the addition of the last molecule
      * should be calculated.
      */
     bool n_tpi;
    
     gmx_ns_t ns; // Neighbor searching stuff
     bool         bQMMM; // Are we doing QMMM?
     t_QMMMrec    *qr; // Data for QMMM
     t_nblist QMMMlist_sr; // QM-MM short-range neighborlist
     t_nblist QMMMlist_lr; // QM-MM long-range neighborlist (not needed, one QMMM list suffices)
    
     real print_force; // Limit for printing large forces, negative is don't print
    
     /* User determined parameters, copied from the inputrec */
     int  userint1;
     int  userint2;
     int  userint3;
     int  userint4;
     real userreal1;
     real userreal2;
     real userreal3;
     real userreal4;
    

    Contains a lot of run time data. The struct is quite large and for most members I don't know exactly what they are good for.

    Contains pointer to the runtime two body interaction data constructed in mk_nbfp().

    Please fill in additional information

    t_commrec

    (commrec.h)

     int sim_nodeid; // processor index within a simulation
     int nnodes;     // number of processors
     int npmenodes;  // number of processors devoted to PME
     int threadid,nthreads;  // not actually used in the current version, 0 and 1 repsectively
     int nodeid;     // (global) processor index
     MPI_Comm mpi_comm_mysim, mpi_comm_mygroup; // MPI communicators
     gmx_nodecomm_t nc;  // for inter- and intra- node communication
     gmx_domdec_t *dd; // for domaint decomposition
     gmx_partdec_p_t *pd;  // for particle decomposition
     gmx_multisim_t *ms; // for a multiple-simulation run, e.g., replica exchange
    

    Required for mpi communication for both domain decomposition (gmx_domdec_t) and particle decomposition. Used in conjunction with the macros defined in commrec.h to figure out the parallel setup (who is master, what kind of decomposition, ...).

    The gmx_nodecomm_t structure nc is used to optimize communication for processors within a node or between different nodes.

    The sim_nodeid is usually identical to nodeid. However, for a multiple-simulation run with MPI disabled, sim_nodeid is the processor index within a simulation, while nodeid is the global processor index.

    gmx_domdec_t

    (commrec.h)

     ...
     rvec cell_x0; //cell box lower left
     rvec cell_x1; //cell box upper right
     ...
     gmx_ga2la_t *ga2la; //lookup table to translate a global index to a local index
     ...
    

    Contains everything required to do domain decomposition. When doing domain decomposition then atoms must not move more than one cell size out of the current cell. This can lead to problems with PBC when you're moving atoms with your own code rather than having the gromacs kernel do it. Because of the eighth-shell domain decomposition, only coordinates of certain atoms in the forward directions are available in addition to the local atoms. For non-bonded and two-body bonded interactions, these are atoms in the zones in the forward direction. Each zone consists of one or more domain decomposition cells. For multi-body bonded interactions only atoms up to one cell away (within a distance given by the -rdd option of mdrun) in the forward directions are available. For constraints and virtual extra communication is required which does not fall into the eighth-shell scheme. This communication happens in both directions, not more than one cell away. See the GROMACS 4.0 paper for more details on the domain decomposition.

    gmx_ga2la_t

     int     cell; //cell id (if 0 then atom is on this node, otherwise not)
     atom_id a; //local atom index
    

    Used to translate a global index into a local index when using domain decomposition. See here on how to do this.

    gmx_groups_t

    (topology.h)

     t_grps         grps[egcNR];   //index (group types (egcFREEZE,..)) defined in topolgy.h
     int            ngrpname;      //number of group names
     char           ***grpname;	//list of group names
     int            ngrpnr[egcNR]; //number of atoms within each group type
     unsigned char  *grpnr[egcNR]; //global atom indices of the atoms inside the group
    

    Here's how you'd add another freeze group (the way it is done here this MUST happen after the tpr file was loaded but before atoms2md() was called because we operate on global data:

    //first: group specific options in the inputrec:
    
    ir->opts.ngfrz++;
    int group_id=ir->opts.ngfrz-1; //index into ir->opts.nFreeze
    srenew(ir->opts.nFreeze,ir->opts.ngfrz); 
    //freeze in 3D:
    ir->opts.nFreeze[ ir->opts.ngfrz-1 ][0]=1;
    ir->opts.nFreeze[ ir->opts.ngfrz-1 ][1]=1;
    ir->opts.nFreeze[ ir->opts.ngfrz-1 ][2]=1;
    
    //second: add general group data:
    
    //new group name
    mtop->groups.ngrpname++;
    srenew(mtop->groups.grpname,mtop->groups.ngrpname); //resize groupname array
    mtop->groups.grpname[ mtop->groups.ngrpname-1]=put_symtab(&mtop->symtab,"new_group_name");
    mtop->groups.grps[egcFREEZE].nr++; //this is a new freeze group
    srenew(mtop->groups.grps[egcFREEZE].nm_ind,mtop->groups.grps[egcFREEZE].nr);
    mtop->groups.grps[egcFREEZE].nm_ind[ mtop->groups.grps[egcFREEZE].nr-1]=mtop->groups.ngrpname-1; //index into grpname
    
    //third:set atoms to group
    
    for(j=start_index;j<end_index;j++)
    {
      //freeze all atoms with global index between start_index and end_index:
      mtop->groups.grpnr[egcFREEZE][j]=group_id;
    }
    

    Most group types (see enum in topology.h) have group specific options which are NOT stored inside gmx_groups_t but mostly inside t_inputrec::opts.

    For example energy exclusion uses a symmetric ir->opts.ngener² sized matrix ir->opts.egp_flags[A*ir->opts.ngener+B] to define exclusions between different energy groups A and B.

    F

    Array of rvec (3 floats or doubles making a vector) defined in mdrunner() and passed down to all functions calculating forces on the simulation atoms. The offset of a simulation atom inside this array is defined by the local atom index. See here on how to add additional forces.

    X

    Array of rvec (3 floats or doubles making a vector) contained in the state. The offset of a simulation atom inside this array is defined by the local atom index.

    global index

    This section describes the global atom coordinate index into X. So the location of the atom attributes and the coordinate of the original atoms as supplied to grompp. When using domain decomposition (dd_partition_system) or some other options then mdrun uses a local index at runtime.

    There is no index from the atom type information in gmx_moltype_t to the coordinate array X. The coordinates are stored in the same order as the atoms are defined inside the topology using the gmx_molblock_t and gmx_moltype_t. As an additional aid the mols structure contains the offset (into X) of the first atom coordinate of each molecule. So to find the (global) coordinate of the i-th atom inside the j-th molecule


    state->x[ mtop->mols.index[j]+i] 
    

     


    must be accessed.

    To enumerate all global atoms and their coordinates the following code fragment can be used (remember, these are loaded from the tpr file and are converted to runtime coords using atoms2md; so modifying these after the have been converted doesn't make much sense):


    int index=0;
    for(j=0;j<mtop->nmolblock;j++) //enum all molecule types
    {
     for(k=0;k<mtop->molblock[j].nmol;k++) //enum all instances of each molecule type
     {
       for(l=0;l<mtop->moltype[ mtop->molblock[j].type].atoms.nr;l++) //enum atoms
       {
         t_atom* atom_info=&mtop->moltype[ mtop->molblock[j].type].atoms.atom[l];
         rvec* atom_position=&state->x[index];
         index++;
       }
     }
    }
    

     


    index

    Page last modified 17:50, 15 Nov 2009 by JLemkul?