Multiple Chains

    To generate a topology for a simulation of multiple distinct molecules (i.e. chains, in PDB-speak), we distinguish two cases: those where the chains are topologically identical, and those where they are not.

    Identical chains

    This is the easy case. Take a structure file containing a single chain and run pdb2gmx on it. (As usual, the structure does not need to be in PDB format.) Near the bottom of the resulting .top file you will see a section like

    [ molecules ] 
    Protein            1

    The name "Protein" (or whatever name pdb2gmx chose) is defined earlier in the file, in the [ moleculetype ] section. Simply increase the count from 1 to the number of chains you require. Then pass this .top file to grompp together with a coordinate file that contains that same number of chains.
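The count can be edited by hand, but for scripted setups a minimal Python sketch of the same edit is shown here. The function name set_molecule_count and the example topology text are illustrative assumptions, not part of GROMACS; the sketch only rewrites a matching line inside the [ molecules ] section.

```python
def set_molecule_count(top_text, name, count):
    """Return top_text with the count for `name` in [ molecules ] set to `count`."""
    out = []
    in_molecules = False
    for line in top_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Section headers look like "[ molecules ]"
            in_molecules = stripped.strip("[] \t").lower() == "molecules"
        elif in_molecules and stripped and not stripped.startswith(";"):
            fields = stripped.split()
            if len(fields) >= 2 and fields[0] == name:
                # Rewrite only the entry for the requested molecule name
                line = f"{name:<18} {count}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical, heavily abbreviated topology text for illustration only
example = """[ moleculetype ]
; Name            nrexcl
Protein             3

[ molecules ]
; Compound        #mols
Protein            1
"""

updated = set_molecule_count(example, "Protein", 4)
print(updated)
```

The [ moleculetype ] entry is left untouched because only lines inside [ molecules ] are considered. The coordinate file given to grompp must still contain the matching number of chains.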

    You can use editconf to create translated copies of the chains to concatenate later in a single file, or genconf to generate replicates of a whole box.

    Non-identical chains

    This is trickier. One can use a coordinate file format that distinguishes chains, such as the PDB format, which stores a one-letter chain identifier in column 22. The pdb2gmx program recognizes the different chain identifiers and writes each chain as a separate molecule, each with its own topology. For instance, chain A will correspond to a topology named topol_A.itp, chain B to topol_B.itp, etc. Note that in the resulting .top file, each chain will be written as a separate topology, even if the chains actually represent identical molecules (e.g., a homodimeric protein).
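Before running pdb2gmx, it can help to check which chain identifiers a PDB file actually contains. The following is a minimal Python sketch, assuming standard fixed-column ATOM/HETATM records; the function name chain_ids and the example records are hypothetical.

```python
def chain_ids(pdb_text):
    """Return the distinct chain identifiers in order of first appearance."""
    seen = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 22:
            chain = line[21]  # column 22 (1-based) is string index 21
            if chain not in seen:
                seen.append(chain)
    return seen


# Hypothetical records, truncated after the residue number for brevity
records = [(1, "GLY", "A", 1), (2, "ALA", "A", 2), (3, "GLY", "B", 1)]
example = "\n".join(
    f"ATOM  {serial:>5}  CA  {res} {chain}{resnum:>4}"
    for serial, res, chain, resnum in records
)

print(chain_ids(example))
```

A file without chain identifiers yields a single blank entry, since column 22 is then a space.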

    In GROMACS 4.5, the -chainsep argument to pdb2gmx lets the user control whether a PDB TER record should also start a new chain, and hence a new [ moleculetype ] section.

    Are separate chains necessary?

    Note that it is not necessary to use separate chain identifiers to separate chains into different molecules in order to conduct simulations or to run analysis in GROMACS. GROMACS doesn't really care what is contained in a [ moleculetype ] section, except that you can't have bonded interactions between different such sections. To analyze chains individually, you can create an index file that defines each chain by its residue numbers. For example, if you have a dimeric protein with each chain consisting of 200 amino acids, chain A might correspond to residues 1-200, and chain B to residues 201-400. Thus, at the make_ndx prompt, you can enter

    r 1-200

    for the first chain, and

    r 201-400

    for the second chain.  This way, you can conduct analysis on each chain separately without worrying about chain identifiers.
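For systems with many equal-length chains, the selection strings can be generated rather than typed. This is a minimal Python sketch, assuming the residues are numbered consecutively across chains as in the example above; the function name chain_selections is hypothetical.

```python
def chain_selections(n_chains, residues_per_chain, first_residue=1):
    """Return one make_ndx residue-range selection string per chain."""
    selections = []
    start = first_residue
    for _ in range(n_chains):
        end = start + residues_per_chain - 1
        selections.append(f"r {start}-{end}")
        start = end + 1  # the next chain begins right after this one
    return selections


# Reproduces the two selections from the dimer example above
print("\n".join(chain_selections(2, 200)))
```

Each resulting line can be entered at the make_ndx prompt to create one index group per chain.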

    Page last modified 01:30, 16 Sep 2010 by mabraham