Gromacs

g_select

    The g_select program is similar to make_ndx in that it is used to create index groups.  In contrast to make_ndx, however, g_select supports a much more advanced selection syntax and can create dynamic indices.  While make_ndx produces a group of fixed size that is applied to whatever coordinate file or trajectory file the user may have, a dynamic index can change in size from frame to frame.  Support for dynamic indices in other GROMACS programs is currently very limited, though it is under active development.

    Understanding the Input

    All of the input files that g_select accepts are listed as optional, because the correct invocation and options depend upon what the user intends to do.  For processing a single coordinate file, nothing more than the -s flag is needed.  For processing a trajectory with -f, the user must also supply a structure file to the -s flag for proper mapping of residue and atom names and numbers.  The -n flag allows the user to supply an index file containing groups beyond the defaults, thus expanding the number of groups that can be used in selections.  This topic is discussed in greater detail later on this page.  The -sf flag allows the user to store their selection syntax in a text file that is read by g_select.  This method is useful for very complex selections, or when multiple selections need to be made to arrive at the desired result.
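    As a rough sketch of the two invocation modes described above, the commands might look like the following.  The file names (conf.gro, topol.tpr, traj.xtc) and the water-oxygen selection are placeholders chosen for illustration, not part of the original text; -on and -os are two of the output options listed by g_select -h.  The commands are guarded so that nothing runs if g_select or the input files are missing.

```shell
# Hypothetical input file names; substitute your own.
structure=conf.gro
tpr=topol.tpr
traj=traj.xtc

# Placeholder selection: water oxygens (assumes residue name SOL, atom name OW).
selection='resname SOL and name OW'

have_inputs() {
    # Succeed only if g_select is on PATH and every listed file exists.
    command -v g_select >/dev/null 2>&1 || return 1
    for f in "$@"; do
        [ -f "$f" ] || return 1
    done
}

# Single coordinate file: -s alone supplies coordinates and names.
if have_inputs "$structure"; then
    g_select -s "$structure" -select "$selection" -on water_oxygen.ndx
else
    echo "skipped single-structure run"
fi

# Trajectory: -f supplies the frames, -s the atom and residue names.
if have_inputs "$tpr" "$traj"; then
    g_select -f "$traj" -s "$tpr" -select "$selection" -os size.xvg
else
    echo "skipped trajectory run"
fi
```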

    Understanding the Output

    Please refer to g_select -h for a discussion on available output files.  This information will not be recapitulated here.

    Selection Syntax

    To view the instructions for creating selections and to access several examples, issue the following command:

    g_select -select 'help all'

    The documentation provides a large amount of information, and several points from its examples are discussed here.  First, a selection provided to the -select flag must be enclosed in single quotes (' ') to be interpreted properly.  Without single quotes, the shell splits the selection into multiple arguments and the program fails.  Now, to break down one of the examples provided by the above command:

    "Close to protein" resname LIG and within 0.5 of group "Protein"

    To use this selection on the command line, enclose the entire string in single quotes, i.e.:

    '"Close to protein" resname LIG and within 0.5 of group "Protein"'
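    The effect of the single quotes can be checked in the shell itself, without running g_select.  In this small sketch, count_args is a helper written only for this illustration; it reports how many separate arguments the shell hands to a command.

```shell
# Helper for illustration only: print the number of arguments received.
count_args() {
    echo "$#"
}

# Quoted: the whole selection arrives as a single argument.
quoted=$(count_args '"Close to protein" resname LIG and within 0.5 of group "Protein"')

# Unquoted: the shell splits on whitespace and strips the inner double quotes.
unquoted=$(count_args "Close to protein" resname LIG and within 0.5 of group "Protein")

echo "quoted:   $quoted argument(s)"    # prints 1
echo "unquoted: $unquoted argument(s)"  # prints 9
```

    Only in the single-quoted form do the inner double quotes around the selection name and the group name reach g_select intact.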

    Now what does this selection do?  The first part ("Close to protein") sets a name that will be written to the output index file (if one is requested).  The remainder of the string specifies how the selection is made: atoms in a residue named LIG ("resname LIG") that are within 0.5 nm of a group called "Protein" are written to the index group.  The use of the "group" keyword implies that the name that follows is present in an index file supplied with -n.  Similarly, one could create a selection of waters close to aspartate residues with the following:

    "Close to Asp" resname SOL and within 0.5 of resname ASP

    The use of "resname ASP" does not require an index group, as g_select will simply parse through the input coordinate file(s) and look for residues named ASP.
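    Put together as full command lines, the two selections above might be used as in the following sketch.  The file names topol.tpr and index.ndx and the output names given to -on are assumptions made for illustration.  Following the text above, the first command passes an index file with -n because its selection uses the "group" keyword, while the second does not.

```shell
# Hypothetical file names; substitute your own.
tpr=topol.tpr
ndx=index.ndx

# Refers to an index group by name, so an index file is passed with -n.
sel_lig='"Close to protein" resname LIG and within 0.5 of group "Protein"'

# Uses only residue names found in the structure, so -n is not needed.
sel_asp='"Close to Asp" resname SOL and within 0.5 of resname ASP'

if command -v g_select >/dev/null 2>&1 && [ -f "$tpr" ]; then
    # Run the group-based selection only when the index file is present.
    [ -f "$ndx" ] && g_select -s "$tpr" -n "$ndx" -select "$sel_lig" -on close_to_protein.ndx
    g_select -s "$tpr" -select "$sel_asp" -on close_to_asp.ndx
else
    echo "g_select or $tpr not found; nothing was run"
fi
```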

    Using selection.dat

    In some cases, selections are more complex and cannot be written in a single statement.  In these cases, providing a text file to the -sf flag of g_select is very useful.  There are two main differences in the selection syntax relative to using -select.  The first is that strings do not need to be enclosed in single quotes.  The second is that each line must end with a semicolon (;).  Let's say you need to create an index group of water oxygen atoms that are within a certain distance of two proteins, named Protein_A and Protein_B.  These proteins will be defined in an index file, allowing us to call them by group name.  Our selection.dat file would look something like:

    waterO = group "SOL" and name OW;
    between = waterO and within 0.5 of group "Protein_A" and within 0.5 of group "Protein_B";
    between;

    The first line assigns all OW atoms in water ("SOL") to a new selection variable called "waterO" that can be used later.  The second line creates another variable, "between", containing the waterO atoms that fall within 0.5 nm of Protein_A and also within 0.5 nm of Protein_B.  The last line returns the selection so that g_select knows which group we want written to our output, much like a function returns a value in the source code of a program.
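    One way to create and use such a file from the shell is sketched below.  The here-document writes the three statements exactly as shown above; the semicolon check is a convenience added for this example; and the input names topol.tpr and index.ndx, as well as the output name between.ndx, are assumptions.  The g_select call runs only if the program and both input files are present.

```shell
# Write the statements to selection.dat. Quoting 'EOF' stops the shell
# from expanding anything inside the here-document.
cat > selection.dat <<'EOF'
waterO = group "SOL" and name OW;
between = waterO and within 0.5 of group "Protein_A" and within 0.5 of group "Protein_B";
between;
EOF

# Sanity check: count non-empty lines that do not end with a semicolon.
missing=$(grep -v ';[[:space:]]*$' selection.dat | grep -c . || true)
echo "statements missing a semicolon: $missing"

# Hypothetical input names; the index file must define SOL, Protein_A and Protein_B.
if command -v g_select >/dev/null 2>&1 && [ -f topol.tpr ] && [ -f index.ndx ]; then
    g_select -s topol.tpr -n index.ndx -sf selection.dat -on between.ndx
fi
```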

    Page last modified 17:40, 14 Apr 2012 by JLemkul