Versions 4.6.x


    Release notes for fixes in the git repository made since 4.6.7 (but not released)

    • Fixed issue with vsite-n and Verlet buffers. A loop counter for a loop over vsite-n did not take into account that multiple entries make up one vsite-n particle. #1579.
    • Fixed the use of CUDA stream priorities #1594
    • Fixed warning message about incorrect usage of dihedral type 9. The warning printed the wrong type number: 4 instead of 9. Also, it didn't clarify that type 9 only combines consecutive lines.
    • Added warnings for ewald-geometry and surface-epsilon. Both ewald-geometry and surface-epsilon require the system dipole, which will be incorrect when charge groups with net charge cross periodic boundary conditions. grompp now checks and warns for this. #1645.
    • Fixed triclinic 1xNx1 domain decomposition. With the Verlet scheme, 1D triclinic domain decomposition along the y dimension produced incorrect bounding boxes for the non-bonded grid. This led to many missing non-bonded interactions, which quickly crashed any affected simulation. #1631 #1656.
    • Fixed problem with mixed affinity mask on different nodes. If task distribution (with slurm for instance) causes both fully allocated and not-fully allocated nodes to be assigned to the job, then there may be tasks with an all-cores affinity mask and tasks with a not-all-cores affinity mask. #1613
    • Re-fixed PME bug with high OpenMP thread count. PME energies and forces could be incorrect with combined MPI+OpenMP parallelization, when pmegrids->nthread_comm[YY] >= 2, which can only occur with high OpenMP thread count with multiple prime factors that are large with respect to the grid size. #1572.
    • Fixed issue with GPU non-local forces in systems with partially empty boxes (and only rarely in such cases). #1721.
    • Fixed g_energy average/RMSD bug. Made g_energy produce correct output for energy files from continued and appended runs with nstcalcenergy=nstenergy. #1342.
    • Fixed binary exact continuation for trajectories, by removing inadvertent extra call to remove COM motion. #1342.
    • Fixed edr appending and exact continuation. The nsteps field was not written to checkpoint files when nstcalcenergy=nstenergy. This caused differences in nsteps in appended energy files, which in turn caused issues in averages and RMSD in g_energy (which is now fixed by another patch). #1342.
    • Fixed mdrun -confout sometimes affecting final .edr frame. This makes a two-part run write .edr files that can be concatenated to be identical to that from a one-part run. Otherwise, a single-domain run might make molecules whole, do an update with the modified x vector, and write a slightly different final .edr frame, even though the .trr and .cpt (and thus the restart) were still fine.
    • Fixed issues with long quotes in quote database
    • Fixed single-precision numerical issues with soft-core and sc_power==48. The fix can also improve stability of lower-power soft-core free-energy calculations. #1580. #1306.
    • Fixed CUDA inter-stream synchronization issue. With the introduction of multiple hardware queues in CC 3.5 and later NVIDIA GPUs, the implicit dependency between tasks in the local and non-local kernel got eliminated. This could (rarely) lead to non-local interactions being calculated using coordinates (and charges) from the previous step.
    • Prevented possible execution of zero-sized work units in CUDA kernels, which would have led to wrong results, but does not seem to have occurred in practice. #1767.
    • Fixed handling of frequency for expanded ensemble free-energy evaluation.
    • Fixed missing COM removal for md-vv #1651

    Release notes for 4.6.7

    • Fixed PME bug with high OpenMP thread count. PME energies and forces could be incorrect with combined MPI+OpenMP parallelization. This would only happen when pmegrids->nthread_comm[YY] >= 2, which can only occur with high OpenMP thread count with multiple large prime factors. It's unlikely that this issue affected production runs. #1572.
    • Backported from 5.0 a fix that avoids a stack overflow on Windows with CMake > CMake used to add "/STACK:10000000" to the default linker flags. That was removed in version 2.8.11-rc1. The default value used by MSVC is apparently too small because mdrun crashes with a stack overflow when built on Windows with MSVC or ICC and CMake newer than
    • Fixed output of eigenvalues in g_covar. #1575. Commit 972032bfb8cd38 introduced a bug that caused eigenvalues to be written to the .xvg file only if "-last" was explicitly given on the command line; otherwise no eigenvalues appeared in the .xvg file. The eigenvalues are written in a loop from '0' to 'end', but since 'end' was initialized with '-1', the loop was never executed. This patch moves the code that computes 'end' one block upwards, before the output to file.
    • Fixed bugs in vsiteN with OpenMP #1579.
    • Fixed two PME issues with MPI+OpenMP. Change 272736bc partially fixed #1388, but broke the more general case of multiple MPI communication pulses in PME. Change 272736bc incorrectly changed tx1 and ty1; this change has been reverted. Change 27189bba fixed the incorrect PME grid reduction with multiple thread grid overlap in y, but it broke the much more common case where the y-size of the PME grid is not divisible by the domains in y, because it incorrectly changed buf_my. Now buf_my is set to the correct value, which solves both issues. #1578, #1388 and #1572.
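
The g_covar eigenvalue bug described above follows a common pattern: a loop bound is used before the code that computes it has run. The C sketch below reproduces that pattern in isolation. It is only an illustration under stated assumptions: the function names, the `last` parameter and the way the bound is resolved are hypothetical and are not taken from the GROMACS source.

```c
#include <assert.h>

/* Hypothetical stand-in for resolving the "-last" option:
 * a negative value means "not given, use all n eigenvalues". */
static int resolve_end(int last, int n)
{
    assert(n >= 0);
    return (last < 0 || last > n) ? n : last;
}

/* Buggy ordering: the output loop runs while 'end' still holds the
 * default of -1, so nothing is written unless -last was given. */
int count_written_buggy(int last, int n)
{
    int end     = last;
    int written = 0;

    for (int i = 0; i < end; i++)
    {
        written++; /* stands in for writing one eigenvalue */
    }
    end = resolve_end(last, n); /* computed too late to matter */
    (void)end;
    return written;
}

/* Fixed ordering: 'end' is resolved before the output loop. */
int count_written_fixed(int last, int n)
{
    int end     = resolve_end(last, n);
    int written = 0;

    for (int i = 0; i < end; i++)
    {
        written++; /* stands in for writing one eigenvalue */
    }
    return written;
}
```

With `last` left at -1 and five eigenvalues, the buggy ordering writes nothing while the fixed ordering writes all five; with an explicit `last` both orderings agree, which matches the symptom reported in the note.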

    Release notes for 4.6.6

    • Fixed constraint virial with multiple time stepping #1400
    • Fixed shift and switch modifiers, particularly for free-energy

      When using tabulated interactions (historically with PME-Switch), the previous free-energy kernels used tabulated interactions which gave correct results. However, as we have moved to using the new interaction modifiers, Ewald short-ranged interactions are computed analytically. To extend the range over which we apply the soft-core interaction, the free-energy kernels evaluated interactions by subtracting the reciprocal-space component, and then applying the free-energy evaluation to the Coulomb (1/r) short-range interaction. This works fine for vanilla PME, but led to problems when combined with a switch modifier, since we are switching a different function compared to the non-free-energy kernels. This could lead to large artefacts where the free energy was 100x off if we were applying the cutoff to r while the switch was applied to the scaled soft-core radius.

      This patch modifies the free-energy kernel so that the vanilla, shift, and exact-cutoff versions still use the compensation trick, while the switch modifier always operates on the traditional short-range Ewald functional form.

      The (very small) Ewald shift has also been added when computing free energy in combination with Ewald summation and potential-shift modifiers. As the perturbation goes to zero, the interaction will also approach the non-free-energy interactions. The new code has been tested to match the non-free-energy kernel to within 1e-8 in the fully coupled state; it conserves energy and produces reasonable free energies for ethanol in water.

      This also modifies table-generation, table-usage, and dispersion-correction code to use shift/switch forms (and correctly), when that has been selected in the interaction modifiers. This provides much more accurate results for our new shifted interactions. Correct (unmodified) tables are now generated for 1-4 interactions in a few corner cases in the presence of modifiers for non-bonded interactions. Code paths for using exact cutoffs now work correctly when rcoulomb-switch != rvdw-switch, or if only one kind of switch is active.

      Free-energy calculations using a plain Coulomb interaction now incorporate a potential shift if one exists. The GMX_NB_GENERIC environment variable can now be used to specify the use of the generic kernel even with shifts or switches active. #1463
    • Fixed bug in parallel v/f constraining, where 3 or more decomposition domains in one or more dimensions could lead to modification of communicated v and f components by the box size for inter charge-group constraints. #1462
    • Added cut-off checks for triclinic domain decomposition, where with two decomposition cells in a triclinic dimension, the cut-off could be longer than the size of the communicated domains. This could lead to some pairs close to the cut-off distance being ignored in the force/energy calculations. #1467
    • Fixed real->int truncation of nrdf in v-rescale thermostat #1218
    • Fixed sign error in dvdl for position restraints #1408
    • Made sure water optimization is disabled for esoteric interactions

    • Added some tweaks and some (temporary) suppressions for gcc4.8 warnings and Address Sanitizer errors
    • Fixed essential dynamics (ED) continuation from .cpt for reference=average ED runs
    • Passed on default value of radstep in make_edi -radfix
    • Made cmake -DGMX_BUILD_OWN_FFTW work without fortran compiler #1412
    • Added cmake FFTW_URL to allow easy offline build
    • Permitted GROMACS to build and run on Google Native Client
    • Fixed precision in g_energy thermal expansion coefficient calculation.
    • Clarified OpenMP-related things in mdrun help/man
    • Issue fatal errors rather than use broken shell code #1429
    • Reinstate shell code with DD #1429
    • Added fatal errors for normal modes with virtual sites or shells. #879
    • Made shells work with the Verlet scheme #1429
    • Added check in grompp for shells, because some combinations of inputrec settings do not work with shells, in particular nstcalcenergy > 1 or use of a twin-range cutoff. #1376
    • Reinstated shells with particle decomposition #1429
    • Implemented fatal error for nstcalcenergy!=1 with shells. #1376
    • Added check for domain decomposition and shells. #1376
    • Added safety check for the number of atoms in the fitting group in g_anaeig.
    • Avoided mdrun crash when RDTSCP is not supported in hardware, and also added a CMake advanced cache option for GMX_USE_RDTSCP. #1428
    • Fixed an aligned store to unaligned memory in PME
    • Added gmx_is_{single,double}_precision to allow easy third-party detection of the precision
    • Added discussion of portability to install guide #1428
    • Permitted warning-free use of Andersen thermostat
    • Added fatal error for Andersen+constraints+DD
    • Improved CUDA non-bonded kernel performance
    • Fixed incorrect grid cell size in g_sas -nopbc #1445
    • Made g_tune_pme honour the requirement that the van der Waals radius must equal the Coulomb radius with Verlet cut-off scheme #1460
    • Fixed a rare cross product with zero vector in rotational pulling. #1431
    • Prevented generating a .tpr for leapfrog + MTTK. Added note about deprecation and planned removal of MTTK + constraints. #1292
    • Fixed g_rmsdist NOE calculation #1463
    • Fixed memory error in ASCII-format IO
    • Enabled shared libraries by default on Hurd
    • Fixed clang 3.5 warnings regarding abs() family
    • Prohibited AVX_256 with buggy gcc 4.6.1 #1259
    • Fixed a complicated bug in g_anaeig and g_covar, if the number of frames for a covariance analysis is fewer than the number of degrees of freedom.
    • Added a note about unsupported Verlet cutoff + Buckingham. #1192
    • Added a note about using direction-periodic pulling. #1352
    • Improved the g_lie help description. #1353
    • Added fatal errors for VV and twin-range multiple-time stepping #1400.
    • Fixed memory issue in genbox #1499
    • Fixed g_energy -vis Einstein viscosity bug introduced in version 4.5. #1516
    • Updated C-/N-terminal partial charges in Amber03.ff. #1466
    • Fixed typo in description of conversion factors.
    • Issued a warning for using gmx_rms -prev with large trajectories. #716
    • Processed negative sigma correctly with combination rule 2 or 3. #1391
    • Avoided writing xvgr formatting with -xvg none #1407, #1479
    • Mapped HIS1 to HSD in pdb2gmx for charmm27. #1133
    • Made sure genrestr uses the disre_up2 parameter #1357
    • Made gmxcheck -rmsd work again with forces
    • Fixed perturbed wall interactions #1501
    • Added dummy mass for charmm TYR aromatic vsites #587
    • Forced g_bond to use structure file to get unbroken molecular connectivity #834
    • Fixed memory-usage error in g_chi #1503
    • Made file-appending error message more helpful #1497
    • Made sure figure legends are adapted to xmgr/xmgrace #783
    • Fixed no-impact bug in PME with #OpenMP-threads a large prime #1388
    • Removed rounding issue in nbnxn Ewald table
    • Fixed that XDR-format file skipping fails if target is 2nd frame #1154
    • Updated mdrun -npme documentation #1374
    • Printed replica-exchange time step information in higher precision. #1486
    • Fixed a replica-exchange output problem when running NPT simulations
    • Disabled replica exchange when T not in order #1377
    • Fixed CMake 2.8.12+ CUDA dylib bugs and warnings on OS X #1471
    • Fixed error with reference distance in constraint pull code
    • Fixed that XTC-format seeking was wrong if frame time was modified #1405, #1406
    • Fixed typo in oplsaa ASPH charges. #1395
    • Fixed g_mindist with pbc=XY #1189
    • Made the message for undefined pullgroups clearer. #1446
    • Added tip5p to the list of watermodels in some force fields. #1348
    • Extended the version checking and warnings for checkpoint continuation #1230
    • Fixed the sign of mass perturbed contribution #1527
    • Fixed MacOSX rpath
    • Fixed segmentation fault in g_current #1532
    • Fixed detection of i386 in thread-MPI #1533
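
The v-rescale fix above (#1218) concerns a real-valued number of degrees of freedom (nrdf) being truncated to an integer. The C sketch below shows, under stated assumptions, how such a truncation changes a quantity that depends on nrdf, here the reference kinetic energy 0.5 * nrdf * kB * T. The helper names are hypothetical and the code is not the GROMACS thermostat implementation; it only illustrates why passing a fractional nrdf through an `int` parameter loses information.

```c
#include <assert.h>

/* Boltzmann constant in kJ mol^-1 K^-1 (the unit system used by GROMACS). */
#define BOLTZ 0.0083144621

/* Hypothetical helper: reference kinetic energy 0.5 * nrdf * kB * T. */
double ekin_ref(double nrdf, double temperature)
{
    assert(nrdf >= 0.0 && temperature >= 0.0);
    return 0.5 * nrdf * BOLTZ * temperature;
}

/* Same quantity, but with the degrees of freedom passed through an int
 * parameter: a real->int conversion at the call site drops the fraction. */
double ekin_ref_truncated(int nrdf, double temperature)
{
    return ekin_ref((double)nrdf, temperature);
}
```

For a small temperature-coupling group with, say, 2.9 degrees of freedom, the truncated variant effectively uses 2, so the relative error is largest for groups with few degrees of freedom.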

    Release notes for 4.6.5

    • Fixed GPU-load balancing and GPU-sharing bugs introduced in a bug fix in 4.6.4 #1385
    • Fixed possible Verlet-scheme memory issue with SIMD on 32-bit builds
    • Fixed variable name in documentation
    • Fixed harmless bug with useless sorting of bonded interactions in free-energy calculations #1387

    Release notes for 4.6.4

    • Implemented plain-C SIMD macros for testing, reference, and development #1173
    • Introduced general 4-wide SIMD support for PME spread+gather and the NBNxN searching
    • Minor optimization to generic SIMD invsqrt #1333
    • Fixed bug with the SIMD padding in the case of very long group-scheme neighbor lists #1341.
    • Fixed rare division by zero in SIMD angles and dihedrals #1351
    • Fixed the NBNxN buffer size estimate to be more correct in the presence of vsites, and be less conservative overall. This enhances performance, particularly with GPUs.
    • Consolidated and refactored NBNxN SIMD kernel utility routines, and bounding box data structures. Split up Verlet SIMD kernels for faster compilation.
    • Added BlueGene/Q Verlet cut-off scheme kernels, enhancements to CMake handling, support for bgclang (but latest compiler does not yet work with OpenMP), support for A2 core and QPX SIMD in CPU detection, updates to install guide.
    • Removed (harmless) left-over in NBNxN SIMD kernels, improving performance of PME + pressure coupling by about 5%.
    • Fixed atom sorting with NBNxN kernels with bonded interactions at long ranges
    • Reorganized GPU detection, and selection. Clarified reporting on and documentation of GPU usage.
    • Fixed GPU detection to occur only once per physical node #1358
    • Enabled GPU sharing among tMPI ranks
    • Corrected dynamic load balancing when MPI ranks share GPUs
    • Added support for using CUDA texture objects, using CUDA stream priorities, and compilation that can optimize for CUDA compute capability 3.5
    • Fixed tMPI_Atomic_memory_barrier for Xeon Phi
    • Enhanced configure-time support for testing for the presence of atomic operations. #1355
    • Fixed issue where OpenMP threads could be pinned to the same cores #1360
    • Fixed multiple distance restraints with OpenMP #1316
    • mdrun without OpenMP with thread-MPI now uses all cores #1317
    • Corrected volume with serial NPT replica exchange #1362
    • Fixed restart from checkpoint re-initializing Wang-Landau weights in expanded ensemble calculations. #1350
    • Disabled expanded ensemble for all integrators but md-vv #1321
    • Fixed problem caused by numerical overflow in expanded ensemble #1314
    • Fixed a problem with type 2 pair interactions in free-energy calculations #1315
    • Fixed reaction-field free energy bug #1318
    • Fixed logic for free energies with mdrun -rerun #1330
    • Fixed sc-coul description in manual #1331
    • Made free energy PME kernel 40% faster
    • Fixed minor admin issues (pkg-config, install guide, date and version stamps, removing intermediate and output files from repo, cmake warning and error messages, cmake functionality, ctest usage, compiler warnings, fixing typos)
    • Fixed possible dereference of null pointer on compute nodes that lack getpwuid() #1301
    • Fixed pdb2gmx -vsite hydrogen -o conf.pdb
    • Removed buggy -seppot output #1294
    • Fixed a half bin misalignment in gmx_vanhove -or
    • Fixed limitations in number of frames g_cluster can handle
    • Various fixes to g_tune_pme (no assumption that MPI is available, better error messages) #1319
    • Fixed reading history_t from checkpoint #1174
    • Fixed total time measurement with separate PME nodes #1325
    • Fixed mdrun -version to finalize MPI properly #1313
    • Fixed parallel normal modes with PME. Also fixed some EM/NM output layout. #1308
    • Fixed GMX_DD_DUMP_GRID output
    • Fixed minor wallcycle output issues
    • Corrected grompp constraint/DOF warning with vsites #1322
    • Fixed uninitialized error in g_enemat #1312
    • Updated configure-time management of linear algebra libraries #771,#1186
    • Enabled finding mkl.h on more icc versions #1110
    • Updated dlist.c to help analysis tools like g_chi recognize more atom names.

    Release notes for 4.6.3

    • Fixed concurrency issues with thread affinity settings with or without MPI #1270, #1254. Note that #1254 issue 3 seems to be an OpenMPI bug
    • Split cmake thread detection into core thread and thread_mpi
    • Fixed deadlock on Mac OS X during thread affinity check
    • Fixed inconsistent locality_order array for thread affinity
    • Fixed issues with code for atomic operations with XLC and on K-computer #1284, #1274
    • Really fixed SD and BD integrator OpenMP performance #1121
    • Fixed deadlock with replica exchange and -maxh #1273
    • Fixed DD internal state corruption during energy minimization #1272
    • Fixed syntax for FAHCORE in CMakeLists.txt
    • Fixed flops table to report CUDA analytical Ewald contribution
    • Fixed a problem when dhdl and replex are not multiples
    • Added CUDA compiler flags to version header
    • Fixed mdrun build to work with ORCA #1286
    • Corrected definition of EEL_USER #1289
    • Removed genion non-random insertion of ions, also stopped writing useless log file, and added warning about -conc option over-riding -nn or -np. #1236, #615, and #1208
    • Fixed minor g_bar output issues
    • Fixed -npstring command line argument in g_tune_pme
    • Added GMX_NO_CREDITS environment variable to disable showing credits. #1267
    • Removed unused variables
    • Fixed a CMake typo #1280
    • Fixed various CMake and compiler issues for BlueGene/Q #1281, #1282, #1283
    • Minor tweaks to qm_orca.c
    • Fixed some inconsistencies in pkg-config files

    Release notes for 4.6.2

    • Fixed linking with FFTW when using MKL for BLAS & LAPACK #1067
    • Made FFT detection more quiet when nothing was changing
    • Updated mechanism and documentation for linking to MKL for FFT, BLAS and LAPACK #1110 #1186
    • Added compiler flags to suppress over-zealous warnings from gcc 4.8
    • Added workaround for clang compiler on AMD FMA processors #1099
    • Fixed automated download of correct regression tests when working with git version
    • Added install guide section for BLAS & LAPACK #1186
    • Corrected grompp rvdw charge-group radii check #1164
    • Updated grompp cut-off and PME-Switch checks if the switch distance is too large #1179
    • Allowed PME to work with different numbers of OpenMP threads available on different MPI ranks #1171
    • Fixed PME time printing with -ntomp not equal to -ntomp_pme #1158
    • Improved PME load balancing by checking for grid restrictions
    • Improved load balancing with Verlet cut-off scheme
    • Fixed mdrun walltime reporting in .log file #1210
    • Added acceleration paths suitable for Fujitsu Sparc64, particularly group kernels for the K computer.
    • Added SIMD acceleration for angles and dihedrals
    • Refactored NxN kernels to use generic SIMD operations
    • Added CUDA PME kernels with analytical Ewald correction
    • Fixed mdrun -nsteps to handle large numbers of steps #1224
    • Fixed v-rescale thermostat to work with tau-t >= 0
    • Fixed copious issues with free-energy calculations, including 1-4 interactions #1225, Urey-Bradley angles in CHARMM27 #1115, perturbations of mass #1232, reruns from non-trr input #1240, bond constraints #1255
    • Added work-around for no-PBC, infinite-cutoff so that these work correctly (but are slower than 4.5-era code) #1249 #1095
    • Introduced fatal error for generalized Born free-energy calculations, since these do not work #1237
    • Fixed memory leak in expanded ensemble code #1265, and other fixes
    • Fixed reading old .tpr files with dihedral restraints #1194
    • Fixed enforced rotation with Verlet cut-off scheme #1155
    • Fixed g_hydorder and PBC #1238
    • Fixed incorrect scaling of cos-acceleration viscosity #1244
    • Improvements to OpenMM build (but still broken)
    • Fixed bugs in g_select subexpression handling #1216 #1219
    • Fixed g_sgangle legends
    • Fixed legend in essential dynamics output file
    • Fixed bug with g_mindist #1183
    • Fixed g_chi -omega to follow IUPAC definitions of dihedrals #953
    • Clarified g_gyrate -h #934
    • Fixed error in g_select online help #1262
    • Fixed -neutral option in genion, and -nq and -pq with ions of charge > 1 when using -neutral
    • Added GROMOS96 54A7 files from ATB website #773
    • Many minor enhancements to ThreadMPI
    • Fixed potential race conditions in ThreadMPI thread creation with pthreads #1254
    • Fixed thread offset & stride check
    • Improved thread-pinning informational messages
    • Fixed a thread-safety issue in affinity layout detection #1254
    • Improved logic about whether to switch to polling GPU wait
    • Fixed several remaining references to outdated license

    Release notes for 4.6.1

    • increased shared object major version to 8 (should have done this for 4.6, sorry) #1147
    • updates to HTML manual, install guide, PDF manual, shell completions
    • copious minor bug fixes
    • various build system upgrades and fixes #1143
    • new and enhanced error messages
    • fixes for AdResS bugs (neighbour list construction, flop accounting, multiple tf tables)
    • fixed PME timing counter issues #1125
    • fixed PME load balance reporting
    • fixed forcerec to work with tools like genion and g_disre #1136
    • various GPU performance enhancements
    • fixed sd integrator with OpenMP threading #1138
    • various minor fixes for interacting with CUDA for GPUs
    • fixes for g_tune_pme to cope with new mdrun behaviour and changed command-line options (for both g_tune_pme and mdrun)
    • more checks for system support for setting thread affinities
    • removed inter-flag dependency in g_order
    • fixed issues with free-energy perturbation soft-core and cut-offs #1146
    • fixed issues with md-vv + nose-hoover + (nstcalcenergy > nsttcouple) #1129
    • incorporated new changes from release 4.5.x branch
    • prevented building with icc 11.1 and SSE4.1 because of known problems #1126
    • added warning about not building with icc version < 12 #1126
    • fixed bug sorting atoms with GPUs introduced since 4.6 #1153
    • fixed issues with automated download of regression tests #1150
    • fixed bug with DD cut-off check and PME dynamic load balancing #1169

    Release notes for 4.6 (2013-01-19)

    New features
    • New Verlet non-bonded scheme which, by default, uses exact cut-offs and a buffered pair-list.
    • Multi-level hybrid parallelization (MPI + OpenMP + CUDA):
      • full OpenMP multithreading with the Verlet scheme;
      • OpenMP multithreading for PME-only nodes with the group scheme;
      • native GPU acceleration using CUDA (supports NVIDIA hardware).
    • New x86 SIMD non-bonded kernels for both the usual cut-off scheme (called the group scheme) and the new Verlet scheme use x86 SIMD intrinsics (no more assembly code):
      • SSE2
      • SSE4.1
      • AVX-128-FMA (for AMD Bulldozer/Piledriver)
      • AVX-256 (for Intel Sandy/Ivy Bridge)
    • Improved PME spread, gather, solve and FFT communication, total improvement ~25% on x86.
    • Automated OpenMP thread count choice to use all available cores.
    • Automated CPU affinity setting: locking processes or threads to cores.
    • Automated PP-PME (task) load-balancing: balancing non-bonded force and PME mesh workload when the two are executed on different compute resources (i.e. CPU and GPU, or different CPUs). This enables GPU-CPU and PP-PME process load balancing by shifting work from the mesh to the non-bonded calculation.
    • PPPM/P3M with analytical derivative at the same cost and with the same features as PME.
    • New, advanced free energy sampling techniques.
    • AdResS adaptive resolution simulation support.
    • Enforced rotation ("rotational pulling")
    • Build configuration now uses CMake; configure+autoconf/make is no longer supported. (The CMake build system features a lot of automation and cleverness under the hood, and we know that it might not always prove to be as rock-solid as the old one. However, it is far more advanced and complex, so bear with us while we iron out issues that come up along the way.)
    • Improved regressiontests; these can now be run directly from the build tree using make check
    • g_hbond now utilizes OpenMP.
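
The idea behind the buffered pair-list with exact cut-offs mentioned in the feature list can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the GROMACS nbnxn implementation: it uses an O(N^2) search, no periodic boundary conditions and no SIMD clustering, and all names (`vec3`, `pair_t`, `build_pair_list`, `count_within_cutoff`) are hypothetical. The list is built with a radius rlist = rc + buffer, and the interaction loop then applies the exact cut-off rc to the current coordinates, so particles that drift inside rc between list updates are still found.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, z; } vec3;
typedef struct { size_t i, j; } pair_t;

/* Squared distance between two points (no periodic boundaries). */
static double dist2(vec3 a, vec3 b)
{
    double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

/* Build the pair list with the buffered radius rlist = rc + buffer.
 * Returns the number of pairs stored in 'pairs'. */
size_t build_pair_list(const vec3 *x, size_t n, double rlist,
                       pair_t *pairs, size_t max_pairs)
{
    size_t npairs = 0;

    for (size_t i = 0; i < n; i++)
    {
        for (size_t j = i + 1; j < n; j++)
        {
            if (dist2(x[i], x[j]) < rlist * rlist)
            {
                assert(npairs < max_pairs); /* caller must size the list */
                pairs[npairs].i = i;
                pairs[npairs].j = j;
                npairs++;
            }
        }
    }
    return npairs;
}

/* Loop over the (possibly outdated) list and apply the exact cut-off rc
 * to the current coordinates. Counting stands in for a force kernel. */
size_t count_within_cutoff(const vec3 *x, const pair_t *pairs,
                           size_t npairs, double rc)
{
    size_t count = 0;

    for (size_t k = 0; k < npairs; k++)
    {
        if (dist2(x[pairs[k].i], x[pairs[k].j]) < rc * rc)
        {
            count++;
        }
    }
    return count;
}
```

In this sketch a pair at distance 1.05 with rc = 1.0 and a buffer of 0.2 is listed but not counted; if the particles later move to a distance of 0.95 without a list rebuild, the same list still yields the interaction, which is the purpose of the buffer.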

    No critical bugfixes. This version is based on 4.5.6 and all important fixes are "inherited" and therefore documented in the 4.5.6 release notes.

    Changes that might affect your results

    None for simulations set up with the traditional group cut-off scheme.

    When switching from the group scheme to the Verlet scheme, integration of the equations of motion can get more accurate due to the exact cut-off treatment and buffering (this will, of course, depend on the original cut-off settings used). See the section Cut-off schemes for details.

    Other important changes compared to 4.5
    mdrun now sets thread affinities
    This means that when running multiple mdrun processes on the same machine, one has to either provide a core "pin offset" using the -pinoffset command line option, or turn off internal affinities and take the performance hit (or alternatively manage affinities externally).
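
As a rough illustration of the pin-offset arithmetic, the C sketch below computes non-overlapping offsets for several mdrun processes that each use the same number of threads, and formats a matching command line. It assumes that the offset counts logical cores and that consecutive cores are used; the helper names and the run file names are hypothetical, and whether additional pinning options are needed should be checked with mdrun -h for the installed version.

```c
#include <assert.h>
#include <stdio.h>

/* Offset of the first core for the process with the given index, assuming
 * every process uses threads_per_process consecutive logical cores. */
int pin_offset(int process_index, int threads_per_process)
{
    assert(process_index >= 0 && threads_per_process > 0);
    return process_index * threads_per_process;
}

/* Format an mdrun command line for one process into buf.
 * Returns the snprintf result (number of characters needed). */
int format_mdrun_command(char *buf, size_t size,
                         int process_index, int threads_per_process)
{
    return snprintf(buf, size,
                    "mdrun -ntmpi 1 -ntomp %d -pinoffset %d -deffnm run%d",
                    threads_per_process,
                    pin_offset(process_index, threads_per_process),
                    process_index);
}
```

For two processes with four threads each, this gives offsets 0 and 4, so the second process starts on the first core not used by the first one.
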
    The choice of compiler matters more
    With the switch to SIMD intrinsics, up-to-date SIMD CPU acceleration support, and OpenMP, the compiler used matters more, both in terms of the ability to compile GROMACS correctly and from the point of view of mdrun performance. The recommended compilers that are known to work (=compile GROMACS correctly) and provide good performance on x86/AMD64 are: gcc 4.5 and later, Intel Compilers 12.0 and later and clang 3.1 (note the lack of OpenMP support which can cause 30%+ performance loss). In all cases you are strongly advised to use the most recent patch level available. GROMACS makes extensive use of compiler intrinsics to get the most out of your hardware, so if you use a compiler that is older than your hardware you are asking for trouble, because all the compilers have had bugs in their intrinsics implementations. For further details see ???.
    Bugfixes and improvements since beta3
    • fixed performance bug with SD+BD integration and OpenMP multi-threading (#1121)
    • fixes to free energy code, output & g_bar compatibility (#1090)
    • fixed multi-threading with hybrid GPU+CPU mode (#1100)
    • fixed GB interactions (#1096)
    • fixed issues with pressure control and infrequent evaluation
    • fixes for md-vv and rerun
    • fixed resetting states with parallel Verlet scheme
    • fixed nbnxn no LJ comb.rule AVX256 PME kernel
    • fix for compiler flag handling (#1038, #1040)
    • fixed bug with Verlet + DD + bonded atom communication
    • fixed SSE/AVX compilation under Windows (#1092, #1093, #1068)
    • fixed a bug with multiple exchanges
    • fixed nbnxn AVX-256 Ewald table pointer alignment
    • fixed GMXRC so we are not polluting standard shell variables
    • thread-MPI fixes for i386 llvm & simplification of atomics
    • added work-around for gcc bug in AVX intrinsics formal parameter


    • regressiontests can be run from the build tree now (make check)
    • efficiency improvements for PP-PME load balancing + DD DLB
    • using topology information for thread affinity setting
    • added mdp option 'calc-lambda-neighbors'
    • made g_tune_pme work correctly again with thread-MPI mdrun
    • Blue Gene build system support and documentation
    • made SHAKE work again with particle decomposition
    • g_sans: added trajectory averaging


    4.6-beta3 (2012-12-22)

    Bugfixes and improvements
    • fixed pressure in MTTK when using constraints and dispersion correction (#1061)
    • fixed expanded ensemble and Hamiltonian replica exchange
    • fixed Andersen temperature coupling use of random number generator in parallel
    • fixed Andersen temperature coupling when constraints present
    • fixed LINCS with virtual sites - there were bugs with how LINCS did constraining based upon the forces introduced when some parallelization was added previously
    • forced c++ linking of GPU utility routines
    • fixed GPU pair search when not using x86 CPU acceleration (#1042, #1062)
    • fixed compilation with PGI compiler
    • fixes to pull code (#1071)
    • fixed license details reported by GROMACS tools
    • completed removal of Fortran kernels (we are not aware of any systems where these would run faster than the corresponding non-accelerated C kernels by enough to be worth our effort, and probably the new force-only C kernels will be faster than the old Fortran kernels on any system where the disparity between Fortran and C compiler optimization is noticeable; speak up if any of this is a problem for you!)
    • completed removal of Power6 accelerated kernels (currently we lack the resources to implement accelerated kernels for Power architectures, and probably the new force-only C kernels will show results comparable with the old accelerated Power kernels; speak up if any of this is a problem for you - particularly if you have resources to offer to fix it!)
    • re-implemented Verlet kernels on AVX-256 hardware for better performance
    • prepared Verlet kernels for future development of non-x86 SIMD support (e.g. BlueGene SIMD acceleration support is planned)
    • improvements to functioning and documentation of mechanism to specify GPU IDs to mdrun 
    • improved communication when using P-LINCS and only constraints on bonds to hydrogen
    • GROMACS provides template code for user implementations of a custom GROMACS tool in share/template, which now builds by default (and correctly!), but does not install
    • fixed header file for use by external code linking to GROMACS
    • fixes to the Reference CMake build type we used for generating reference versions of our regression tests
    • added CMake option to disable printing of GROMACS cool quotes
    • minor improvements to CMake messages to user
    • CMake cleanup

    4.6-beta2 (2012-12-06)

    Bugfixes and improvements
    • re-enabled AdResS feature (only generic kernels for now);
    • improved OpenMP parallelization performance of non-bonded force calculation with Verlet scheme;
    • fixed segv in Verlet pair-search with triclinic domain-decomposition;
    • fixed incorrect virial with virtual sites and OpenMP;
    • fixed labelling of g_hbond plots;
    • fixed compilation issue with cmake 2.8.10 and GPU acceleration;
    • fixed issues with multi-sim runs and GPU-acceleration.

    4.6-beta1 (2012-11-30)

    First beta, yay*! See the release notes above.

    (*No previous version in the 4.6 series so no list of bugfixes and improvements here.)

    Page last modified 14:00, 14 Oct 2015 by mabraham