Gromacs

Handling Errors

    Version as of 00:37, 25 Aug 2019

    To make Gromacs behave like a proper library, we need to change how errors and similar conditions are handled. The library should not print anything to stdout/stderr unless the output is part of the API specification, and even then the user should have a way to suppress it. Also, the library should never terminate the program without the user having control over this. There are two different cases, which are discussed separately below. These issues are currently under discussion, and there are no concrete guidelines yet.

    1) The library meets an error after which it does not make sense to continue processing. It should then return an error code and let the caller decide what to do.

    • In this case, there should be a global list of possible error codes, and the library should return one of these. In addition, it should call an error handler with a more detailed description of the reason for the error. The default error handler could still abort the program, but the user could replace it if needed. The callback mechanism could be very similar to what is currently in gmx_error.h, but the return codes should be defined there as well. The global error codes should include at least:
      • Out of memory (without exceptions we are likely forced to abort() in these cases with C++, but the return value should still be there)
      • File not found
      • Other OS I/O error
      • Invalid user input
      • Simulation instability
      • Invalid API call/value/internal error (we can also have a policy that the program should assert in such cases)
    • If the error is in violation of the API for the function (e.g., the programmer passes an invalid value), it is OK to assert(). The basic rule is that assert() is OK for conditions that should never happen unless there is a programming error; it is not OK to assert() if user input is incorrect.
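
    The scheme in case 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: every name with the sk_ prefix is hypothetical and is not taken from gmx_error.h, and the example function sk_parse_timestep() is invented to show the split between assert() for API violations and an error code for bad user input.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical global error codes, following the list above. */
typedef enum {
    SK_OK = 0,
    SK_ERR_NOMEM,
    SK_ERR_FILE_NOT_FOUND,
    SK_ERR_IO,
    SK_ERR_INVALID_INPUT,
    SK_ERR_INSTABILITY,
    SK_ERR_INTERNAL
} sk_error_t;

/* Handler signature (an assumption): error code plus a detailed description. */
typedef void (*sk_error_handler_t)(sk_error_t code, const char *detail);

/* Default handler: print the description and abort the program. */
static void sk_default_handler(sk_error_t code, const char *detail)
{
    fprintf(stderr, "fatal error %d: %s\n", (int)code, detail);
    abort();
}

static sk_error_handler_t sk_handler = sk_default_handler;

/* Install a new handler (NULL restores the default); returns the old one. */
sk_error_handler_t sk_set_error_handler(sk_error_handler_t handler)
{
    sk_error_handler_t old = sk_handler;
    sk_handler = (handler != NULL) ? handler : sk_default_handler;
    return old;
}

/* Call the handler, then hand the code back so the caller can return it. */
static sk_error_t sk_fail(sk_error_t code, const char *detail)
{
    sk_handler(code, detail);
    return code;
}

/* Invented library function: parse a positive time step from user text. */
sk_error_t sk_parse_timestep(const char *text, double *dt)
{
    char  *end;
    double value;

    /* NULL arguments violate the API: a programming error, so assert(). */
    assert(text != NULL && dt != NULL);

    value = strtod(text, &end);
    /* Bad user input is not a programming error: report and return a code. */
    if (end == text || *end != '\0' || !(value > 0.0)) {
        return sk_fail(SK_ERR_INVALID_INPUT,
                       "time step must be a positive number");
    }
    *dt = value;
    return SK_OK;
}

/* Example user handler that records the last code instead of aborting. */
static sk_error_t sk_last_code = SK_OK;

static void sk_recording_handler(sk_error_t code, const char *detail)
{
    (void)detail;
    sk_last_code = code;
}
```

    A caller that does not want the program to terminate would install its own handler, for example sk_set_error_handler(sk_recording_handler), and then act on the return value of each library call.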

    2) The library wants to report a warning or a non-fatal error, but is still able to continue processing. An example is what grompp currently does with notes, warnings, and errors.

    • One way would be to have a common reporting interface for such cases. All library functions that potentially need it would take, as an extra parameter, an object that implements this interface, and could then call functions in the interface to report warnings. We could have a default implementation that would still write everything to stderr.
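
    One possible shape for such a reporting interface in C is a struct holding a function pointer and a user data pointer. The sketch below is an assumption-laden illustration: all sk_ names are hypothetical, and the check in sk_check_cutoff() uses made-up thresholds purely to trigger a warning and an error.

```c
#include <stdio.h>

typedef enum { SK_NOTE, SK_WARNING, SK_ERROR } sk_severity_t;

/* Hypothetical reporting interface: one callback plus user data. */
typedef struct sk_reporter {
    void (*report)(struct sk_reporter *self, sk_severity_t severity,
                   const char *message);
    void *data;
} sk_reporter_t;

/* Default implementation: write everything to stderr. */
static void sk_stderr_report(sk_reporter_t *self, sk_severity_t severity,
                             const char *message)
{
    static const char *const label[] = { "NOTE", "WARNING", "ERROR" };
    (void)self;
    fprintf(stderr, "%s: %s\n", label[severity], message);
}

sk_reporter_t sk_stderr_reporter(void)
{
    sk_reporter_t r;
    r.report = sk_stderr_report;
    r.data   = NULL;
    return r;
}

/* Alternative implementation: count messages silently. */
typedef struct { int count[3]; } sk_counts_t;

static void sk_count_report(sk_reporter_t *self, sk_severity_t severity,
                            const char *message)
{
    sk_counts_t *counts = (sk_counts_t *)self->data;
    (void)message;
    counts->count[severity]++;
}

sk_reporter_t sk_counting_reporter(sk_counts_t *counts)
{
    sk_reporter_t r;
    r.report = sk_count_report;
    r.data   = counts;
    return r;
}

/* Invented library function that takes the reporter as an extra parameter.
 * Returns 1 if the cut-off is usable, 0 otherwise; it never terminates the
 * program and never prints by itself. The thresholds are made up. */
int sk_check_cutoff(double cutoff, double box, sk_reporter_t *rep)
{
    if (cutoff > 0.5 * box) {
        rep->report(rep, SK_ERROR,
                    "cut-off is longer than half the box length");
        return 0;
    }
    if (cutoff < 0.8) {
        rep->report(rep, SK_WARNING, "cut-off is unusually short");
    }
    return 1;
}
```

    With this layout, a command-line tool could pass the stderr reporter, while a program embedding the library could pass a counting or logging reporter and so suppress all output.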

    Points for discussion:

    • How to handle functions that may fail as part of normal operation? E.g., a function that accesses data, and by design should also be callable when no data is available. These should not call the error handler, but should they return 0 and use another variable to report whether the call was successful? Or should we have designated error codes, e.g. all negative values, for such cases?
    • How to handle cases when the reason for the error is detected within a relatively deep call graph, but there is not enough information in that context to print an error message that's useful to the user? Five options:
      • Don't do anything, live with cryptic error messages.
      • Signal errors from the inner scopes with return values only and call the error handler only from an outer scope where enough context is known. Can make debugging harder, because the original reason for the error is no longer accessible if one breaks in the error handler. If this becomes a problem, could have a separate macro (ugh) that is used in the inner scope to call the error handler in development versions, but expands to nothing in release versions.
      • Pass enough information through all the function calls. If this information would not be otherwise needed for anything, it will make the code less modular.
      • Call the error handler from both scopes (with different values so that calls from different scopes can be recognized). Will make the error handler itself more complex.
      • Use the facilities from 2) above in such cases. Can easily result in overly complex code for handling simple errors.
    • Should the error handler be global, or thread-local?
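
    Two of the points above can be made concrete with a small sketch. The first part contrasts the two conventions for functions that may fail as part of normal operation (a designated negative value versus returning 0 plus a separate status variable). The second part shows the development-only macro idea for errors detected deep in the call graph. All names are hypothetical, SK_DEVELOPMENT is an assumed build flag, and the outer function writes its message into a buffer where a real implementation would presumably call the error handler.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct { const double *values; int count; } sk_series_t;

/* Convention A: a designated negative value means "no data available".
 * The error handler is not involved. */
enum { SK_NO_DATA = -1 };

int sk_peek_a(const sk_series_t *s, int i, double *out)
{
    if (i < 0 || i >= s->count) {
        return SK_NO_DATA;
    }
    *out = s->values[i];
    return 0;
}

/* Convention B: always return 0 (no error) and report availability
 * through a separate variable. */
int sk_peek_b(const sk_series_t *s, int i, double *out, int *available)
{
    *available = (i >= 0 && i < s->count);
    if (*available) {
        *out = s->values[i];
    }
    return 0;
}

/* Development-only reporting from inner scopes: prints the location in
 * development builds and expands to nothing in release builds. */
#ifdef SK_DEVELOPMENT
#define SK_DEV_REPORT(msg) \
    fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg))
#else
#define SK_DEV_REPORT(msg) ((void)0)
#endif

/* Inner scope: knows what went wrong, but not which file is being read. */
static int sk_inner_parse(const char *token)
{
    if (token[0] == '\0') {
        SK_DEV_REPORT("empty token");
        return 1;
    }
    return 0;
}

/* Outer scope: has the context needed for a message useful to the user. */
int sk_outer_read(const char *filename, const char *token,
                  char *msg, size_t msglen)
{
    int rc = sk_inner_parse(token);
    if (rc != 0) {
        snprintf(msg, msglen, "%s: empty field", filename);
    }
    return rc;
}
```

    Convention A keeps the signature short but requires callers to treat negative values specially; convention B keeps the return value reserved for real errors at the cost of an extra parameter.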
    Page last modified 13:13, 26 Aug 2010 by tmurtola