/**\page devguide Developer Guide This page presents instructions on how to develop code using \imp. Developers who wish to contribute code back to \imp or distribute their code should also read the \ref contributing "Contributing code to IMP" page. \tableofcontents \section gettingaround Getting around The input files in the \imp directory are structured as follows: - \c tools contains various command line utilities for use by developers. They are \ref scripts "documented below". - \c doc contains inputs for general \imp overview documentation (such as this page), as well as configuration scripts for \c doxygen. - \c applications contains various applications implementing using a variety of \imp modules. - \c kernel and the subdirectories of \c module/ each defines a module and have the same structure. The directory for module \c name has the following structure - \c include contains the C++ header files - \c src contains the C++ source files - \c bin contains C++ source files each of which is built into an executable - \c pyext contains files defining the Python interface to the module as well as Python source files (in \c pyext/src) - \c test contains test files. When \c scons \c test or \c scons \c name-test is run each file in this directory named \c test_ is executed (after being built if it is a .cpp file) - \c doc contains the overview documentation for the file (in the \c SConscript file) as well as any other documentation that is provided via \c .dox files - \c examples contains examples, in Python as well as any data needed for examples - \c data contains any data files needed by the module When \imp is built, a number of directories are created in the build directory. They are - \c include which includes all the headers. The headers for the \c kernel are placed in \c include/IMP and those for module \c name are placed in \c include/IMP/name - \c lib where the C++ and Python libraries are placed. Module \c name is built into a C++ library \c lib/libimp_name.so (or \c .dylib on a mac) and a Python library with Python files located in \c lib/IMP/name and the binary part in \c lib/_IMP_name.so. - \c doc where the html documentation is placed in \c doc/html and the examples in \c doc/examples with a subdirectory for each module - \c data where each module gets a subdirectory for its data. When \imp is installed, the structure from the \c build directory is moved over more or less intact except that the C++ and Python libraries are put in the (different) appropriate locations. \section usage Writing new functions and classes The easiest way to start writing new functions and classes is to create a new module using the \ref make_module "make-module script". This creates a new module in the \c modules directory, complete with %example code. We highly recommend using a revision control system such as \svn or \external{http://git-scm.com/, GIT} to keep track of changes to your module. If, instead, you choose to add code to an existing module you need to consult with the person who people who control that module. Their names can be found on the module main page. When designing the interface for your new code, you should - search \imp for similar functionality and, if there is any, adapt the existing interface for your purposes. For %example, the existing IMP::atom::read_pdb() and IMP::atom::write_pdb() functions provide templates that should be used for the design of any functions that create particles from a file or write particles to a file. Since IMP::atom::BondDecorator, IMP::algebra::Segment3D and IMP::display::Geometry all use methods like IMP::algebra::Segment3D::get_point() to access the endpoints of a segment, any new object which defines similar point-based geometry should do likewise. - think about how other people are likely to use the code. For %example, not all molecular hierarchies have atoms as their leaves, so make sure your code searches for arbitrary IMP::core::XYZDecorator particles rather than atoms if you only care about the geometry. - look for easy ways of splitting the functionality into pieces. It generally makes sense, for %example, to split selection of the particles from the action taken on them, either by accepting a IMP::ParticleRefiner, or a IMP::SingletonContainer or just an arbitrary IMP::Particles object. You may want to read \ref design_example "the design example" for some suggestions on how to go about implementing your functionality in \imp. \section module Managing your own module When there is a significant group of new functionality, a new set of authors, or code that is dependent on a new external dependency, it is probably a good idea to put that code in its own module. To create a new module, run the \ref make_module "make_module" script from the main \imp source directory, passing the name of your new module. The module name should consist of lower case characters and numbers and the name should not start with a number. In addition the name "local" is special and is reserved to modules that are internal to code for handling a particular biological system or application. eg \code ./tools/make_module mymodule \endcode The script adds a number of example classes to the new module, which can be read and deleted. The next step is to update the information about the module stored in \c modules/mymodule/doc/SConscript. This includes the names of the authors and descriptions of what the module is supposed to do. If the module makes use of external libraries, you can add code to check for the external library to \c modules/mymodule/SConscript. Add a check for a library called \c libextern.so (or \c libextern.dylib on a mac) which is accessed using the header \c extern.h, using a function call \c call_extern(1.0) - make sure the \c SConscript includes \code import scons_tools.checks \endcode - then add the check with \code scons_tools.checks.add_external_library(env, "Extern", "extern", "extern.h", body="call_extern(1.0);") \endcode - if the library requires that other libraries be linked in also, you can add the arguments \code extra_libs=["first_required_library", "second_required_library"] That adds a check for the library. To use it in your module, you need to decide if it is required or not. If it is required, add the argument \code required_dependencies=["Extern"] \endcode to the \c env.IMPModuleBuild call. This will make it so that your module is only built if the library is found. If the library is optional, instead add \code optional_dependencies=["Extern"] \endcode When writing C++ code that depends on extern, surround it with ``` \#if defined(IMP_MODULENAME_USE_EXTERN) some code using extern \#endif ``` And when writing Python code check if \c IMP.mymodule.use_extern is \c True. Each module has an auto-generated header called \c modulename_config.h. This header contains basic definitions needed for the module and should be included (first) in each header file in the module. In addition, there is a header \c module_version.h which contains the version info as preprocessor symbols. This should not be included in module headers or cpp files as doing so will force frequent recompilations. \section testing Debugging and testing your code Ensuring that your code is correct can be very difficult, so \imp provides a number of tools to help you out. The first set are assert-style macros: - IMP_USAGE_CHECK() which should be used to check that arguments to functions and methods satisfy the preconditions. - IMP_INTERNAL_CHECK() which should be used to verify internal state and return values to make sure they satisfy pre and post-conditions. See \ref assert "Error reporting/checking" page for more details. As a general guideline, any improper usage to produce at least a warning all return values should be checked by such code. The second is logging macros such as: - IMP_LOG() which allows controlled display of messages about what the code is doing. See \ref log "logging" for more information. Finally, each module has a set of unit tests. The tests are located in the \c modules/modulename/test directory. These tests should try, as much as possible to provide independent verification of the correctness of the code. Any file in that directory or a subdirectory whose name matches `test_*.{py,cpp}`, `medium_test_*.{py,cpp}` or `expensive_test_*.{py,cpp}` is considered a test. Normal tests should run in at most a few seconds on a typical machine, medium tests in 10 seconds or so and expensive tests in a couple minutes. Some tests will require input files or temporary files. Input files should be placed in a directory called \c input in the \c test directory. The test script should then call \command{self.get_input_file_name(file_name)} to get the true path to the file. Likewise, appropriate names for temporary files should be found by calling \command{self.get_tmp_file_name(file_name)}. Temporary files will be located in \c build/tmp. The test should remove temporary files after using them. \section coverage Code coverage To assist in testing your code, we report the coverage of all \imp modules and applications as part of the \salilab{imp/nightly/results/,nightly builds}. Coverage is basically a report of which lines of code were executed by your tests; it is then straightforward to see which parts of the code have not been exercised by any test, so that you can write new tests to test those parts. (Of course, lines of code that are never executed have no guarantee of working correctly.) Both the C++ and Python code coverage is reported. For C++ code, only the lines of code that were exercised are reported; for Python code, which conditional branches were taken are also shown (for example, whether both branches from an 'if' statement are followed). Typically, coverage reflects the lines of code in a module or application that were exercised only by running its own tests, rather than the tests of the entire \imp package, and generally speaking you should try to test a module using its own tests. However, some modules (like the kernel) provide many abstract classes and so are partly tested by other modules (such as core), and in these special cases we collect coverage for all of the modules together (e.g. core module tests affect the reported coverage of the kernel). (Note that this cannot be done for all of \imp as running the tools takes too long.) If you have code that for some reason you wish to exclude from coverage, you can add specially formatted comments to the code. For Python code, \external{http://nedbatchelder.com/code/coverage/excluding.html,add a "pragma: no cover"} comment to the line to exclude. For C++ code, an individual line can be excluded by adding LCOV_EXCL_LINE somewhere on that line, or a block can be excluded by surrounding it with lines containing LCOV_EXCL_START and LCOV_EXCL_STOP. \section codingconventions Coding conventions Make sure you read the \ref conventions "API conventions" page first. To ensure code consistency and readability, certain conventions must be adhered to when writing code for \imp. Some of these conventions are automatically checked for by source control before allowing a new commit, and can also be checked yourself in new code by running \command{scons standards} \subsection indent Indentation All C++ headers and code should be indented in 'Linux' style, with 2-space indents. Do not use tabs. This is roughly the output of Artistic Style run like \command{astyle --convert-tabs --style=linux --indent=spaces=2 --unpad=paren --pad=oper}. Split lines if necessary to ensure that no line is longer than 80 characters. \b Rationale: Different users have different-sized windows or terminals, and different tab settings, but everybody can read 80 column output without tabs. All Python code should conform to the \external{http://www.python.org/dev/peps/pep-0008/, Python style guide}. In essence this translates to 4-space indents, no tabs, and similar class, method and variable naming to the C++ code. You can ensure that your Python code is correctly indented by using the \command{tools/reindent.py} script, available as part of the \imp distribution. \subsection names Names See the names section of the \ref conventions "IMP conventions" page. In addition, developers should be aware that - all preprocessor symbols (things created by `#define`) must begin with \c IMP and no \imp code should depend on preprocessor symbols which do not start with IMP. - names of files that implement a single class should be named for that class; for %example the SpecialVector class could be implemented in \c SpecialVector.h and \c SpecialVector.cpp - files that provide free functions or macros should be given names \c separated_by_underscores, for %example \c container_macros.h - Functions which take a parameter which has units should have the unit as part of the function name, for %example IMP::atom::SimulationParameters::set_maximum_time_step_in_femtoseconds(). Remember the Mars orbiter. The exception to this is distance and force numbers which should always be in angstroms and kcal/mol angstrom respectively unless otherwise stated. . \b Rationale: This makes it easier to tell between class names and function names where this is ambiguous (particularly an issue with the Python interface). The Python guys also mandate CamelCase for their class names, so this avoids any need to rename classes between C++ and Python to ensure clean Python code. Good naming is especially important with preprocessor symbols since these have file scope and so can change the meaning of other people's code. \subsection datastorage Passing and storing data - When a class or function takes a set of particles which are expected to be those of a particular type of decorator, it should take a list of decorators instead. eg IMP::core::transform() takes a IMP::core::XYZ. This makes it clearer what attributes the particle is required to have as well as allows functions to be overloaded (so there can be an IMP::core::transform() which takes IMP::core::RigidBody particles instead). - IMP::Restraint and IMP::ScoreState classes should generally use a IMP::SingletonContainer (or other type of Container) to store the set of IMP::Particle objects that they act on. - Store collections of IMP::Object-derived objects of type \c Name using a \c Names. Declare functions that accept them to take a \c NamesTemp (\c Names is a \c NamesTemp). \c Names are reference counted (see IMP::RefCounted for details), \c NamesTemp are not. Store collections of particles using a \c Particles object, rather than decorators. \subsection display Display All classes must have a \c show method which takes an optional \c std::ostream and prints information about the object (see IMP::Object::show() for an example). The helper macros, such as IMP_RESTRAINT() define such a method. In addition they must have \c operator<< defined. This can be easily done using the IMP_OUTPUT_OPERATOR() macro once the show method is defined. Note that \c operator<< writes human readable information. Add a \c write method if you want to provide output that can be read back in. \subsection errors Errors Classes and methods should use \imp exceptions to report errors. See IMP::Exception for a list of existing exceptions. See \ref assert "a list of functions to aid in error reporting and detection". \subsection internal_ns Namespaces Use the provided \c IMPMODULE_BEGIN_NAMESPACE, \c IMPMODULE_END_NAMESPACE, \c IMPMODULE_BEGIN_INTERNAL_NAMESPACE and \c IMPMODULE_END_INTERNAL_NAMESPACE macros to put declarations in a namespace appropriate for module \c MODULE. Each module has an internal namespace, \c module_name::internal and an internal include directory \c modulename/internal. Any function which is - not intended to be part of the API, - not documented, - liable to change without notice, - or not tested should be declared in an internal header and placed in the internal namespace. The functionality in such internal headers is - not exported to Python - and not part of of documented API As a result, such functions do not need to obey all the coding conventions (but we recommend that they do). \section docs Documenting your code \imp is documented using \doxygen. See \external{http://www.doxygen.nl/docblocks.html, documenting source code with doxygen} to get started. We use \c //! and \c /** ... * / blocks for documentation. Python code should provide Python doc strings. All headers not in internal directories are parsed through \doxygen. Any function that you do not want documented (for %example, because it is not well tested), hide by surrounding with ``` \#ifndef IMP_DOXYGEN void messy_poorly_thought_out_function(); \#endif ``` We provide a number of extra Doxygen commands to aid in producing nice \imp documentation. The commands are used by writing \c \\commandname{args} or \c \\commandname if there are no arguments. - When you want to specify some command-line command do \verbatim \command{the command text}\endverbatim which produces \command{the command text} - To produce a link to a page on the Sali lab web site do \verbatim \salilab{imp, the IMP project}\endverbatim which produces \salilab{imp, the IMP project} - To produce a link to the outside world do \verbatim \external{http://boost.org, Boost}\endverbatim produces \external{http://boost.org, Boost} - When writing the name \imp do \verbatim \imp\endverbatim so that no link is produced (\imp as opposed to IMP). - Sections of documentation that are only for people developing \imp code should be marked with \verbatim \advanceddoc You can tweak this class in various ways in order to optimize its performance. \endverbatim Similarly advanced methods should be marked with \verbatim \advancedmethod\endverbatim To produce \advancedmethod. - General warning messages can be produced using \verbatim \warning Be afraid, be very afraid.\endverbatim which produces \warning Be afraid, be very afraid. - To mark that some part of the API has not yet been well planned at may change using \c \\unstable{Classname}. The documentation will include a disclaimer and the class or function will be added to a list of unstable classes. It is better to simply hide such things from \doxygen. - To mark a method as not having been well tested yet, use \c \\untested{Classname}. - To mark a method as not having been implemented, use \c \\untested{Classname}. - To note that a class supports comparisons (eg <, >, ==, != etc) use \c \\comparable and then hide the comparison functions from \doxygen (there are a lot of them and they aren't very interesting). \section scoring Scoring Restraints take the current conformation of the particles and return a score and, if requested, add to the derivatives of each of the particles used. Evaluation can be done each of two ways - whole model, where each of the particles is assumed to have changed using IMP::core::RestraintsScoringFunction and - incremental, where only a few of the particles are assumed to have changed using IMP::core::IncrementalScoringFunction In whole model evaluation, each restraint is called one at a time and given a change to computes its score based on the current conformation of the particles and adds to each particles derivatives. That is, if \f$R(P_i)\f$ is the score of the restraint on particle conformation \f$i\f$ and \f$R'(P_i)\f$ and there are no other restraints:
Stage | Score for R | Particle attribute | Particle derivative |
before model evaluation | undefined | \f$P_0\f$ | undefined |
before restraint evaluation | 0 | \f$P_0\f$ | 0 |
after restraint evaluation | \f$R(P_0)\f$ | \f$P_0\f$ | \f$R'(P_0)\f$ |