Difference between revisions of "Tutorial for adding relaxation dispersion models to relax"

Revision as of 17:11, 18 November 2015

Tip See the mail archive for the original post.

The following is a tutorial for adding new relaxation dispersion models for either CPMG-type or R1rho-type experiments to the software relax. This includes both the models based on the analytic, closed-form expressions as well as the models involving numeric solutions of the Bloch-McConnell equations.

The tutorial will follow the example of the addition of the models already present within relax, pointing to the relevant commits for reference. To see the commit message and the code changes in colour, click on the links found within these commit messages. The models in the reference commits sections are in reverse chronological order and therefore the top links will be the most recent and relevant.

The test suite

This step is normally performed first. This is the most important part that makes sure that the code not only works now, but will continue working for the entire lifetime of the relax project.

The idea is that real or synthetic data, for example as Sparky peak lists, is obtained or created for the model and added to the test suite directory test_suite/shared_data/dispersion/. This is then used in a system test to check that the code in relax can consistently reproduce the results.

Synthetic data

It is very important that the code added to the relax library is not used to create the synthetic data! This type of data is useful for checking that the known solution can be found by relax. The only issue is that the same mistake can be made in both relax and the script used to generated the synthetic data, in which case the buggy relax code will never be detected. To mitigate against this, testing against other software is recommended.

Measured data

An alternative is to use real measured relaxation dispersion data. This data should be added as peak lists containing peak intensities to test_suite/shared_data/dispersion/. As the real solution cannot be known a priori, the results from relax must be compared to results obtained from another software program (possibly directly from a publication). The steps required for using such data are:

Create a new directory name for the test data.

Add the original full peak lists to the directory.

Make truncated versions of these files (ending in _trunc.*) and add these as well. These will be used for the system test instead of the full data to allow the test to finish in a reasonable amount of time.

Add a script which performs the full analysis in relax for the model. Also a script which performs the analysis using only the R2eff model. See the test_suite/shared_data/dispersion/Hansen/*.py scripts for reference - these scripts should be copied to your data directory and modified (using the svn cp command). Once the scripts are functional, they can be copied and modified for the truncated data (again using the svn cp command).

Copy the full analysis script to test_suite/system_tests/scripts/relax_disp/ with an appropriate name (always using the svn cp command). This can then be used in a new system test. Better still, the final save file from the r2eff_calc.py script for the truncated data can be used to start the script. This is again to save a lot of computation time in the test. See the test_tp02_data_to_ns_r1rho_2site() system test in the test_suite/system_tests/relax_disp.py file for a template.

If you are not a relax developer, a support request can be submitted. You can attach files and add comments to that request for a relax developer to make the changes for you.

Reference commits

The MQ CR72 model at r21122
The MQ NS CPMG 2-site model at r21018 r21019 r21020 r21024 r21028 r21029 r21030 r21031
The TSMFK01 model at r20782
The NS R1rho 2-site model at r20729 r20739 r20732
The TP02 model at r20500 r20538 r20541 r20537
The M61 model at r19891 r19892 r19893 r19906 r19907

Creating a new experiment type

If the model being added is for a completely new data type, then support for this must be added. In almost all cases, the experiment type will already be supported.

Reference commits

The MQ NS CPMG 2-site model at r21058 r21060 r21061 r21063 r21064

Adding the model to the list

Firstly the model should be added to the lists of the specific_analyses.relax_disp.variables module. The model name is stored in a special variable which will be used throughout relax.

Reference commits

The MQ CR72 model at r21123
The MQ NS CPMG 2-site model at r21032
The NS R1rho 2-site model at r20721
The TP02 model at r20483
The DPL94 model at r19997
The M61 skew model at r19968
The M61 model at r19855
The No Rex model at r19836

The relax_disp.select_model user function front end

The next step is to add the model, its description, the equations for the analytic models, and all references to the relax_disp.select_model user function front end.

Reference commits

The MQ CR72 model at r21124
The MQ NS CPMG 2-site model at r21035
The NS R1rho 2-site model at r20722
The TP02 model at r20484
The DPL94 model at r19998
The M61 skew model at r19969
The M61 model at r19856
The No Rex model at r19836
The CR72 model at r19812

The relax_disp.select_model user function back end

Now the back end of the relax_disp.select_model user function for the model can be added. This involved identifying the model and constructing the parameter list.

Reference commits

The MQ CR72 model at r21125
The MQ NS CPMG 2-site model at r21055
The NS R1rho 2-site model at r20726
The TP02 model at r20493
The DPL94 model at r20002
The M61 skew model at r19976
The M61 model at r19866
The No Rex model at r19836

Adding support for the parameters

This is needed to enable the model. It involves modifying many of the modules in the specific_analyses.relax_disp package.

Reference commits

The MQ NS CPMG 2-site model at r21056
The CR72 model at r19817

The target function

The target function is used in optimisation and is a class method which takes as a single argument the parameter vector. This list is changed by the minimisation algorithm during optimisation. The target function should then return a single floating point number - the chi-squared value.

Again in this example, the code for the M61 is copied from the LM63 model and then modified.

Reference commits

The MQ CR72 model at r21126
The MQ NS CPMG 2-site model at r21067
The NS R1rho 2-site model at r20725
The TP02 model at r20492
The DPL94 model at r20001
The M61 skew model at r19974
The M61 model at r19860 r19904 r19905
The No Rex model at r19836
The CR72 model at r19815

The relax library

Now the dispersion function needs to be added to the relax library (in the lib.relax_disp package). This should be designed as a simple Python function which takes the dispersion parameters and experimental variables, and calculates the R_2,eff/R_1ρ values. The module can contain auxiliary functions for the calculation. Some auxiliary functions, if not specific to relaxation dispersion, may be better placed in other locations within the relax library. Remember to add all new modules to the lib.relax_disp.__init__ package __all__ list.

The relaxation dispersion functions in the library currently take as an argument a data structure for the back-calculated R_2,eff/R_1ρ values and populate this structure. This design is not essential if the target function, described in the next point, handles the library function appropriately. Just look at the files in lib/dispersion to get an idea of the design used.

The dispersion code in the relax library must be robust. This involves identifying parameter values or combinations which would cause failures in the mathematical operations (numerical issues not present in the mathematics must be considered). Note that parameter values of 0 are common within a grid search. It should be decided if the R_2,eff/R_1ρ value should be set to zero, to another value, or to something large (e.g. 1e100). For example:

Divisions - always catch zeros in the denominator with if statements, even if you believe that this will never be encountered. Square roots - make sure that the value inside is always > 0. Trigonometric functions - these should be tested for where they are not defined or where the software implementation can no longer handle certain values. For example try cosh(1000) in Python.

Reference commits

The MQ CR72 model at r21127
The MQ NS CPMG 2-site model at r21068
The NS R1rho 2-site model at r20723
The TP02 model at r20486 r20490 r20524
The DPL94 model at r20000
The M61 skew model at r19973
The M61 model at r19859
The CR72 model at r19814 r19816 r19819 r19833

Comparing to other software

It can happen that a bug present in the lib.dispersion package code is also replicated in the synthetic data. This is not uncommon. Therefore it is very useful to use other software with the test data from the test-suite step to see if the original parameters can be found. A good example can be seen in the test_suite/shared_data/dispersion/Hansen which contains Dr. Flemming Hansen's CPMG data (see the README file) and the results from different programs including NESSY, relax, CPMGFit, and ShereKhan. The comparison is in the file 'software_comparison'.

Once the relax code is able to find identical or better results than the dispersion softwares, then the values found in the test suite optimisation can be locked in. The assertEqual() and assertAlmostEqual() methods can be used to only allow the test to pass when the correct values are found.

Debugging

This step should not require an explanation. It goes hand-in-hand with the test suite and the comparison to other software.

The auto-analysis

The model variable in specific_analyses.relax_disp.variables needs to be imported into the auto_analyses.relax_disp module. This is then used in the write_results() method to output text files and Grace plots of the parameters. Be sure that the model variable is added to each part of this method corresponding to the parameters of the model.

Reference commits

The MQ CR72 model at r21128
The MQ NS CPMG 2-site model at r21072
The TP02 model at r20771
The NS R1rho 2-site model at r20770
The LM63 3-site model at r20427 r20433

The GUI

The model needs to also be added to the graphical user interface (GUI). This is in the gui.analyses.auto_relax_disp module. The model variable should first be imported. In the __init__() method, it should be decided if the model should be selected by default or if the user should manually select the model during the analysis. If the former, then it should be added to the ds.relax_gui.analyses[data_index].disp_models list.

For the model to be accessible via the GUI, it must be added to the Disp_model_list_cpmg or Disp_model_list_r1rho model list classes (at the bottom of the module). The model variable should be added to the models list, and the list of parameters added to the params list.

Reference commits

The MQ CR72 model at r21129
The MQ NS CPMG 2-site model at r21073
The NS R1rho 2-site model at r20755
The TP02 model at r20755
The LM63 3-site model at r20456
The IT99 model at r20006
The DPL94 model at r20006
The M61B model at r20006
The M61 model at r19861
The R2eff model at r19861
The No Rex model at r19836 r19861

The relax manual

The next step is to add the model, its description, the equations for the analytic models, and all references to the relaxation dispersion chapter of the relax manual (the source is the docs/latex/dispersion.tex file). The model could also be included in the script section of the chapter.

Reference commits

The MQ CR72 model at r21144
The MQ NS CPMG 2-site model at r21077
The NS R1rho 2-site model at r20727
The TP02 model at r20485 r20491 r20540
The LM63 3-site model at r20409
The NS CPMG 2-site expanded model at r20366
The NS CPMG 2-site star model at r20315
The M61 model at r20539
The CR72 model at r20321

The sample scripts

If the added model is to be presented to the user, it should also be added to the sample scripts. This includes all scripts in the sample_scripts/relax_disp/ directory. For example it can be included in the MODELS list in the cpmg_analysis.py script.

Reference commits

The NS CPMG 2-site expanded model at r20488

@@ Line 131: / Line 131: @@
 == The relax library ==
-Now the dispersion function needs to be added to the relax library (in the lib.relax_disp package).  This should be designed as a simple Python function which takes the dispersion parameters and experimental variables, and calculates the R2eff/R1rho values.  The module can contain auxiliary functions for the calculation.  Some auxiliary functions, if not specific to relaxation dispersion, may be better placed in other locations within the relax library.  Remember to add all new modules to the lib.relax_disp.__init__ package __all__ list.
+Now the dispersion function needs to be added to the relax library (in the <code>lib.relax_disp</code> package).  This should be designed as a simple Python function which takes the dispersion parameters and experimental variables, and calculates the {{:R2eff}}/{{:R1rho}} values.  The module can contain auxiliary functions for the calculation.  Some auxiliary functions, if not specific to relaxation dispersion, may be better placed in other locations within the relax library.  Remember to add all new modules to the <code>lib.relax_disp.__init__</code> package <code>__all__</code> list.
-The relaxation dispersion functions in the library currently take as an argument a data structure for the back-calculated R2eff/R1rho values and populate this structure.  This design is not essential if the target function, described in the next point, handles the library function appropriately.  Just look at the files in lib/dispersion to get an idea of the design used.
+The relaxation dispersion functions in the library currently take as an argument a data structure for the back-calculated {{:R2eff}}/{{:R1rho}} values and populate this structure.  This design is not essential if the target function, described in the next point, handles the library function appropriately.  Just look at the files in <code>lib/dispersion</code> to get an idea of the design used.
-The dispersion code in the relax library must be robust.  This involves identifying parameter values or combinations which would cause failures in the mathematical operations (numerical issues not present in the mathematics must be considered).  Note that parameter values of 0 are common within a grid search.  It should be decided if the R2eff/R1rho value should be set to zero, to another value, or to something large (e.g. 1e100).  For example:
+The dispersion code in the relax library must be robust.  This involves identifying parameter values or combinations which would cause failures in the mathematical operations (numerical issues not present in the mathematics must be considered).  Note that parameter values of 0 are common within a grid search.  It should be decided if the {{:R2eff}}/{{:R1rho}} value should be set to zero, to another value, or to something large (e.g. 1e100).  For example:
-Divisions - always catch zeros in the denominator with if statements, even if you believe that this will never be encountered.  Square roots - make sure that the value inside is always > 0.  Trigonometric functions - these should be tested for where they are not defined or where the software implementation can no longer handle certain values.  For example try cosh(1000) in Python.
+Divisions - always catch zeros in the denominator with if statements, even if you believe that this will never be encountered.  Square roots - make sure that the value inside is always > 0.  Trigonometric functions - these should be tested for where they are not defined or where the software implementation can no longer handle certain values.  For example try <code>cosh(1000)</code> in Python.
 === Reference commits ===
@@ Line 149: / Line 149: @@
 * The [[M61]] model at [http://article.gmane.org/gmane.science.nmr.relax.scm/17615 r19859]
 * The [[CR72]] model at [http://article.gmane.org/gmane.science.nmr.relax.scm/17570 r19814] [http://article.gmane.org/gmane.science.nmr.relax.scm/17572 r19816] [http://article.gmane.org/gmane.science.nmr.relax.scm/17575 r19819] [http://article.gmane.org/gmane.science.nmr.relax.scm/17589 r19833]
 == Comparing to other software ==

Difference between revisions of "Tutorial for adding relaxation dispersion models to relax"

Revision as of 17:11, 18 November 2015

Contents

The test suite

Synthetic data

Measured data

Reference commits

Creating a new experiment type

Reference commits

Adding the model to the list

Reference commits

The relax_disp.select_model user function front end

Reference commits

The relax_disp.select_model user function back end

Reference commits

Adding support for the parameters

Reference commits

The target function

Reference commits

The relax library

Reference commits

Comparing to other software

Debugging

The auto-analysis

Reference commits

The GUI

Reference commits

The relax manual

Reference commits

The sample scripts

Reference commits

See also

Navigation menu

Search