Background

Script inspiration

model-free : Script inspiration for setup and analysis

The distribution of relax includes a folder sample_scripts/model_free which contains scripts for analysis.

It can be seen here: https://github.com/nmr-relax/relax/tree/master/sample_scripts/model_free

Here is the current list

  • back_calculate.py Back-calculate and save relaxation data starting from a saved model-free results file.
  • bmrb_deposition.py Script for creating an NMR-STAR 3.1 formatted file for BMRB deposition of model-free results.
  • cv.py Script for model-free analysis using cross-validation model selection.
  • dasha.py Script for model-free analysis using the program Dasha.
  • dauvergne_protocol.py Script for black-box model-free analysis.
  • diff_min.py Demonstration script for diffusion tensor optimisation in a model-free analysis.
  • final_data_extraction.py Extract Data to Table
  • generate_ri.py Script for back-calculating the relaxation data.
  • grace_S2_vs_te.py Script for creating a grace plot of the simulated order parameters vs. simulated correlation times.
  • grace_ri_data_correlation.py Script for creating correlation plots of experimental versus back-calculated relaxation data.
  • map.py Script for mapping the model-free space for OpenDX visualisation.
  • mf_multimodel.py This script performs a model-free analysis for the models 'm0' to 'm9' (or 'tm0' to 'tm9').
  • modsel.py Script for model-free model selection.
  • molmol_plot.py Script for generating Molmol macros for highlighting model-free motions
  • palmer.py Script for model-free analysis using Art Palmer's program 'Modelfree4'. Download from http://comdnmr.nysbc.org/comd-nmr-dissem/comd-nmr-software
  • remap.py Script for mapping the model-free space.
  • single_model.py This script performs a model-free analysis for the single model 'm4'.
  • table_csv.py Script for converting the model-free results into a CSV table.
  • table_latex.py Script for converting the model-free results into a LaTeX table.

Other script inspiration for checking

The distribution of relax includes a folder sample_scripts/ which contains scripts for analysis.

It can be seen here: https://github.com/nmr-relax/relax/tree/master/sample_scripts

R1 / R2 Calculation

The resultant plot is useful for finding bad points or bad spectra when fitting exponential curves to determine the R1 and R2 relaxation rates. If the averages deviate systematically from zero, bias in the spectra or fitting will be clearly revealed. To use this script, R1 or R2 exponential curve fitting must previously have been carried out and the program state saved to the file 'rx.save' (either with or without the .gz or .bz2 extension). The file name of the saved state can be changed at the top of this script.

NOE calculation

  • noe.py Script for calculating NOEs.

Test data

Severe artifacts can be introduced if model-free analysis is performed from inconsistent multiple magnetic field datasets. The use of simple tests as validation tools for the consistency assessment can help avoid such problems in order to extract more reliable information from spin relaxation experiments. In particular, these tests are useful for detecting inconsistencies arising from R2 data. Since such inconsistencies can yield artifactual Rex parameters within model-free analysis, these tests should be used routinely prior to any analysis such as model-free calculations. This script will allow one to calculate values for the three consistency tests J(0), F_eta and F_R2. Once this is done, qualitative analysis can be performed by comparing values obtained at different magnetic fields. Correlation plots and histograms are useful tools for such comparison, such as presented in Morin & Gagne (2009a) J. Biomol. NMR, 45: 361-372.

Other representations

  • angles.py Script for calculating the protein NH bond vector angles with respect to the diffusion tensor.
  • xh_vector_dist.py Script for creating a PDB representation of the distribution of XH bond vectors.
  • diff_tensor_pdb.py Script for creating a PDB representation of the Brownian rotational diffusion tensor.

Scripts - Part 2

We now try to set things up a little more efficiently.

relax is able to read previous results files, so let us divide the task up into:

  • 1: Load the data and save as state file. Inspect in GUI before running.
  • 2: Run the Model 1: local_tm.
  • 3: Here make 4 scripts. Each of them only depends on Model 1:
    • Model 2: sphere
    • Model 3: prolate
    • Model 4: oblate
    • Model 5: ellipsoid
  • 4: Make an intermediate 'final' model script. This will automatically detect files from above.

Prepare data

We make a new folder and try.

See commands
mkdir 20171010_model_free_2_HADDOCK
cp 20171010_model_free/*.dat 20171010_model_free_2_HADDOCK
cp 20171010_model_free/*.pdb 20171010_model_free_2_HADDOCK

# Get scripts
cd 20171010_model_free_2_HADDOCK
git init
git remote add origin git@github.com:tlinnet/relax_modelfree_scripts.git
git fetch
git checkout -t origin/master

And a new one, changing the NOE error

See commands
mkdir 20171010_model_free_3_HADDOCK
cp 20171010_model_free/*.dat 20171010_model_free_3_HADDOCK
cp 20171010_model_free/*.pdb 20171010_model_free_3_HADDOCK

# Get scripts
cd 20171010_model_free_3_HADDOCK
git init
git remote add origin git@github.com:tlinnet/relax_modelfree_scripts.git
git fetch
git checkout -t origin/master

# Change NOE error
sed -i 's/0\.1$/0.05/' NOE_600MHz_new.dat
sed -i 's/0\.1$/0.05/' NOE_750MHz.dat

And a new one, changing the NOE error and deselecting the N-terminal.
The consistency test found that this stretch contained outliers.

See commands
mkdir 20171010_model_free_4_HADDOCK
cp 20171010_model_free/*.dat 20171010_model_free_4_HADDOCK
cp 20171010_model_free/*.pdb 20171010_model_free_4_HADDOCK

# Get scripts
cd 20171010_model_free_4_HADDOCK
git init
git remote add origin git@github.com:tlinnet/relax_modelfree_scripts.git
git fetch
git checkout -t origin/master

# Change NOE error
sed -i 's/0\.1$/0.05/' NOE_600MHz_new.dat
sed -i 's/0\.1$/0.05/' NOE_750MHz.dat

# Make deselection
echo "#" > deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t151" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t152" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t153" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t154" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t155" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t156" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t157" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t158" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t159" >> deselect.txt

And a new one, changing the NOE error, and deselecting spins found from consistency test.

See commands
mkdir 20171010_model_free_5_HADDOCK
cp 20171010_model_free/*.dat 20171010_model_free_5_HADDOCK
cp 20171010_model_free/*.pdb 20171010_model_free_5_HADDOCK

# Get scripts
cd 20171010_model_free_5_HADDOCK
git init
git remote add origin git@github.com:tlinnet/relax_modelfree_scripts.git
git fetch
git checkout -t origin/master

# Change NOE error
sed -i 's/0\.1$/0.05/' NOE_600MHz_new.dat
sed -i 's/0\.1$/0.05/' NOE_750MHz.dat

# Make deselection
echo "#" > deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t158" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t157" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t17" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t159" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t120" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t59" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t98" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t49" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t76" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t155" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t156" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t48" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t154" >> deselect.txt

And a new one, without changing the NOE error, and deselecting spins found from consistency test.

See commands
mkdir 20171010_model_free_6_HADDOCK
cp 20171010_model_free/*.dat 20171010_model_free_6_HADDOCK
cp 20171010_model_free/*.pdb 20171010_model_free_6_HADDOCK

# Get scripts
cd 20171010_model_free_6_HADDOCK
git init
git remote add origin git@github.com:tlinnet/relax_modelfree_scripts.git
git fetch
git checkout -t origin/master

# Make deselection
echo "#" > deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t158" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t157" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t17" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t159" >> deselect.txt

cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t59" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t98" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t76" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t155" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t156" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t120" >> deselect.txt

cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t49" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t48" >> deselect.txt
cat R1_600MHz_new_model_free.dat | grep -P "ArcCALD\t154" >> deselect.txt
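The grep patterns used for the deselection match on a prefix, so a pattern such as "ArcCALD\t17" would also pick up residues 170 to 179 if the molecule has them. A minimal Python sketch of the same selection with exact matching could look like this. It assumes the data files are tab-separated with the molecule name in the first column and the residue number in the second (inferred from the grep patterns, not verified), and the function names are hypothetical.

```python
def select_rows(lines, mol_name, residues):
    """Keep rows whose molecule name and residue number match exactly."""
    wanted = {str(res) for res in residues}
    selected = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Assumed layout: column 0 = molecule name, column 1 = residue number.
        if len(cols) >= 2 and cols[0] == mol_name and cols[1] in wanted:
            selected.append(line.rstrip("\n"))
    return selected


def write_deselect(data_file, out_file, mol_name, residues):
    """Write a '#' header line followed by the selected rows."""
    with open(data_file) as handle:
        rows = select_rows(handle, mol_name, residues)
    with open(out_file, "w") as handle:
        handle.write("#\n")
        for row in rows:
            handle.write(row + "\n")
    return len(rows)
```

It could be called as write_deselect("R1_600MHz_new_model_free.dat", "deselect.txt", "ArcCALD", [158, 157, 17]). Unlike the grep commands, the rows are written in the order they occur in the data file, not in the order of the residue list.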

11_read_data_GUI_inspect.py - Read data GUI inspect

This will read the data and save as a state.

The GUI can be a good place to inspect the setup and files.

See content of: 11_read_data_GUI_inspect.py

Run with

relax 11_read_data_GUI_inspect.py -t 11_read_data_GUI_inspect.log

To check in GUI

  • relax -g
  • File -> Open relax state
  • In folder "result_10" open "result_10_ini.bz2"
  • View -> Data pipe editor
  • Right click on pipe, and select "Associate with a new auto-analysis"

11_test_consistency.py - Consistency test of our data

Before running the analysis, it is wise to run a script for consistency testing.

See here:

Highlights:

  • Comparing results obtained at different magnetic fields should, in the case of perfect consistency and assuming the absence of conformational exchange, yield equal values independently of the magnetic field.
  • Consistency testing avoids the potential extraction of erroneous information, as well as the waste of time associated with dissecting inconsistent datasets using numerous long model-free minimisations with different subsets of data.
  • The authors prefer the use of the spectral density at zero frequency J(0) alone since it relies neither on an estimation of the global correlation time tc/tm nor on a measure of theta, the angle between the 15N–1H vector and the principal axis of the 15N chemical shift tensor. Hence, J(0) is less likely to be affected by incorrect parameterisation of input parameters.
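The idea that J(0) should come out equal at different fields can be illustrated with a small comparison. This is only a sketch: the function name, the dictionary layout (residue number to J(0) value) and the 10% tolerance are assumptions made for illustration, and the numbers in any call are up to the user. It is not how the relax consistency script itself reports results.

```python
def flag_inconsistent(j0_field_a, j0_field_b, tolerance=0.1):
    """Return (residue, ratio) pairs whose J(0) ratio deviates from 1.

    Both arguments map residue numbers to J(0) values at one field.
    Only residues present at both fields are compared.
    """
    flagged = []
    for res in sorted(set(j0_field_a) & set(j0_field_b)):
        ratio = j0_field_b[res] / j0_field_a[res]
        # Perfect consistency without exchange would give a ratio of 1.
        if abs(ratio - 1.0) > tolerance:
            flagged.append((res, ratio))
    return flagged
```

Residues flagged this way are candidates for closer inspection in the correlation plots, not an automatic deselection list.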

See content of: 11_test_consistency.py

relax 11_test_consistency.py -t 11_test_consistency.py.log

# Afterwards, go into the folder and plot the data.
python plot_txt_files.py
./grace2images.py

12_Model_1_I_local_tm.py - Only run local_tm

Now we only run Model 1.

  • DIFF_MODEL = ['local_tm']
  • GRID_INC = 11 # This is the standard
  • MC_NUM = 0 # This has no influence in Model 1-5
  • MAX_ITER = 20 # Stop if it has not converged in 20 rounds

Normally between 8 and 15 rounds of optimisation are required for the proper execution of this script.
This can also be seen here in Figure 2.

relax should stop the calculation if a model has not converged within MAX_ITER rounds.
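The stopping rule can be pictured as a loop that repeats optimisation rounds until two successive rounds agree, or until the iteration limit is reached. The sketch below is not relax code: optimise_round is a hypothetical callable standing in for one round, and exact equality of chi-squared, selected models and parameters is used for simplicity (a tolerance could be substituted).

```python
def has_converged(prev, curr):
    """Compare two rounds: chi-squared, selected models and parameters."""
    return (
        prev["chi2"] == curr["chi2"]
        and prev["models"] == curr["models"]
        and prev["params"] == curr["params"]
    )


def run_rounds(optimise_round, max_iter=20):
    """Call optimise_round(i) until two successive rounds agree.

    Returns (last_round_number, converged).
    """
    prev = None
    for i in range(1, max_iter + 1):
        curr = optimise_round(i)
        if prev is not None and has_converged(prev, curr):
            return i, True
        prev = curr
    # The limit was reached without two identical successive rounds.
    return max_iter, False
```

With max_iter=20 this mirrors the MAX_ITER setting: a run that is still changing after 20 rounds is reported as not converged instead of looping forever.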

See content of: 12_Model_1_I_local_tm.py

We use tmux to make a terminal session that we can get back to if our own terminal connection gets closed.

Run with

# Make terminal-session
tmux new -s m1

relax 12_Model_1_I_local_tm.py -t 12_Model_1_I_local_tm.log

# or
tmux new -s m1_multi
mpirun -np 12 relax --multi='mpi4py' 12_Model_1_I_local_tm.py -t 12_Model_1_I_local_tm.log

You can then in another terminal follow the logfile by

less +F 12_Model_1_I_local_tm.log
  • To stop following and scroll up and down, use keyboard: Ctrl+c
  • To return to follow mode, use keyboard: Shift+f
  • To exit, use keyboard: Ctrl+c and then: q

13_Model_2-5 - Run Model 2 to 5

When Model 1 is completed, then make 4 terminal windows and run them at the same time.

These scripts do:

  • Read the state file from before with setup
  • Change DIFF_MODEL accordingly

13_Model_2_II_sphere.py

tmux new -s m2
relax 13_Model_2_II_sphere.py -t 13_Model_2_II_sphere.log
# Or
mpirun -np 4 relax --multi='mpi4py' 13_Model_2_II_sphere.py -t 13_Model_2_II_sphere.log

# When relax is running, push: Ctrl+b and then d, to disconnect without exit

13_Model_3_III_prolate.py

tmux new -s m3
relax 13_Model_3_III_prolate.py -t 13_Model_3_III_prolate.log
# Or
mpirun -np 4 relax --multi='mpi4py' 13_Model_3_III_prolate.py -t 13_Model_3_III_prolate.log

13_Model_4_IV_oblate.py

tmux new -s m4
relax 13_Model_4_IV_oblate.py -t 13_Model_4_IV_oblate.log
# Or
mpirun -np 4 relax --multi='mpi4py' 13_Model_4_IV_oblate.py -t 13_Model_4_IV_oblate.log

13_Model_5_V_ellipsoid.py

tmux new -s m5
relax 13_Model_5_V_ellipsoid.py -t 13_Model_5_V_ellipsoid.log
# Or
mpirun -np 4 relax --multi='mpi4py' 13_Model_5_V_ellipsoid.py -t 13_Model_5_V_ellipsoid.log
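Instead of opening four terminal windows by hand, the four command lines can be generated from one table. This is a sketch under assumptions: it starts each tmux session detached with "tmux new-session -d", which differs from the interactive sessions used on this page, and the helper name is hypothetical. It only builds the argument lists; nothing is started until they are passed to subprocess.run.

```python
import shlex

# Session name -> script base name, as used on this page.
MODELS = {
    "m2": "13_Model_2_II_sphere",
    "m3": "13_Model_3_III_prolate",
    "m4": "13_Model_4_IV_oblate",
    "m5": "13_Model_5_V_ellipsoid",
}


def tmux_command(session, script, n_proc=None):
    """Build a detached tmux command that runs one relax script."""
    relax = ["relax", f"{script}.py", "-t", f"{script}.log"]
    if n_proc:
        # Multi-processor variant, equivalent to the mpirun lines above.
        relax = ["mpirun", "-np", str(n_proc), "relax", "--multi=mpi4py"] + relax[1:]
    return ["tmux", "new-session", "-d", "-s", session, shlex.join(relax)]


if __name__ == "__main__":
    for session, script in MODELS.items():
        print(shlex.join(tmux_command(session, script, n_proc=4)))
```

To actually launch a session, something like subprocess.run(tmux_command("m2", MODELS["m2"]), check=True) could be used once Model 1 has finished.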

To join session

# List
tmux list-s

# Join either
tmux a -t m1
tmux a -t m2
tmux a -t m3
tmux a -t m4
tmux a -t m5

14_intermediate_final.py - Inspection during model optimization

During running of model 2-5, the current results can be inspected with this nifty script.

The script will ask for input of MC numbers. So just run it.

14_intermediate_final.py

tmux new -s final
relax 14_intermediate_final.py -t 14_intermediate_final.log

This does:

  • Option: Collect current best result from Model 2-5, and make MC simulations, and finalize to get current results files
  • Make a pymol file, that collects all of relax pymol command files into 1 pymol session
  • Option: Collect all chi2 and number of params k, for each iteration per model
    • Make a python plot file for plotting these results

Per iteration get: chi2, k, tm

Afterwards, plot the data.

python results_collected.py

Pymol macro

You also get a pymol folder.

See here for info how the macro is applied

Run with

pymol 0_0_apply_all_pymol_commands.pml

To run on Haddock

Have a look here for how to get a standalone Python (Anaconda) on Linux. Also have a look here for OpenMPI.

# SSH in
ssh haddock

# Test with shell
mpirun -np 6 echo "hello world"

# Test with python
mpirun -np 6 python -m mpi4py helloworld

# Test with relax
mpirun -np 6 relax --multi='mpi4py'
# Look for: Processor fabric:  MPI 2.2 running via mpi4py with 5 slave processors & 1 master.  Using MPICH2 1.4.1.

Now we run 04_run_default_with_tolerance_lim.py with more power!
We use tmux to make a terminal session that we can get back to if our own terminal connection gets closed.

  • start a new session: tmux
  • re-attach a detached session: tmux attach
# Make terminal-session
tmux

# Start relax
mpirun -np 20 relax --multi='mpi4py' 04_run_default_with_tolerance_lim.py -t 04_run_default_with_tolerance_lim.log

Useful commands for the log file

While the analysis is running, these commands could be used to check the logfile for errors

### Check convergence 
# For chi2
cat 04_run_default_with_tolerance_lim.log | grep -A 10 "Chi-squared test:"

# For other tests
cat 04_run_default_with_tolerance_lim.log | grep -A 10 "Identical "
cat 04_run_default_with_tolerance_lim.log | grep -A 10 "Identical model-free models test:"
cat 04_run_default_with_tolerance_lim.log | grep -A 10 "Identical diffusion tensor parameter test:"
cat 04_run_default_with_tolerance_lim.log | grep -A 10 "Identical model-free parameter test:"

# To look for not converged errors
# For chi2
cat 04_run_default_with_tolerance_lim.log | grep -B 7 "The chi-squared value has not converged."

# For other tests
cat 04_run_default_with_tolerance_lim.log | grep -B 7 " have not converged."
cat 04_run_default_with_tolerance_lim.log | grep -B 7 "The model-free models have not converged."
cat 04_run_default_with_tolerance_lim.log | grep -B 7 "The diffusion parameters have not converged."
cat 04_run_default_with_tolerance_lim.log | grep -B 7 "The model-free parameters have not converged."
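The same check can be done from Python, which is convenient when the result should be counted or post-processed. This minimal sketch relies only on the fact that the messages searched for above all contain the text "not converged"; the function name is hypothetical.

```python
def find_unconverged(lines, before=7):
    """Return (line_number, context_lines) for each 'not converged' message.

    The context holds up to `before` preceding lines plus the message line,
    similar to what grep -B gives.
    """
    lines = [line.rstrip("\n") for line in lines]
    hits = []
    for i, line in enumerate(lines):
        if "not converged" in line:
            start = max(0, i - before)
            hits.append((i + 1, lines[start:i + 1]))
    return hits
```

A typical call would open the log file, for example 04_run_default_with_tolerance_lim.log, and pass the file handle as lines; an empty list means no such message has been written so far.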

You can then inspect the logfile with less (see: 10 tips for less).

less 04_run_default_with_tolerance_lim.log

To find a pattern, special characters such as ( ) [ ] must be escaped with \.

# Search forward
/Value \(iter 14\)
/The chi-squared value has not converged

n or N – for next match in forward / previous match in backward

  • To return to follow mode, use keyboard: Shift+f
  • To exit, use keyboard: Ctrl+c and then: q

rsync files

rsync files after completion to Sauron

When a run is completed, then sync files to Sauron file server.

Make a rsync_to_sbinlab.sh file with content

See file content
#!/bin/bash

read -p "Username on sauron :" -r

RUSER=$REPLY
SAURON=10.61.4.60
PROJ=`basename "$PWD"`

FROM=${PWD}
TO=${RUSER}@${SAURON}:/data/sbinlab2/${RUSER}/Downloads

# -a: "archive"- archive mode; equals -rlptgoD (no -H,-A,-X). syncs recursively and preserves symbolic links, special and device files, modification times, group, owner, and permissions.
# We want to remove the -o and -g options:
# -o, --owner                 preserve owner (super-user only)
# -g, --group                 preserve group
# Use -rlptD instead, or equivalently:
# -a --no-o --no-g
# -z: Compression over network
# -P: It combines the flags --progress and --partial. The first of these gives you a progress bar for the transfers and the second allows you to resume interrupted transfers:
# -h, Output numbers in a more human-readable format.

# Always double-check your arguments before executing an rsync command.
# -n 

echo "I will now do a DRY RUN, which does not move files"
read -p "Are you sure? y/n :" -n 1 -r
echo ""

if [[ $REPLY =~ ^[Yy]$ ]]; then
  rsync -rlptDPzh -n ${FROM} ${TO} 
else
  echo "Not doing DRY RUN"
fi

echo ""

echo "I will now do the sync of files"
read -p "Are you sure? y/n :" -n 1 -r
echo ""

if [[ $REPLY =~ ^[Yy]$ ]]; then
  rsync -rlptDPzh ${FROM} ${TO}
else
  echo "Not doing anything"
fi

Make it executable and run

chmod +x rsync_to_sbinlab.sh

#run
./rsync_to_sbinlab.sh

rsync files from BIO to home mac

To inspect from home mac.

Make a rsync_from_bio_to_home.sh file with content

See file content
#!/bin/bash
 
read -p "Username on bio:" -r
 
RUSER=$REPLY
BIO=ssh-bio.science.ku.dk

#PROJ=Desktop/kaare_relax
PROJ=Desktop/kaare_relax/20171010_model_free_HADDOCK
PROJDIR=`basename "$PROJ"`

FROM=${RUSER}@${BIO}:/home/${RUSER}/${PROJ} 
TO=${PWD}/${PROJDIR}

# -a: "archive"- archive mode; equals -rlptgoD (no -H,-A,-X). syncs recursively and preserves symbolic links, special and device files, modification times, group, owner, and permissions.
# We want to remove the -o and -g options:
# -o, --owner                 preserve owner (super-user only)
# -g, --group                 preserve group
# Use -rlptD instead, or equivalently:
# -a --no-o --no-g
# -z: Compression over network
# -P: It combines the flags --progress and --partial. The first of these gives you a progress bar for the transfers and the second allows you to resume interrupted transfers:
# -h, Output numbers in a more human-readable format.
 
# Always double-check your arguments before executing an rsync command.
# -n 
 
echo "I will now do a DRY RUN, which does not move files"
read -p "Are you sure? y/n :" -n 1 -r
echo ""
 
if [[ $REPLY =~ ^[Yy]$ ]]; then
  rsync -rlptDPzh -n ${FROM} ${TO} 
else
  echo "Not doing DRY RUN"
fi
 
echo ""
 
echo "I will now do the sync of files"
read -p "Are you sure? y/n :" -n 1 -r
echo ""
 
if [[ $REPLY =~ ^[Yy]$ ]]; then
  rsync -rlptDPzh ${FROM} ${TO}
else
  echo "Not doing anything"
fi

Make it executable and run

chmod +x rsync_from_bio_to_home.sh

#run
./rsync_from_bio_to_home.sh

About the protocol

Model I - 'local_tm'
This will optimise the diffusion model whereby all spins of the molecule have a local tm value, i.e. there is no global diffusion tensor. This model needs to be optimised prior to optimising any of the other diffusion models. Each spin is fitted to the multiple model-free models separately, where the parameter tm is included in each model.

Model II - 'sphere'
This will optimise the isotropic diffusion model. Multiple steps are required, an initial optimisation of the diffusion tensor, followed by a repetitive optimisation until convergence of the diffusion tensor. In the relax script UI each of these steps requires this script to be rerun, unless the conv_loop flag is True. In the GUI (graphical user interface), the procedure is repeated automatically until convergence. For the initial optimisation, which will be placed in the directory './sphere/init/', the following steps are used:

  • The model-free models and parameter values for each spin are set to those of diffusion model MI.
  • The local tm parameter is removed from the models.
  • The model-free parameters are fixed and a global spherical diffusion tensor is minimised.

For the repetitive optimisation, each minimisation is named from 'round_1' onwards. The initial 'round_1' optimisation will extract the diffusion tensor from the results file in './sphere/init/', and the results will be placed in the directory './sphere/round_1/'. Each successive round will take the diffusion tensor from the previous round. The following steps are used:

  • The global diffusion tensor is fixed and the multiple model-free models are fitted to each spin.
  • AIC model selection is used to select the models for each spin.
  • All model-free and diffusion parameters are allowed to vary and a global optimisation of all parameters is carried out.
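Because each round is written to its own directory ('./sphere/init/', './sphere/round_1/', and so on), the progress of a run can be read off the directory names. The helper below is a hypothetical sketch and not part of relax; it only assumes the 'round_N' naming described above.

```python
import os
import re


def latest_round(base_dir):
    """Return the highest N for which base_dir/round_N exists, or 0."""
    rounds = []
    for name in os.listdir(base_dir):
        match = re.fullmatch(r"round_(\d+)", name)
        # Ignore 'init' and any stray files that are not round directories.
        if match and os.path.isdir(os.path.join(base_dir, name)):
            rounds.append(int(match.group(1)))
    return max(rounds, default=0)
```

For example, latest_round("sphere") returning 0 would mean that only the initial optimisation, if anything, has been written so far.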

Model III - 'prolate'
The methods used are identical to those of diffusion model MII, except that an axially symmetric diffusion tensor with Da >= 0 is used. The base directory containing all the results is './prolate/'.

Model IV - 'oblate'
The methods used are identical to those of diffusion model MII, except that an axially symmetric diffusion tensor with Da <= 0 is used. The base directory containing all the results is './oblate/'.

Model V - 'ellipsoid'
The methods used are identical to those of diffusion model MII, except that a fully anisotropic diffusion tensor is used (also known as rhombic or asymmetric diffusion). The base directory is './ellipsoid/'.

'final'
Once all the diffusion models have converged, the final run can be executed. This is done by setting the variable diff_model to 'final'. This consists of two steps, diffusion tensor model selection, and Monte Carlo simulations. Firstly AIC model selection is used to select between the diffusion tensor models. Monte Carlo simulations are then run solely on this selected diffusion model. Minimisation of the model is bypassed as it is assumed that the model is already fully optimised (if this is not the case the final run is not yet appropriate). The final black-box model-free results will be placed in the file 'final/results'.
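The selection between diffusion models can be illustrated with the chi-squared form of the criterion, AIC = chi2 + 2k, where k is the number of parameters: the model with the lowest AIC is chosen. The sketch below assumes that form and uses hypothetical function names; the chi2 and k values would come from the converged results of each diffusion model.

```python
def aic(chi2, k):
    """Akaike's Information Criterion in its chi-squared form."""
    return chi2 + 2 * k


def select_model(results):
    """Pick the model name with the lowest AIC.

    `results` maps a diffusion model name to a (chi2, k) tuple.
    """
    return min(results, key=lambda name: aic(*results[name]))
```

A model with a slightly lower chi2 can still lose the selection if it needs many more parameters, which is the point of the 2k penalty.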

See also