#- Hongyan Li: http://thread.gmane.org/gmane.science.nmr.relax.devel/694/focus=701
These will have lots of additional information. <br> This is just a selection of possibly the most useful messages.
You will soon see that this is a complicated topic. <br> Note that relax is capable of performing 100% of the functionality of:
* Modelfree4 (with or without the Fast-Modelfree GUI interface)
* Dasha
* Tensor2 and
* DYNAMICS
If you play with the optimisation settings you can even find identical results to within machine precision - relax can mimic these other software packages.
=== Protocol ===
The key is that the full analysis protocol is rather complicated - many people don't understand this - and that these software packages do not implement the full iterative protocol. Therefore one either has to perform it manually or write a script to perform all of the steps. The protocol is described in the relax manual in figure 7.2 (http://www.nmr-relax.com/manual/diffusion_seeded_paradigm.html).
In summary:
* a) Find an initial diffusion tensor estimate (you can do this in relax by only using model m0). <br> This requires all non-mobile residues and side chain spins to be excluded, and this can be problematic. See the d'Auvergne and Gooley, 2008b paper at http://dx.doi.org/10.1007/s10858-007-9213-3 for an example of the catastrophic failure that this initial estimate can result in. See also the bacteriorhodopsin fragment of Korzhnev et al., 1999 (http://dx.doi.org/10.1023/a:1008356809071), where this complete failure was demonstrated earlier.
* b) Optimise all of the model-free models from m0 to m9. <br> This requires high precision optimisation. For a comparison of all the software packages see the d'Auvergne and Gooley, 2008a model-free optimisation paper at http://dx.doi.org/10.1007/s10858-007-9214-2. Only relax and Dasha implement the full range of model-free models, though the models m6, m7, and m8 cannot be used if only single field strength data is used (m6 is the original 2-time scale motion model of Clore et al., 1990).
* c) Eliminate failed models (this is only available in relax). <br> See the d'Auvergne and Gooley, 2006 model elimination paper at http://dx.doi.org/10.1007/s10858-006-9007-z.
* d) Select the best model-free model for each spin system. <br> This again requires precise, modern techniques, with the best being AIC model selection (see the d'Auvergne and Gooley, 2003 model-free model selection paper at http://dx.doi.org/10.1023/A:1021902006114). If you are unaware that ANOVA statistics for model selection (hypothesis testing via chi-squared, F- and t-tests) were abandoned by the field of model selection over 100 years ago (a field which makes the NMR field look very, very small), then you should really look at that paper.
* e) Optimise the global model. <br> This is the diffusion tensor plus the model-free models for all spin systems.
* f) Check for convergence (identical chi-squared values to a previous iteration, and not necessarily the last one). <br> If no, then go back to b) and repeat. Note that the chi-squared value can go up significantly between iterations, but this is because the model is simplifying itself at a much faster rate by losing parameters - it's Occam's razor at work. Again see the d'Auvergne and Gooley, 2008b paper at http://dx.doi.org/10.1007/s10858-007-9213-3 for figures demonstrating this. The concept as to what is happening during this combined model-free optimisation and model selection algorithm is described in the d'Auvergne and Gooley, 2007 Mol. Biosyst. paper at http://dx.doi.org/10.1039/b702202f. It can take up to 20 iterations or more to reach convergence, depending upon the quality of the relaxation data and the 3D structure or the system under study.
* g) Once steps a-f have been completed for all global models (characterised by the spheroid, prolate spheroid, oblate spheroid, and
* h) Monte Carlo simulations for error analysis must be performed at the end.
* i) Elimination of failed Monte Carlo simulations is essential for keeping the errors to reasonable values for certain spin systems. This is also a relax-only feature (see the d'Auvergne and Gooley, 2006 model elimination paper at http://dx.doi.org/10.1007/s10858-006-9007-z).
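As a rough illustration of the AIC model selection of step d), the following minimal Python sketch picks the model with the lowest AIC value for a single spin system. It assumes the usual least-squares form AIC = chi-squared + 2k, where k is the number of model parameters. The function names and the chi-squared values are invented for illustration and are not part of the relax API.

```python
from typing import Dict, Tuple


def aic(chi2: float, k: int) -> float:
    """Akaike's Information Criterion, assuming the form chi-squared + 2k."""
    return chi2 + 2 * k


def select_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the model with the lowest AIC value.

    fits maps a model name to (optimised chi-squared, number of parameters).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Invented example values for one spin system (not real data).
# Parameter counts: m1 = {S2}, m2 = {S2, te}, m4 = {S2, te, Rex}.
fits = {
    "m1": (25.0, 1),
    "m2": (18.5, 2),
    "m4": (18.0, 3),
}

best = select_model(fits)
print(best)
```

In this made-up case m4 has the lowest chi-squared value, but the small improvement over m2 does not justify its extra parameter, so m2 is selected.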
These steps must be implemented independently of which software you use, as NONE implement the full protocol. Note however that the protocol I developed (in the d'Auvergne and Gooley, 2007 theory paper at http://dx.doi.org/10.1039/b702202f and the d'Auvergne and Gooley, 2008b paper at http://dx.doi.org/10.1007/s10858-007-9213-3) is fully implemented in relax, although this requires multiple field strength data.
The relax implementation is a rather large script located at '''auto_analyses/dauvergne_protocol.py'''. This protocol is used by the GUI. So one option would be to copy this '''auto_analyses/dauvergne_protocol.py''' script and modify it for the figure 7.2 protocol.
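For other software, the iteration over steps b) to f) could be driven by a loop such as the following minimal Python sketch. The `run_iteration` callable is hypothetical: it stands in for whatever performs steps b) to e) in your software of choice and returns the chi-squared value of the optimised global model. Only the convergence test of step f) is shown concretely, and the chi-squared sequence is made up.

```python
from typing import Callable, List


def iterate_until_converged(
    run_iteration: Callable[[int], float],
    max_iter: int = 30,
) -> List[float]:
    """Repeat steps b) to e) until the chi-squared value repeats.

    run_iteration is a hypothetical callable taking the iteration index and
    returning the global chi-squared value. Convergence is taken literally
    as "identical to ANY previous iteration"; with other software a small
    tolerance may be needed instead (an assumption, not from the protocol).
    """
    history: List[float] = []
    for i in range(max_iter):
        chi2 = run_iteration(i)
        converged = chi2 in history  # compare against all earlier iterations
        history.append(chi2)
        if converged:
            return history
    raise RuntimeError("No convergence within %d iterations" % max_iter)


# Made-up chi-squared values. Note that the value goes up between
# iterations 1 and 2, and that the match at iteration 4 is with
# iteration 2 rather than with the immediately preceding one.
fake_chi2 = [120.0, 95.0, 101.0, 98.0, 101.0, 101.0]
history = iterate_until_converged(lambda i: fake_chi2[i])
print(len(history), history[-1])
```

Comparing against the whole history rather than just the last value matters because the protocol can cycle between a small number of solutions.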
=== Warning ===
I must warn you about using single field strength data. It is now quite difficult to publish a model-free analysis with only single field strength data as most of the field knows about the catastrophic analysis failures resulting in large amounts of artificial motion. These failures can also be much more subtle. Many reviewers will ask for multiple field strength data to be collected as the results cannot be trusted otherwise. For a model-free analysis, it is almost essential to collect data at multiple field strengths, otherwise it can sometimes be impossible to distinguish between the anisotropic part of the Brownian tumbling of the molecule and internal motion - specifically due to the NH vectors in secondary structure elements all pointing in a similar direction. I have a much better explanation, as well as citations to all the relevant literature, in:
* d'Auvergne E. J., Gooley P. R. (2007). Set theory formulation of the model-free problem and the diffusion seeded model-free paradigm. Mol. Biosyst., 3(7), 483-494. (http://dx.doi.org/10.1039/b702202f).
In this paper, you will see reviewed both the artificial nanosecond motions of the Schurr 1994 paper and the artificial Rex motions of the Tjandra 1995 paper.
=== Recommendation ===
Finally, you will probably find it much easier to spend the 7-8 days collecting data at another field strength than to implement the protocol of steps a-i in a relax, Modelfree4, or Dasha script (or via multiple iterations of the GUI programs), as well as study all of the relevant literature to understand all of the types of failures that only occur with single field strength data. With multiple field strength data you can perform [https://gna.org/users/semor Sebastien Morin's] consistency testing analysis in relax (http://dx.doi.org/10.1007/s10858-009-9381-4 and http://www.nmr-relax.com/manual/Consistency_testing.html). That way you can see if your per-experiment temperature calibration and per-experiment temperature control techniques have worked sufficiently well (http://www.nmr-relax.com/manual/Temperature_control_calibration.html) and if you have used long enough recycle delays. Collecting data at a second field would probably save you significant amounts of time, and has the additional benefit that it would guarantee that the dynamics you see at the end will be real. I cannot emphasize enough how important it is to collect data at multiple fields, most importantly the NOE and R2 data.
Regards,
Edward