Difference between revisions of "RMSD"

 
Latest revision as of 18:56, 21 October 2020

This page describes how to measure the Root Mean Squared Deviation (RMSD) of the baseplane noise in a spectrum, which is used as the error estimate for peak intensities.


Software to use

Use SPARKY. The RMSD tool is in the Extensions menu.

In F1 mode, draw boxes in the empty regions of every spectrum. This gives a separate error estimate for each spectrum, which relax can handle.

It is a good idea to use the Sparky rm function to measure the RMSD in several areas free of peaks and to take the result as a maximal noise estimate for the spectrum.
If some peaks lie in noisier regions (e.g. near the water line), it can be a good idea to give those peaks separate errors by measuring the RMSD near those peaks separately.
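The box measurement described above can be sketched in plain Python. This is only an illustration, not Sparky's implementation: the function names, the toy spectrum and the box coordinates are all hypothetical, and the RMSD is taken about the box mean, which is an assumption about how the region RMSD is defined.

```python
import math

def region_rmsd(data, row_range, col_range):
    """RMSD of the intensities inside a rectangular box, about the box mean.

    data is a 2D list of intensities; the ranges are half-open (start, stop).
    """
    r0, r1 = row_range
    c0, c1 = col_range
    values = [data[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def spectrum_noise(data, boxes):
    """Largest box RMSD, used as a conservative (maximal) noise estimate."""
    return max(region_rmsd(data, rows, cols) for rows, cols in boxes)

# Hypothetical 4x6 peak-free spectrum: quiet on the left, noisier on the right.
spectrum = [
    [1.0, -1.0, 1.0, 3.0, -3.0, 3.0],
    [-1.0, 1.0, -1.0, -3.0, 3.0, -3.0],
    [1.0, -1.0, 1.0, 3.0, -3.0, 3.0],
    [-1.0, 1.0, -1.0, -3.0, 3.0, -3.0],
]

# Two hypothetical peak-free boxes: ((row start, row stop), (col start, col stop)).
boxes = [((0, 4), (0, 2)), ((0, 4), (4, 6))]

quiet = region_rmsd(spectrum, (0, 4), (0, 2))
noisy = region_rmsd(spectrum, (0, 4), (4, 6))
overall = spectrum_noise(spectrum, boxes)
```

The same `region_rmsd` call on a box drawn next to a peak in a noisy region gives the separate error for that peak.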

Reference

For a region
http://www.cgl.ucsf.edu/home/sparky/manual/extensions.html#RegionRMSD

Whole spectrum
http://www.cgl.ucsf.edu/home/sparky/manual/views.html#Noise

Notes

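For peak volumes summed over a box, the error is equal to the baseplane RMSD times the square root of the number of 2D spectrum points in the box, since the variances of the individual points add.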

Reference

Do this when peak heights are used as the intensity measure.

The RMSD of an empty region is smaller than that of the random coil region, due to power conservation in the Fourier transform.
You can see this by dropping the contour level all the way to the baseplane and looking carefully.
You should measure with boxes near the peaks, and make sure no peaks are in the box.
This estimate is much better than the full-spectrum RMSD measures from other programs.


Peak shifts

Be aware of peak shifts due to temperature.

showApod with NMRPipe

  1. http://spin.niddk.nih.gov/NMRPipe/newdocs/ref/
  2. http://spin.niddk.nih.gov/NMRPipe/ref/prog/showapod.html
  3. http://spin.niddk.nih.gov/NMRPipe/doc/nmrProg/showApod.html
  4. http://hincklab.uthscsa.edu/~ahinck/html/soft_packs/nmrpipe/nmrDraw.html
  5. https://groups.yahoo.com/neo/groups/nmrpipe/conversations/topics/402
  6. https://groups.yahoo.com/neo/groups/nmrpipe/conversations/messages/2003
  7. https://groups.yahoo.com/neo/groups/nmrpipe/conversations/messages/2382

To print only the automated noise estimate for the processed data:

showApod test.ft2 | grep "REMARK Automated Noise Std Dev in Processed Data:" | awk '{print $9}'

The command below prints two numbers: the first is the noise estimate for the spectrum, the second is the noise estimate for the time-domain data. Run it without the "-noverb" option to see more information.

showApod -in test.ft2 -noverb
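The grep and awk pipeline above can also be reproduced in Python, which may be convenient when the value is needed inside a script. The sketch below only parses text that has already been captured from showApod. The function name and the sample line are hypothetical, and the exact layout of the real output is an assumption based on the grep string and the awk field number used above.

```python
PREFIX = "REMARK Automated Noise Std Dev in Processed Data:"

def parse_noise(output):
    """Return the noise estimate found in showApod text output, or None if absent."""
    for line in output.splitlines():
        if PREFIX in line:
            # Same field that awk's $9 selects: the ninth whitespace-separated token.
            return float(line.split()[8])
    return None

# Hypothetical sample line, made up for illustration only.
sample = "REMARK Automated Noise Std Dev in Processed Data: 1234.5"
noise = parse_noise(sample)
missing = parse_noise("REMARK some other line")
```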


Post by Frank Delaglio, Oct 29, 2009

Hi George,

A section of an earlier post about noise estimation is below.

The noise estimation is supposed to estimate the standard deviation
in the baseline of the spectrum, i.e. the areas of the spectrum
without substantial signal. It assumes:

1. The noise level in the spectrum is uniform

2. The noise has a Gaussian distribution, or roughly so.

3. Most of the spectrum is "empty", i.e. free of
substantial signal.

So, if the noise estimation is successful, it will give values
similar to those that you would find if you manually identify
a baseline region and take its standard deviation.

Along these lines, you can imagine that by looking at the histogram
of intensities in a signal-free region, you could also estimate its
standard deviation.

As the post below says, some noise estimation tools use the entire
spectral data set at once. However, this is NOT the same as
simply taking the standard deviation of the entire spectrum
at once, since the entire spectrum includes both signal AND
baseline.

Instead, the noise estimation techniques attempt to separate the
contributions from "signal" and "baseline". In our case,
we assume that most points are baseline points, such that if
we build a histogram of intensities, most of the values
corresponding to small intensities come from the baseline
rather than from the signals. This means that to a first
approximation, the part of the histogram describing the smallest
intensities can be used to characterize the baseline.

In practice, this noise estimation technique works better
when we perform many independent noise estimates on vectors
from the data, rather than perform one noise estimate using
all the data at once. This is because individual vectors from
a 2D or 3D dataset are more likely to contain a substantial fraction
of baseline points, and in fact many individual vectors will consist
entirely of baseline, in which case the histogram method will be
at its most effective.

Hope this explanation helps ...

big fd


From an earlier post about noise estimation:

---
The noise estimation details used by autoFit.tcl were recently
improved ... if you search the text of the script, the older
version will use the function "vEstNoise", and the newer
version will use the function "estSpecNoise".

The basic mechanism of noise detection is the same in both
cases; a histogram of the data intensities is analyzed, under
the assumption that most of the points in the spectrum are
in the baseline, so that the innermost part of the histogram
is due primarily to baseline noise. IF we assume that the
baseline noise is normally distributed, we can use the histogram
to estimate the standard deviation of the noise.

The older "vEstNoise" implementation analyzed an entire spectral
region at once to form a single noise estimate. In the case of 2D/3D
data, the newer "estSpecNoise" forms separate noise estimates for individual
vectors from the data, and uses the median.
---
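The idea in the post above, independent noise estimates on individual vectors followed by the median, can be illustrated with a short Python sketch. This is not the NMRPipe algorithm: the histogram analysis is replaced here by the median absolute deviation (MAD), a different robust estimator that relies on the same assumptions (roughly Gaussian noise, mostly baseline points). All names and the synthetic data are hypothetical.

```python
import random
import statistics

def vector_noise(vector):
    """Robust noise estimate for one vector using the median absolute deviation."""
    centre = statistics.median(vector)
    mad = statistics.median([abs(v - centre) for v in vector])
    return 1.4826 * mad  # scale factor that converts MAD to sigma for Gaussian noise

def spectrum_noise_estimate(vectors):
    """Median of the independent per-vector noise estimates."""
    return statistics.median([vector_noise(v) for v in vectors])

# Synthetic 2D data: 64 vectors of 256 points of Gaussian noise with sigma = 10.
random.seed(0)
vectors = [[random.gauss(0.0, 10.0) for _ in range(256)] for _ in range(64)]

# Add a strong signal to a few vectors; most vectors stay pure baseline.
for vec in vectors[:8]:
    for i in range(100, 110):
        vec[i] += 1000.0

estimate = spectrum_noise_estimate(vectors)

# Standard deviation of the entire data set at once, signal and baseline together.
naive = statistics.pstdev([v for vec in vectors for v in vec])
```

In this synthetic case `estimate` stays close to the true noise level, while `naive` is inflated by the signal points, which is the distinction the post draws between noise estimation and the standard deviation of the whole spectrum.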

References

  • Palmer, 3rd, A. G., Rance, M., and Wright, P. E. (1991). Intramolecular motions of a zinc finger DNA-binding domain from Xfin characterized by proton-detected natural abundance carbon-13 heteronuclear NMR spectroscopy. J. Am. Chem. Soc., 113(12), 4371-4380. (DOI: 10.1021/ja00012a001)
  • Farrow, N. A., Muhandiram, R., Singer, A. U., Pascal, S. M., Kay, C. M., Gish, G., Shoelson, S. E., Pawson, T., Forman-Kay, J. D., Kay, L. E. (1994). Backbone dynamics of a free and phosphopeptide-complexed Src homology 2 domain studied by 15N NMR relaxation. Biochemistry, 33(19), 5984-6003. (DOI: 10.1021/bi00185a040)

See also

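  1. Spectrum_error_analysis
  2. Message on Yahoo group: http://tech.groups.yahoo.com/group/nmr_sparky/message/102
  3. Jamie Baird-Titus page on Processing RDC Data: https://sites.google.com/site/jamiebairdtitus/analyzingusingrdcdata
  4. https://groups.yahoo.com/neo/groups/nmrpipe/conversations/messages/2382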