Target shooting with two different
weapons. The weapon on the left shows high reliability, as the shots
are tightly grouped. However, the whole group, and hence the average,
is off centre, so the method is less valid. The weapon on the
right shows better validity, as the average of the shots is on
centre, but the shots are less tightly grouped (more scattered):
the weapon will tend to hit a different location each time, so it
is less reliable.
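The target analogy can be put into numbers. The following is a minimal sketch, assuming that validity is summarised by the distance from the target centre to the mean point of impact, and reliability by the spread of the shots around their own mean. The function name bias_and_scatter and the hit coordinates are hypothetical, invented for illustration only.

```python
import math
from statistics import mean


def bias_and_scatter(shots):
    """Return (bias, scatter) for a list of (x, y) hits; target centre is (0, 0).

    bias    = distance from the centre to the mean point of impact (validity)
    scatter = root-mean-square distance of the hits from their own mean
              (reliability: smaller means more reliable)
    """
    mx = mean(x for x, _ in shots)
    my = mean(y for _, y in shots)
    bias = math.hypot(mx, my)
    scatter = math.sqrt(mean((x - mx) ** 2 + (y - my) ** 2 for x, y in shots))
    return bias, scatter


# Hypothetical hit coordinates, invented for illustration only.
left_weapon = [(3.0, 4.0), (3.2, 4.1), (2.8, 3.9), (3.1, 3.8), (2.9, 4.2)]
right_weapon = [(2.0, 1.0), (-2.0, -1.0), (1.0, -2.0), (-1.0, 2.0), (0.0, 0.0)]

for name, shots in (("left", left_weapon), ("right", right_weapon)):
    bias, scatter = bias_and_scatter(shots)
    print(f"{name}: bias = {bias:.2f}, scatter = {scatter:.2f}")
```

With these made-up numbers the left weapon has a large bias and small scatter (reliable, but less valid), while the right weapon has no bias and larger scatter (valid, but less reliable).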
Different echo methods may have different validity and reliability, as
demonstrated by this study (151):
Comparison of three different ultrasound methods for deformation
imaging, against tagged MR as reference. Left:
2D strain; middle:
segmental strain by combined tissue
Doppler and speckle tracking; right: strain by dynamic
velocity gradient. Top row: Bland-Altman
plots, bottom row: scatterplots with the identity line shown. It can
be seen that there is a significant bias between 2D strain and MR:
the measurements are fairly tightly grouped, but on
average lie below the identity line. The segmental method has a small bias,
but this is not significant. It is less reliable, however, as seen by
the greater scatter (and lower correlation). The method on the right shows
no bias, i.e. good validity, but even greater scatter, and is clearly
the least reliable.
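The quantities read off a Bland-Altman plot can be sketched as follows: the bias is the mean of the paired differences (method minus reference), and the scatter is summarised by the limits of agreement, here taken as bias ± 1.96 standard deviations of the differences. This is a minimal illustration and not the analysis used in the study; the function name bland_altman and the paired values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method, reference):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias   = mean of the differences (method - reference)
    limits = bias -/+ 1.96 * sample SD of the differences
    """
    if len(method) != len(reference):
        raise ValueError("paired measurements must have equal length")
    diffs = [m - r for m, r in zip(method, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired strain values (%), invented for illustration;
# these are NOT data from the study cited above.
echo = [-17.0, -19.0, -15.0, -21.0, -18.0, -16.0]
mr = [-19.0, -20.0, -18.0, -22.0, -21.0, -17.0]

bias, lower, upper = bland_altman(echo, mr)
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

A bias far from zero points to lower validity against the reference, while wide limits of agreement point to lower reliability.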
Validity is related to the comparison to a reference method, but it may
also be related to basic physiological theory. It may be maintained,
for instance, that deformation measures alone have a low physiological
validity, as the main point is the stress-strain relation, i.e. the
deformation in relation to the load, and that measurements of
deformation alone, without incorporating some measure of load, have
little physiological validity. However, for clinical purposes, the main
point is not physiological validity, but the
discriminatory ability of the method.
As different methods use different approaches, their validity against a
reference may vary. However, for each method, both
cut-off and
normal values may be established, and
measurements may be evaluated against that method's own values. For
instance, the lower normal value for EF is 50% for ultrasound, but 60%
for ventriculography, as this method uses contrast and thus activates
the Frank-Starling mechanism by volume loading. The main interest for
the clinician is the method's
discriminatory
ability: the ability to discriminate between normal and abnormal
function. And given a set of reference values for the method, the main
point is reliability.
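The point about method-specific reference values can be illustrated with the two EF limits quoted above. This is only a sketch: the names LOWER_NORMAL_EF and is_normal_ef are hypothetical, and treating a value exactly at the limit as normal is an assumption.

```python
# Lower normal limits for EF (%) per method, as quoted in the text.
LOWER_NORMAL_EF = {"ultrasound": 50.0, "ventriculography": 60.0}


def is_normal_ef(ef_percent, method):
    """True if EF is at or above the lower normal limit for that method.

    Assumption: a value exactly equal to the limit counts as normal.
    """
    return ef_percent >= LOWER_NORMAL_EF[method]


# The same EF value is classified differently depending on the method.
print(is_normal_ef(55.0, "ultrasound"))        # True
print(is_normal_ef(55.0, "ventriculography"))  # False
```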
Reliability is used synonymously with reproducibility or repeatability,
or its inverse, variability.
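One common way to quantify repeatability, sketched here under the assumption that each subject has been measured twice with the same method, is the within-subject standard deviation, together with a repeatability coefficient of 1.96 × √2 × that standard deviation. This is one convention among several and is not necessarily the one used elsewhere in this text; the function names and the measurement values are hypothetical.

```python
import math


def within_subject_sd(first, second):
    """Within-subject SD from duplicate measurements: sqrt(sum(d^2) / (2n))."""
    if len(first) != len(second):
        raise ValueError("duplicate measurements must have equal length")
    squared_diffs = [(a - b) ** 2 for a, b in zip(first, second)]
    return math.sqrt(sum(squared_diffs) / (2 * len(squared_diffs)))


def repeatability_coefficient(first, second):
    """1.96 * sqrt(2) * within-subject SD: under this convention, about 95%
    of differences between two repeated measurements are expected to fall
    below this value."""
    return 1.96 * math.sqrt(2) * within_subject_sd(first, second)


# Hypothetical repeated measurements, invented for illustration only.
first_reading = [50.0, 55.0, 60.0, 45.0]
second_reading = [52.0, 54.0, 57.0, 46.0]

sw = within_subject_sd(first_reading, second_reading)
rc = repeatability_coefficient(first_reading, second_reading)
print(f"within-subject SD = {sw:.2f}, repeatability coefficient = {rc:.2f}")
```

A smaller within-subject SD means lower variability, i.e. higher reliability.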