It is useful to compare the performance of different measures of
model quality. For a given measure, the level of model
degradation that can be detected in the validation experiments
described above depends on both the change in the value of the
measure as a function of model degradation and the uncertainty in
the value. To quantify this, we define the sensitivity $S$
of a measure as follows:

$$ S = \frac{m(d) - m(0)}{d \, \bar{\sigma}} $$

where $m(d)$ is the value of the measure for some degree
of degradation $d$, $m(0)$ is its value for the undegraded model,
and $\bar{\sigma}$ is the mean error in the
estimate of $m$ over the range of $d$ considered.
$S$ is the reciprocal of the change in $d$
required for $m$ to change by one noise standard error, which
indicates the smallest change in quality that can be
detected by the measure. Sensitivities for the specificity and
generalisation for different values of shuffle radius are shown in
Figure 8. These results demonstrate
that specificity is a more sensitive measure of model quality than
generalisation, and that the use of shuffle distance improves the
sensitivities of both measures over those obtained using Euclidean
distance.
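
The sensitivity calculation can be illustrated with a minimal sketch. It assumes that sensitivity is taken as the change in the measure relative to the undegraded model, divided by the degree of degradation and by the mean standard error over the range. The function name `sensitivity`, its arguments, and the numerical values are hypothetical and chosen for illustration only; they are not results from the experiments.

```python
from statistics import mean


def sensitivity(degradation, measure, std_error):
    """Return one sensitivity value per non-zero degradation level.

    Assumed form (an illustration, not a definitive implementation):
        S(d) = (m(d) - m(0)) / (d * mean_error)
    where the first entry of each sequence is the undegraded model (d = 0).
    """
    if not (len(degradation) == len(measure) == len(std_error)):
        raise ValueError("all inputs must have the same length")
    if degradation[0] != 0:
        raise ValueError("first entry must be the undegraded model (d = 0)")

    mean_error = mean(std_error)   # mean error in the estimate over the range
    baseline = measure[0]          # value of the measure with no degradation

    return [
        (m - baseline) / (d * mean_error)
        for d, m in zip(degradation[1:], measure[1:])
    ]


# Hypothetical values: a measure that rises linearly with degradation.
d_values = [0, 1, 2, 4]
m_values = [10, 12, 14, 18]
errors = [0.5, 0.5, 0.5, 0.5]

sens = sensitivity(d_values, m_values, errors)
print(sens)  # [4.0, 4.0, 4.0]
```

In this toy case the measure changes by 2 units per unit of degradation and the standard error is 0.5, so the sensitivity is 4: a change in degradation of 1/4 is enough to move the measure by one standard error.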