An alternative approach is based on measuring the alignment [3,4] or overlap [4,6] of anatomical structures that have been annotated by an expert or obtained by (semi-)automated segmentation. Manual annotation is expensive to obtain and prone to subjective error. Reliable automated or semi-automated segmentation is extremely difficult to achieve; indeed, if it were available it would often obviate the need for NRR.
We have used an overlap-based approach to provide a 'gold standard' method of assessment. The method requires manual annotation of each image (providing an anatomical/tissue label for each voxel) and measures the overlap of corresponding labels following registration, using a generalisation of Tanimoto's overlap coefficient. Each label for a given image is represented using a binary image but, after warping and interpolation into a common reference frame based on the results of NRR, we obtain a set of fuzzy label images. These are combined in a generalised overlap score [8] which provides a single figure of merit aggregated over all labels and all images in the set:
\[
O = \frac{\sum_{k} \sum_{l} \alpha_l \sum_{i} \mathrm{MIN}(A_{kli}, B_{kli})}{\sum_{k} \sum_{l} \alpha_l \sum_{i} \mathrm{MAX}(A_{kli}, B_{kli})}
\]

where $i$ indexes voxels in the registered images, $l$ indexes the labels and $k$ indexes image pairs (all permutations are considered). $A_{kli}$ and $B_{kli}$ represent voxel label values for a pair of registered images and are in the range $[0, 1]$. The MIN and MAX operators are standard results for the intersection and union of fuzzy sets. This generalised overlap measures the consistency with which each set of labels partitions the image volume.

The parameter $\alpha_l$ affects the relative weighting of different labels. With $\alpha_l = 1$, label contributions are implicitly volume-weighted with respect to one another. This means that large structures contribute more to the overall measure. We have also considered the cases where $\alpha_l$ weights labels by the inverse of their volume (which makes the relative weighting of different labels equal), where $\alpha_l$ weights labels by the inverse of their volume squared (which gives regions of smaller volume higher weighting), and where $\alpha_l$ weights labels by their complexity, which we define as the mean absolute voxel intensity gradient over the labelled region.
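The generalised overlap and its label weightings can be illustrated with a minimal NumPy sketch. It is not the implementation used here: the function names, the array layout (images, labels, voxels) and, in particular, the definition of a label's volume as its mean fuzzy voxel sum across images are all assumptions made for illustration. Complexity weighting is omitted because it also requires the intensity images.

```python
import numpy as np
from itertools import permutations


def generalised_overlap(label_images, alpha=None):
    """Weighted fuzzy overlap over all ordered image pairs.

    label_images: array of shape (n_images, n_labels, ...) holding fuzzy
                  label values in [0, 1] after warping (assumed layout).
    alpha:        per-label weights, shape (n_labels,); defaults to 1,
                  which gives the implicitly volume-weighted score.
    """
    labels = np.asarray(label_images, dtype=float)
    n_images, n_labels = labels.shape[:2]
    flat = labels.reshape(n_images, n_labels, -1)
    alpha = np.ones(n_labels) if alpha is None else np.asarray(alpha, dtype=float)

    num = 0.0
    den = 0.0
    # All permutations of image pairs are considered.
    for a, b in permutations(range(n_images), 2):
        # Fuzzy intersection (min) and union (max), summed over voxels per label.
        num += np.dot(alpha, np.minimum(flat[a], flat[b]).sum(axis=1))
        den += np.dot(alpha, np.maximum(flat[a], flat[b]).sum(axis=1))
    return num / den


def inverse_volume_weights(label_images, power=1):
    """Hypothetical helper: weights of 1 / volume**power per label.

    Assumption: a label's volume is its fuzzy voxel sum averaged over
    images, and every label has non-zero volume.
    """
    labels = np.asarray(label_images, dtype=float)
    n_images, n_labels = labels.shape[:2]
    volumes = labels.reshape(n_images, n_labels, -1).sum(axis=2).mean(axis=0)
    return 1.0 / volumes ** power


# Toy example: two 4-voxel images, two labels each.
img_a = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
img_b = np.array([[1, 0, 0, 0], [0, 1, 1, 1]], dtype=float)
stack = np.stack([img_a, img_b])

unweighted = generalised_overlap(stack)  # 6/10 = 0.6
inv_vol = generalised_overlap(stack, inverse_volume_weights(stack))
inv_vol_sq = generalised_overlap(stack, inverse_volume_weights(stack, power=2))
```

In this toy case the smaller label has the poorer overlap (1/2 against 2/3), so shifting weight towards small structures lowers the aggregate score, which is the behaviour the inverse-volume weightings are meant to expose.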
An overlap score based on a generalisation of the popular Dice Similarity Coefficient (DSC) would also be possible but, since DSC is related monotonically to the Tanimoto Coefficient (TC) by DSC = 2TC/(TC+1) [5], the two measures rank registrations identically and we have not considered this further.
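For completeness, the relation between the two coefficients follows directly from their standard definitions for a pair of label sets. The symbols $A$ and $B$ below denote two such sets and are introduced only for this derivation.

```latex
\mathrm{TC} = \frac{|A \cap B|}{|A \cup B|}, \qquad
\mathrm{DSC} = \frac{2\,|A \cap B|}{|A| + |B|}

% Since |A| + |B| = |A \cup B| + |A \cap B|:
\mathrm{DSC} = \frac{2\,|A \cap B|}{|A \cup B| + |A \cap B|}
             = \frac{2\,\mathrm{TC}}{\mathrm{TC} + 1}

% Monotonicity on 0 \le \mathrm{TC} \le 1:
\frac{d\,\mathrm{DSC}}{d\,\mathrm{TC}} = \frac{2}{(\mathrm{TC} + 1)^{2}} > 0
```

Because the derivative is strictly positive, a higher TC always corresponds to a higher DSC, so either coefficient orders a set of registrations in the same way.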