Peter N. Yianilos
12/5/91 Rev 2/27/92 Rev 7/7/2002
The first of these, $\delta(A,B) = \frac{|A \triangle B|}{|A \cup B|}$, is easily established and generalizes to measure spaces. The second applies to vectors in $\mathbb{R}^n$ and is given by $d(x,y) = \frac{\|x-y\|}{\|x\|+\|y\|}$. That this is a metric is more difficult to demonstrate, and it is true for Euclidean distance (the $L_2$ norm) but for no other integral Minkowski metric. The short and elegant proof we give is due to David Robbins and Marshall Buck [1]. We also explore a number of variations.
Keywords: Metric Space, Distance Function, Similarity Function/Coefficient, Euclidean Distance, Association, Clustering, Vector Quantization, Pattern Recognition, Statistical Methods.
The notion of a metric space [2] is a cornerstone of mathematical topology and analysis, and is often employed in pattern recognition and clustering systems.
Let $X$ be a set and $d$ a non-negative real valued function on $X \times X$ satisfying for all $x, y, z \in X$:

i) $d(x,y) = 0$ if and only if $x = y$,
ii) $d(x,y) = d(y,x)$,
iii) $d(x,z) \le d(x,y) + d(y,z)$.

Then $(X,d)$ is a metric space. Alternatively $d$ is said to be a distance function and to impose a metric on $X$.
The third item in this definition is usually referred to as the triangle inequality. It is worthwhile noting that not all approaches to pattern recognition and clustering impose this requirement. The term similarity measure or dissimilarity measure, or some variation thereof, is then used to describe the comparison function.
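As an aside, the three requirements of definition 1 are easy to spot-check computationally. The following sketch (function and sample names are ours, not the paper's) tests a candidate distance function on a finite sample of points:

```python
import itertools
import math

def is_metric_on_sample(d, points, tol=1e-12):
    """Spot-check the three metric axioms of definition 1 on a finite sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if (d(x, y) < 0) or (x == y) != (d(x, y) <= tol):   # i) identity
            return False
        if abs(d(x, y) - d(y, x)) > tol:                    # ii) symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:               # iii) triangle
            return False
    return True

euclidean = lambda x, y: math.dist(x, y)
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
print(is_metric_on_sample(euclidean, sample))  # True on this sample
```

Of course a check on a sample can only refute, never certify, the axioms; the proofs in this paper do the certifying.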
Two well known examples of metric spaces are Euclidean $n$-space $\mathbb{R}^n$ with:

$\sigma(x,y) = \|x - y\|$

and the symmetric difference metric on members of $\mathcal{F}$, the set consisting of all finite sets:

$\Delta(A,B) = |A \triangle B|$.

$\sigma$ is of course geometrical length, and $\Delta$ simply counts the number of elements on which $A$ and $B$ disagree.
Both of these metrics are in general unbounded, and their value is independent of the size of $A$ and $B$. I.e. two very large sets that have three points of disagreement are just as distant under $\Delta$ as two very small sets which also differ by three elements. In Euclidean $n$-space, values $x$ and $y$ are just as distant as $x + t$ and $y + t$ for any translation $t$.
Hence we may in a sense view $\sigma$ and $\Delta$ as measures of absolute distance or difference. $\sigma$ is seen to be insensitive to the location of the origin, while $\Delta$ does not independently consider $|A|$ and $|B|$ when measuring distance.
It is therefore natural to consider forming relative distance measures for these underlying spaces, since such measures may be more effective in the solution of certain problems. Certainly the notion of relative error is an important one in numerical analysis. For sets and Euclidean space, size might naturally be taken to mean cardinality and norm respectively. Thus we are interested in set metrics which measure relative to the empty set, and in an alternative to the Euclidean distance metric which measures relative to the origin.
With no domain assumptions, $\sigma$ and $\Delta$ are unbounded. This may create algorithmic difficulty (or at a minimum inconvenience). Thus converting these metrics to bounded forms may sometimes be useful. The simplest way in general to effect a bound is to compose the metric with another function which acts as a range compander such that the combination still satisfies definition 1. One well known example of such a function that bounds any metric to $[0,1)$ is:

$d \mapsto \frac{d}{1+d}$
There are many such formulas for bounding metrics. It may for example be shown that the sum of metrics is a metric, and beyond this that a metric results from composition with any continuous, differentiable, strictly increasing function $f$ such that $f(0) = 0$ and $f'$ is non-increasing.
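A sketch of this companding construction, using the $d/(1+d)$ form just mentioned (helper names are ours):

```python
import math

def bounded(d):
    """Compose a metric d with t -> t/(1+t), yielding a metric bounded by 1."""
    return lambda x, y: d(x, y) / (1.0 + d(x, y))

euclidean = lambda x, y: math.dist(x, y)
b = bounded(euclidean)
print(b((0.0, 0.0), (3.0, 4.0)))  # 5/(1+5) = 0.8333...
```

Since $t/(1+t)$ is increasing with non-increasing derivative and vanishes at zero, the composed function remains a metric while never reaching 1.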
Since however any such method depends only on the value $d(x,y)$, the bounded form will inherit the absolute or relative behavior of its unbounded parent.
These metric companding methods are in contrast to normalization to which we now turn.
In the sections that follow we will introduce the metric $d$ defined by $d(x,y) = \frac{\|x-y\|}{\|x\|+\|y\|}$ and also the metric $\delta$ defined by $\delta(A,B) = \frac{|A \triangle B|}{|A \cup B|}$. In contrast to the absolute behavior of $\sigma$ and $\Delta$, these functions judge distance with consideration to the relative location of the origin and empty set respectively.
That they are in fact metrics is not obvious and a considerable portion of this paper is devoted to the required proofs.
We will also present a number of alternative forms including mixed metrics that combine absolute and relative behavior.
To normalize the symmetric difference metric we choose to divide its value by the size of the union of its arguments. More formally we have:

$\delta(A,B) = \frac{|A \triangle B|}{|A \cup B|}$ for $A \cup B \neq \emptyset$, and $\delta(\emptyset,\emptyset) = 0$. (1)
We would point out here that not every reasonable attempt at normalization results in a metric. Dividing instead by $|A| + |B|$, an altogether sensible thing to do, fails to satisfy the triangle inequality. This may be seen by letting $A = \{a\}$, $B = \{b\}$, and $C = \{a, b\}$. The $A,C$ and $B,C$ distances are then both $1/3$ while the $A,B$ distance is $1$.
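This counterexample is easy to reproduce (a sketch; names are ours):

```python
def bad_norm(A, B):
    """Symmetric difference divided by |A| + |B| -- NOT a metric."""
    return len(A ^ B) / (len(A) + len(B))

A, B, C = {"a"}, {"b"}, {"a", "b"}
print(bad_norm(A, C), bad_norm(C, B), bad_norm(A, B))
# 1/3 and 1/3, yet the A,B distance is 1 > 2/3: the triangle inequality fails
```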
While it is clear that this function is bounded by $1$, and that its behavior is more relative in that $|A \cup B|$ will affect its value, it must be demonstrated that $\delta$ is in fact a metric.
Proof: We must show that for all finite sets $A$, $B$, and $C$, the following are true:

i) $\delta(A,B) = 0$ iff $A = B$,
ii) $\delta(A,B) = \delta(B,A)$,
iii) $\delta(A,B) \le \delta(A,C) + \delta(C,B)$.

We have i) because $A \triangle B = \emptyset$ only when $A = B$. The second requirement ii) is clear from the commutativity of basic set operations.
Item iii), the triangle inequality, requires more work. Using EQ 1, we must show:

$\frac{|A \triangle B|}{|A \cup B|} \le \frac{|A \triangle C|}{|A \cup C|} + \frac{|C \triangle B|}{|C \cup B|}$ (2)

It is easy to verify that this is true if any of $A \cup B$, $A \cup C$, or $C \cup B$ is empty, so we restrict our attention to the case in which none of the denominators in EQ 2 is zero.
Now the union of sets $A$, $B$, and $C$ may be partitioned (see figure 1) into seven disjoint subsets whose orders we denote: $a = |A - (B \cup C)|$, $b = |B - (A \cup C)|$, $c = |C - (A \cup B)|$, $x = |(A \cap B) - C|$, $y = |(A \cap C) - B|$, $z = |(B \cap C) - A|$, and $w = |A \cap B \cap C|$. For later convenience we also define $s = a + b + c + x + y + z + w = |A \cup B \cup C|$.
With these definitions, EQ 2 becomes with some simplification:

$\frac{a+b+y+z}{a+b+x+y+z+w} \le \frac{a+c+x+z}{a+c+x+y+z+w} + \frac{b+c+x+y}{b+c+x+y+z+w}$ (3)

Now observe that $x$ may be removed without loss of generality from the left denominator, since its removal can only increase that side. It will then suffice to show that:

$\frac{a+b+y+z}{a+b+y+z+w} \le \frac{a+c+x+z}{a+c+x+y+z+w} + \frac{b+c+x+y}{b+c+x+y+z+w}$ (4)
Now replacing each denominator on the right with the larger quantity $s$, and adding fractions, we arrive at:

$\frac{a+b+y+z}{a+b+y+z+w} \le \frac{(a+c+x+z) + (b+c+x+y)}{s}$ (5)

Since each right-hand denominator in EQ 4 is at most $s$, EQ 5 implies EQ 4. The denominator on the right of EQ 5 is equal to the denominator on its left plus $c + x$, while the numerator on the right is equal to the numerator on the left plus $2(c + x)$. Cross-multiplying, EQ 5 therefore reduces to:

$(a+b+y+z)(c+x) \le 2(c+x)(a+b+y+z+w)$ (6)

Starting from the left side of this inequality we have $(a+b+y+z)(c+x) \le (a+b+y+z+w)(c+x) \le 2(c+x)(a+b+y+z+w)$, and we are done.
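Theorem 1 can also be spot-checked numerically. This sketch (names are ours) exercises the triangle inequality of EQ 1 over random small sets:

```python
import itertools
import random

def delta(A, B):
    """Normalized symmetric-difference metric of EQ 1 (0 when both sets are empty)."""
    return len(A ^ B) / len(A | B) if (A or B) else 0.0

random.seed(1)
universe = range(8)
sets = [{e for e in universe if random.random() < 0.5} for _ in range(60)]
ok = all(delta(A, B) <= delta(A, C) + delta(C, B) + 1e-12
         for A, B, C in itertools.product(sets, repeat=3))
print(ok)  # True
```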
Our arguments for sets in $\mathcal{F}$ generalize easily to measure spaces [3,4]. The case of finite non-zero measures is straightforward since these correspond well with finite sets and our earlier arguments. To see this, remember that for any $A$, $B$, and $C$, we earlier expressed the triangle inequality in terms of the orders of their natural decomposition into seven disjoint parts. Our proof of theorem 1 may then be viewed as establishing the truth of an inequality in these seven independent variables, an inequality that holds for any assignment of finite non-negative values, subject only to the restriction that the denominators may not vanish. The same decomposition applies in a measure space, with the inequality relating measure instead of set order, thus motivating the connection between $\delta$ and measure spaces.
As a practical matter the finite setting is the most important case. However the generalization extends fully to measures which assume the value $\infty$, and we will take the time to show this.
For elements of finite non-zero measure, the definition of our metric corresponds to a simple generalization of the notion of set order to that of measure. For elements in general, several cases are necessary to patch together a definition. While these special cases manage to define the metric for all members of the space, only the simplest discrete distance notion applies to elements with zero or infinite measure.
We now state the corollary:
Proof: Everything but the triangle inequality follows immediately from the definition.
Note first that if $\delta(A,B) = 0$, or either one of $\delta(A,C)$, $\delta(C,B)$ is one, then the triangle inequality is trivially established. So in particular the inequality is satisfied if $A = B$.
Further observe that if the measure of $A$ or $B$ is infinite or zero, then $\delta(A,B)$ is zero iff $A = B$, and one otherwise. I.e. the definition reduces to a simple equality test for these cases.
With these points in mind we distinguish three cases:
Otherwise we may assume without loss of generality that
. Here
implies
- and if
, then
unless
.
Otherwise we may assume without loss of generality that
. But then
unless
, in which case
; a
situation covered by the preceding argument.
Proceeding further we can develop a slightly more general form for $\delta$. Consider some $A$, $B$, $C$ and their associated decomposition. Now for some $t \ge 0$, adding $t$ to $w$ adds $t$ to each of the denominators $|A \cup B|$, $|A \cup C|$, and $|C \cup B|$, leaves all seven orders non-negative, and the inequality therefore remains valid. This may be thought of as extending each of $A$, $B$, and $C$ by a single common new region which does not intersect the others.
Equation 3 then becomes:

$\frac{a+b+y+z}{a+b+x+y+z+w+t} \le \frac{a+c+x+z}{a+c+x+y+z+w+t} + \frac{b+c+x+y}{b+c+x+y+z+w+t}$ (7)

This all motivates:

$\delta_b(A,B) = \frac{|A \triangle B|}{|A \cup B| + b}$ (8)

for $b \ge 0$.
And from our discussion above it follows that $\delta_b$ is a metric for every $b \ge 0$. Thus we see that the discontinuity at the empty set may be eliminated by biasing the denominator term $|A \cup B|$. Function $\delta_0$ is just $\delta$, and as $b \rightarrow \infty$, the behavior of $b \cdot \delta_b$ approaches that of simple symmetric difference.
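A sketch of the biased form, assuming, as in equation 8, a denominator of $|A \cup B| + b$ (names are ours):

```python
def delta_b(A, B, b):
    """Biased normalized symmetric difference; b = 0 recovers plain delta."""
    return len(A ^ B) / (len(A | B) + b)

A, B = {1, 2, 3}, {2, 3, 4}
print(delta_b(A, B, 0))                    # 2/4 = 0.5, plain delta
# As b grows, b * delta_b approaches the raw symmetric-difference count:
print(1e9 * delta_b(A, B, 1e9))            # close to 2 = |A symmetric-difference B|
```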
In a later section we will present several examples of metrics that arise from the results above. Before doing so however, we turn our attention back to Euclidean space and our goal of establishing a normalized metric there.
In this section we will demonstrate that Euclidean distance normalized by combined norm, namely $d(x,y) = \frac{\|x-y\|}{\|x\|+\|y\|}$ (with $d(0,0) = 0$), defines an alternative and bounded metric on $\mathbb{R}^n$.
That this definition is in fact a metric will later be stated as a theorem. Its proof will require a series of Lemmas which build insight into the function.
It is worthwhile noting that in $\mathbb{R}^1$ an easy proof exists, and that a very similar metric (a special case of our definition 6) is employed in [5].
We would also observe that this definition fails to generate a metric if the $L_1$ norm is substituted. For if $x = (1,0)$, $y = (0,1)$, and $z = (1,1)$, the triangle inequality does not hold. The $L_\infty$ norm also fails; consider $x = (3,1)$, $y = (-3,1)$, and $z = (0,4)$. For this example, and the $L_\infty$ norm, the triangle inequality becomes:

$1 \le \frac{3}{7} + \frac{3}{7}$

This fails. More generally, our definition does not generate a metric for any of the integral Minkowski metrics $L_p$ where $p \neq 2$.
Our to-be-proven positive result for $L_2$ is thus that much more interesting.
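The failure of the $L_1$ and sup norms is easy to exhibit numerically. The points below are our own illustrative counterexamples (names are ours):

```python
def d_p(x, y, p):
    """||x - y||_p / (||x||_p + ||y||_p); p = float('inf') gives the sup norm."""
    if p == float("inf"):
        norm = lambda v: max(abs(t) for t in v)
    else:
        norm = lambda v: sum(abs(t) ** p for t in v) ** (1.0 / p)
    diff = tuple(a - b for a, b in zip(x, y))
    return norm(diff) / (norm(x) + norm(y))

# L1: x=(1,0), y=(0,1), z=(1,1) gives d(x,y) = 1 > 1/3 + 1/3
x, y, z = (1, 0), (0, 1), (1, 1)
print(d_p(x, y, 1), d_p(x, z, 1) + d_p(z, y, 1))
# sup norm: x=(3,1), y=(-3,1), z=(0,4) gives d(x,y) = 1 > 3/7 + 3/7
x, y, z = (3, 1), (-3, 1), (0, 4)
inf = float("inf")
print(d_p(x, y, inf), d_p(x, z, inf) + d_p(z, y, inf))
```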
Proof:
Let ,
,
,
be four distinct points in Euclidean space.
Then
Proof: Let ,
,
,
,
,
.
Then we need to prove that
Case 1: . Then we have
Case 2:
Then we have
from which the result follows immediately as well.
Case 3: Observe that the theorem is true for four points $o$, $x$, $y$, $z$ if and only if it is true for $o$ and the points $x^*$, $y^*$, $z^*$ obtained by inverting $x$, $y$, and $z$ in the sphere of radius 1 centered at $o$. (Recall that $x$ and $x^*$ are on the same ray from $o$ and satisfy $\|x-o\| \cdot \|x^*-o\| = 1$. The key fact that is needed is $\|x^* - y^*\| = \frac{\|x-y\|}{\|x-o\|\,\|y-o\|}$.) This reduces case 3 to case 1.
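The inversion identity underpinning case 3 can be checked numerically. In this sketch (helper names are ours) points are inverted in the unit sphere about the origin:

```python
import math
import random

def invert(p):
    """Invert p in the unit sphere centered at the origin: p* = p / ||p||^2."""
    n2 = sum(t * t for t in p)
    return tuple(t / n2 for t in p)

random.seed(2)
for _ in range(1000):
    x = tuple(random.uniform(1, 5) for _ in range(3))
    y = tuple(random.uniform(1, 5) for _ in range(3))
    lhs = math.dist(invert(x), invert(y))
    rhs = math.dist(x, y) / (math.hypot(*x) * math.hypot(*y))
    assert abs(lhs - rhs) < 1e-9   # ||x* - y*|| = ||x - y|| / (||x|| ||y||)
print("ok")
```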
If we now label the origin as $o$, we may re-state the theorem as a corollary in purely geometrical terms:
So despite a singularity at the origin, $d$ imposes a metric on $\mathbb{R}^n$. This amounts to a fully relative distance measure which clearly cannot cope well with zero. By contrast, standard Euclidean distance may be viewed as a fully absolute distance measure. It is easy to imagine a continuum of intermediate metrics which are partially relative.
Furthermore, some applications, while requiring a modicum of relative behavior, may also require better behavior at the origin than $d$ provides.
All of this motivates the following definition:
$d_b(x,y) = \frac{\|x-y\|}{\|x\|+\|y\|+b}$ (12)
The Normalized Euclidean Distance of definition 9 corresponds to $b = 0$. Note that definition 6's special condition at $x = y = 0$ is only required when $b = 0$, for when $b > 0$, the denominator cannot vanish.
As $b \rightarrow \infty$, $b \cdot d_b$ merely corresponds to standard Euclidean distance.
.
In between these two extremes lies a family of hybrid functions that combine the relative behavior of $d$ with the absolute behavior of $\sigma$.
To establish that $d_b$ in general imposes a metric, we will require one lemma concerning the behavior of quasi-quadrilaterals in $\mathbb{R}^n$.
I.e. the sum of the products of the lengths of opposite sides of an arbitrary quadrilateral is no smaller than the product of its diagonals.
Proof: We have only to argue that we may reduce the setting to $\mathbb{R}^2$, since there the lemma is recognized as a standard theorem of inversive geometry ([6] Theorem 5.12), a direct generalization of Ptolemy's theorem.
Since any four points may be embedded in $\mathbb{R}^3$, we may assume our setting is at most this.
Within $\mathbb{R}^3$, choose coordinates so that three of the points lie on the $XY$ plane, and the fourth is located above (or on) it.
Now translate so that is the origin.
Further choose rotation and orientation so that line is
parallel to and above the X-axis with
left of
. I.e.
and
.
Now consider spheres and
. The
intersection of these spheres is a circle perpendicular to
the XY plane and parallel to the Y-axis, having center along
.
Any choice for the remaining point along this circle will leave everything in our inequality unchanged but for its distance to the origin.
Notice that the point on this circle farthest from the origin is just its intersection with the $XY$ plane.
Relocating the moving point there increases its distance to the origin, and with it the inequality's right side, while leaving the left unchanged.
Thus it is clear that the inequality is true for points in $\mathbb{R}^3$ iff it is true for the corresponding points in $\mathbb{R}^2$.
We are now ready to establish that $d_b$ is a metric:
Proof: The first and second metric conditions are straightforward. We therefore turn to the triangle inequality.
If $b = \infty$, then by definition $d_b$ is nothing more than scaled Euclidean distance. If $b = 0$, then by definition $d_b$ is just scaled normalized Euclidean distance.
So the only interesting case is that in which $0 < b < \infty$. Here, we may assume without loss of generality that $b = 1$, since other cases may be reduced to this one by scaling each point in the triangle inequality by $1/b$ and then reducing.
It remains for us to show that:

$\frac{\|x-y\|}{\|x\|+\|y\|+1} \le \frac{\|x-z\|}{\|x\|+\|z\|+1} + \frac{\|z-y\|}{\|z\|+\|y\|+1}$
Now forming a common denominator, discarding it, and then rearranging, we have:
We now separately consider matched lines from either side of
this expression. It turns out that the inequality holds for
each such pairing, thus establishing the overall inequality.
The inequality for the first line is just scaled ordinary Euclidean distance. Then, letting $o$ denote the origin, the inequality holds for the second line due to corollary 3, and for the third due to lemma 1.
To illustrate the wide variety of metrics that can be constructed with our earlier results, we start by returning to metrics on finite sets.
We first observe that finite sets may be regarded as binary vectors in Euclidean space. With this observation theorem 2 gives us that the following is a metric:
$d(A,B) = \frac{\sqrt{|A \triangle B|}}{\sqrt{|A|} + \sqrt{|B|}}$ (13)
It is interesting to note that without the square root operations, this definition corresponds to normalization of $|A \triangle B|$ by $|A| + |B|$, which was earlier shown not to be a metric.
Again thinking of sets as binary vectors or variables, the metric $\delta$ is recognized as a well known measure of association referred to as the Jaccard coefficient [7], or S-coefficient:

$\delta = \frac{b+c}{a+b+c}$

where $a$, $b$, $c$, and $d$ refer to the standard entries in a 2-by-2 contingency (or association) table for binary variables.
This very basic comparison function may be expressed in many forms. Expressing essentially the same thing as a similarity measure corresponds to the Tanimoto Coefficient [8], operating on binary vectors:

$T(x,y) = \frac{x \cdot y}{x \cdot x + y \cdot y - x \cdot y}$
In the language of computer programming, $\delta$ is just the count of 1's in the exclusive or of bit vectors $x$ and $y$, divided by the count of 1's in their logical or.
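That computational reading can be sketched directly on integer bitmasks (names are ours):

```python
def tanimoto_distance(u, v):
    """Count of 1s in (u XOR v) divided by count of 1s in (u OR v), on int bitmasks."""
    return bin(u ^ v).count("1") / bin(u | v).count("1") if (u | v) else 0.0

# {1,3} vs {2,3} encoded as bitmasks 0b101 and 0b110:
print(tanimoto_distance(0b101, 0b110))  # 2/3 = 0.6666...
```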
In [9], the measure is referred to as association. While the authors do not state that these various forms fail to satisfy the triangle inequality, they describe them along with other non-metric measures.
Other sources such as [10] mention explicitly the set function $\frac{|A \triangle B|}{|A \cup B|}$, but again fail to note that it satisfies the triangle inequality.
The Bray-Curtis measure [11] is written:

$\frac{\sum_i |x_i - y_i|}{\sum_i (x_i + y_i)}$

and is recognized as our $d$ but under the $L_1$ norm. Now we have seen that this does not form a metric, but since $\delta$ is a metric we may write:

$\frac{\sum_i |x_i - y_i|}{\sum_i \max(x_i, y_i)}$

with the knowledge that this does form a metric.
Also in [11], the reader will recognize the Canberra Metric as $\sum_i d(x_i, y_i)$ taken coordinate-wise in $\mathbb{R}^1$, making $d$ a considerable generalization of this apparently well known metric.
In familiar settings such as $\mathbb{R}^2$ and $\mathbb{R}^3$, with measure defined as length/area/volume, the metric $\delta$ corresponds to the amount of non-common length/area/volume, normalized by the total length/area/volume.
Here are some examples of metrics derived from corollary 1 that are defined for $n$-tuples of non-negative values. For brevity's sake, we will not show the $x = y$ case, for which any metric must evaluate to zero.
$\frac{\sum_i |x_i - y_i|}{\sum_i \max(x_i, y_i)}$

This may be viewed as a repaired Bray-Curtis measure in that $\sum_i \max(x_i, y_i) = \frac{1}{2}\left[\sum_i (x_i + y_i) + \sum_i |x_i - y_i|\right]$. I.e. the denominator has an extra term which suffices to form a metric.
which resembles somewhat our first example but behaves quite differently.
An example in which a single component greatly influences the resulting value.
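The first of these examples, the repaired Bray-Curtis form, can be sketched and contrasted with the classical measure (names are ours):

```python
def bray_curtis(x, y):
    """Classical Bray-Curtis: sum |xi - yi| / sum (xi + yi). Not a metric."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))

def repaired(x, y):
    """Normalize by sum max(xi, yi) instead; the extra denominator term restores
    the triangle inequality."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(max(a, b) for a, b in zip(x, y))

x, y, z = (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)
print(bray_curtis(x, y), bray_curtis(x, z) + bray_curtis(z, y))  # 1.0 > 2/3: fails
print(repaired(x, y), repaired(x, z) + repaired(z, y))           # 1.0 <= 1.0: holds
```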
Now corollary 1 may also be applied to function spaces. Let $C^+[0,1]$ denote the space of non-negative continuous functions on $[0,1]$. Given $f, g \in C^+[0,1]$, the metric defined by:

$\delta(f,g) = \frac{\int_0^1 |f - g|\,dx}{\int_0^1 \max(f,g)\,dx}$

may be thought of as an extension of our first positive $n$-tuple metric above, as $n \rightarrow \infty$, or in a Lebesgue/Measure-Space context.
Extending our second $n$-tuple metric to Riemann integrable functions allows us to write:
Note that in some cases, and with the proper assumptions, the continuity and range of integration assumptions made above may be relaxed.
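A numerical sketch of the function-space metric above, approximating the integrals with a midpoint Riemann sum (the helper name and the sample functions are ours):

```python
def delta_fn(f, g, a=0.0, b=1.0, n=10000):
    """delta for non-negative functions: integral of |f - g| over integral of
    max(f, g), approximated by a midpoint Riemann sum on [a, b]."""
    h = (b - a) / n
    xs = [a + (i + 0.5) * h for i in range(n)]
    num = sum(abs(f(x) - g(x)) for x in xs) * h
    den = sum(max(f(x), g(x)) for x in xs) * h
    return num / den if den else 0.0

# Distance between f(x) = x and g(x) = 1 - x on [0, 1]:
print(delta_fn(lambda x: x, lambda x: 1.0 - x))  # (1/2) / (3/4) = 0.6666...
```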
Now for the sake of brevity, we have focused mainly on examples using $\delta$ and $\delta_b$. Similar constructions apply to $d$ and $d_b$ for both $n$-tuples and function spaces.
It is worthwhile noting (however obvious) that $d$ may also be viewed as an alternate metric for the complex plane, with norm corresponding to complex absolute value.
We close this section by commenting that $d$, used as a measure of relative error in numerical analysis, may prove interesting. Given a sequence $x_1, x_2, \ldots$ of approximate solutions, we might define $e_i = d(x_i, x_{i+1})$, and the triangle inequality would then give us that: $d(x_i, x_{i+k}) \le e_i + e_{i+1} + \cdots + e_{i+k-1}$.
We have developed normalized forms for two important metrics and shown each to be an endpoint of a continuum of metric functions ranging to the original unnormalized forms.
In a combinatorial setting, $\delta$ may be viewed as a prototype for constructing a normalized bounded metric for many applications, for every injection of a pattern space into the space of finite sets induces a metric on the original pattern space.
Furthermore, evaluating this metric is possible given only $|A \cap B|$ and the individual orders $|A|$ and $|B|$, since $|A \cup B| = |A| + |B| - |A \cap B|$ and $|A \triangle B| = |A| + |B| - 2|A \cap B|$.
In practice, these values may sometimes be directly computed
without explicitly forming injective images in set space.
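A sketch of this computation from the three orders alone (names are ours):

```python
def delta_from_orders(a, b, i):
    """delta(A, B) from |A| = a, |B| = b, |A intersect B| = i alone:
    |A symmetric-difference B| = a + b - 2i and |A union B| = a + b - i."""
    union = a + b - i
    return (a + b - 2 * i) / union if union else 0.0

A, B = {1, 2, 3, 4}, {3, 4, 5}
print(delta_from_orders(len(A), len(B), len(A & B)))  # 3/5 = 0.6
print(len(A ^ B) / len(A | B))                        # same value directly
```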
In [12] a VLSI chip is described which computes these three orders for a particular mapping of finite symbol strings into set space, and combines them to define a measure of string similarity.
In more Euclidean settings, the family $d_b$ may be used where normalized behavior is required and the metric triangle inequality is of value, e.g. in finding nearest neighbors, or when bounding sequential sums of distances.
Beyond the existence and basic properties established in this paper, one may examine a number of topics including the notion of collinearity in these normalized spaces, the nature of the geodesics corresponding to $d$ and $\delta$, and questions of statistical behavior.