Peter N. Yianilos1
12/5/91 Rev 2/27/92 Rev 7/7/2002
The first of these is easily established and generalizes to measure spaces.
The second applies to vectors in and is given by . That this is a metric is more difficult to demonstrate and is true for Euclidian distance (the norm) but for no other integral Minkowski metric. The short and elegant proof we give is due to David Robbins and Marshall Buck [1].
We also explore a number of variations.
Keywords: Metric Space, Distance Function, Similarity Function/Coefficient, Euclidian Distance, Association, Clustering, Vector Quantization, Pattern Recognition, Statistical Methods.
The notion of Metric Space [2], is a cornerstone of mathematical topology and analysis, and is often employed in pattern recognition and clustering systems.
Let be a set and a non-negative real valued function on satisfying for :
Then is a metric space. Alternatively is said to be a distance function and impose a metric on .
The third item in this definition is usually referred to as the triangle inequality. It is worthwhile noting that not all approaches to pattern recognition and clustering impose this requirement. The term similarity measure or dissimilarity measure, or some variation thereof, is then used to describe the comparison function.
Two well known examples of metric spaces are Euclidian -space with:
and the symmetric difference metric on members of , the set consisting of all finite sets:
is of course geometrical length, and simply counts the number of elements on which and disagree.
Both of these metrics are in general unbounded, and their value is independent of the size of and . I.e. two very large sets that have three points of disagreement, are just as distant under as two very small sets which also differ by three elements. In Euclidian -space, values and are just as distant as and .
Hence we may in a sense view and as measures of absolute distance or difference. is seen to be insensitive to while does not independently consider and when measuring distance.
It is therefore natural to consider forming relative distance measures for these underlying spaces, since such measures may be more effective in the solution of certain problems. Certainly the notion of relative error is an important one in numerical analysis. For sets, and Euclidian space, size might naturally be taken to mean cardinality, and norm respectively. Thus we are interested in set metrics which measure relative to the empty set, and in an alternative to the Euclidian distance metric which measures relative to the origin.
With no domain assumptions, and are unbounded. This may create algorithmic difficulty (or at a minimum inconvenience). Thus converting these metrics to bounded forms may sometimes be useful.
The simplest way in general to effect a bound is to compose the metric with another function which acts as a range compander such that the combination still satisfies definition 1. One well known example of such a function that bounds any metric to is:
There are many such formulas for bounding metrics. It may for example be shown that the sum of metrics is a metric, and beyond this that a metric results from composition with any continuous, differentiable, strictly increasing function such that , and is non-increasing. 2
Since however any such method depends only on the value , the bounded form will inherit the absolute or relative behavior of its unbounded parent.
These metric companding methods are in contrast to normalization to which we now turn.
In the sections that follow we will introduce the metric defined by and also metric defined 3 by .
In contrast to the absolute behavior of and these functions judge distance with consideration to the relative location of the origin and empty set respectively.
That they are in fact metrics is not obvious and a considerable portion of this paper is devoted to the required proofs.
We will also present a number of alternative forms including mixed metrics that combine absolute and relative behavior.
To normalize the symmetric difference metric we choose to divide its value by the size of the union of its arguments. More formally we have:
We would point out here that not every reasonable attempt at normalization results in a metric. Dividing instead by , an altogether sensible thing to do, fails to satisfy the triangle inequality. This may be seen by letting , , and . The and distances are then both while the distance is . 4
While it is clear that this function is bounded by , and that its behavior is more relative in that will affect its value, it must be demonstrated that is in fact a metric.
Proof: We must show that for all finite sets , , and , the following are true:
We have i) because only when . The second requirement ii) is clear from the commutivity of basic set operations.
Item iii), the triangle inequality, requires more work. Using EQ 1, we must show:
It is easy to verify that this is true if , , or , So we restrict our attention to the case in which none of the denominators in EQ 2 is zero.
Now the union of sets A,B, and C, may be partitioned (see figure 1) into seven disjoint subsets whose orders we denote:
For later convenience we also define .
With these definitions, EQ 2 becomes with some simplification:
Now observe that may be removed without loss of generality from the left denominator, since its removal can only increase that side. It will then suffice to show that:
Now replacing with and adding fractions, we arrive at:
The denominator of EQ 4 is equal to the denominator in EQ 5 plus , and is therefore no smaller.
It therefore suffices to show the that the inequality is true for the numerators alone, i.e. that:
Starting from the left side of this inequality we have:
and we are done.
Our arguments for sets in generalize easily to measure spaces[3,4]. The case of finite non-zero measures is straightforward since these correspond well with finite sets and our earlier arguments.
To see this, remember that for any , we earlier expressed the triangle inequality in terms of the orders of their natural decomposition into seven disjoint parts. Our proof of theorem 1 may then be viewed as establishing the truth of an inequality in these seven independent variables - an inequality that holds for any assignment of finite non-negative values; subject only to the restriction that the denominators may not vanish. The same decomposition applies if , with the inequality relating measure instead of set order, thus motivating the connection between and measure spaces.
As a practical matter the finite setting is the most important case. However the generalization extends fully to measures which assume value , and we will take the time to show this.
For elements of finite non-zero measure, definition of our metric will correspond to simple generalization of the notion of set order, to that of measure. For elements in general, several cases are necessary to patch together a definition. While these special cases manage to define the metric for all members of the space, only the simplest discrete distance notion applies to elements with zero or infinite measure.
We now state the corollary:
Proof: Everything but the triangle inequality follows immediately from the definition.
Note first that if or either one of , is one, then the triangle inequality is trivially established. So in particular the inequality is satisfied if .
Further observe that if or is infinite or zero, then iff , and one otherwise. I.e. the definition reduces to a simple equality test for these cases.
With these points in mind we distinguish three cases:
Otherwise we may assume without loss of generality that . Here implies - and if , then unless .
Otherwise we may assume without loss of generality that . But then unless , in which case ; a situation covered by the preceding argument.
Proceeding further we can develop a slightly more general form for . Consider some and their associated decomposition. Now for some , adding to each of , , and , leaves them non-negative and the inequality therefore valid.
This may be thought of as extending each of , , and , by a new region which does not intersect the others.
Equation 3 then becomes:
(7) |
This all motivates:
(8) |
for .
And from our discussion above it follows that:
Thus we see that the discontinuity when may be eliminated 5 by biasing the denominator term . Function is just , and as , the behavior of approaches that of simple symmetric difference.
In a later section we will present several examples of metrics that arise from the results above. Before doing so however, we turn our attention back to Euclidian space and our goal of establishing a normalized metric there.
In this section we will demonstrate that Euclidian distance normalized by combined norm, defines an alternative and bounded metric on .
That this definition is in fact a metric will later be stated as a theorem. Its proof will require a series of Lemmas which build insight into the function.
It is worthwhile noting that in , an easy proof exists and that a very similar metric (a special case of our definition 6), is employed in [5].
We would also observe that this definition fails to generate a metric if the norm is substituted. 6 For if , , and , the triangle inequality does not hold. The norm also fails; consider , , and . For this example, and the norm, the triangle inequality becomes:
This fails for . Therefore, we have that our definition does not generate a metric for any of the integral Minkowski metrics where .
Our to-be-proven positive result for is thus that much more interesting.
Proof:
Let , , , be four distinct points in Euclidean space.
Then
proof: Let , , , , , .
Then we need to prove that
Case 1: . Then we have
Case 2: Then we have
(10) | |||
(11) |
from which the result follows immediately as well.
Case 3: . Observe that the theorem is true for four points , , , if and only if it is true for and the points , , obtained by inverting , and in the sphere of radius 1 centered at . (Recall that and are on the same ray from and satisfy . The key fact that is needed is .) This reduces case 3 to case 1.
If we now label the origin as , we may re-state the theorem as a corollary in purely geometrical terms:
So despite a singularity at the origin, imposes a metric on . This amounts to a fully relative distance measure which clearly cannot well cope with zero. By contrast, standard Euclidian distance may be viewed as a fully absolute distance measure. It is easy to imagine a continuum of intermediate metrics which are partially relative.
Furthermore, some applications, while requiring a modicum of relative behavior, may also require better behavior at the origin than provides.
All of this motivates the following definition:
(12) |
The Normalized Euclidian Distance of definition 9 corresponds to . Note that the definition 6's special condition at is only required when , for when , the denominator cannot vanish.
The case , merely corresponds to standard Euclidian distance scaled by .
In between these two extremes lies a family of hybrid functions that combine the relative behavior of with the absolute behavior of .
To establish that is in general a metric space, we will require one lemma concerning the behavior of quasi-quadrilaterals in .
I.e. the sum of the product of the lengths of opposite sides of an arbitrary quadrilateral, is no smaller than the product of its diagonals.
Proof: We have only to argue that we may reduce setting to since there, the lemma is recognized as a standard theorem of inversive geometry ([6] Theorem 5.12) - a direct generalization of Ptolemy's theorem.
Since any four points may be embedded in , we may assume our setting is at most this.
Within , choose coordinates so that lie on the plane, and is located above (or on) it. I.e. and .
Now translate so that is the origin.
Further choose rotation and orientation so that line is parallel to and above the X-axis with left of . I.e. and .
Now consider spheres and . The intersection of these spheres is a circle perpendicular to the XY plane and parallel to the Y-axis, having center along .
Any choice for along on this circle it will leave everything unchanged in our inequality but for .
Notice that the point on this circle farthest from the origin, is just its intersection with the XY plane.
Denoting this point and relocating there, increases and and the inequality's right side, while leaving the left unchanged.
Thus it is clear that the inequality is true for points in iff it is true for in .
We are now ready to establish that is a metric:
Proof: The first and second metric conditions are straightforward. We therefore turn to the triangle inequality.
If , then by definition and is nothing more than scaled Euclidian distance. If , then by definition and is just scaled normalized Euclidian distance.
So the only interesting case is that in which . Here, we may assume without loss of generality that since other cases may be reduced to this one by multiplying each term in the triangle inequality by and then reducing.
Now defining for convenience: , it remains for us to show that:
Now forming a common denominator, discarding it, and then rearranging, we have:
We now separately consider matched lines from either side of this expression. It turns out that the inequality holds for each such pairing, thus establishing the overall inequality. The inequality for the first line is just scaled ordinary Euclidian distance. Then, letting denote the origin, the inequality holds for the second line due to corollary 3, and for the third 7 due to lemma 1.
To illustrate the wide variety of metrics that can be constructed with our earlier results, we start by returning to metrics on .
We first observe that finite sets may be regarded as binary vectors in Euclidian space. With this observation theorem 2 gives us that the following is a metric:
(13) |
It is interesting to note that without the square root operations, this definition corresponds to normalization of by , which was earlier shown not to be a metric.
Again thinking of sets as binary vectors or variables, metric is recognized as a well known measure of association referred to as the Jaccard coefficient [7], or S-coefficient:
Where refer to the standard entries in a 2-by-2 contingency (or association) table for binary variables. This very basic comparison function may be expressed in many forms. Expressing essentially the same thing as a similarity measure corresponds to the Tanimoto Coefficient, [8], operating on binary vectors:
In the language of computer programming this is just the count of 1's in the exclusive or of bit vectors and , divided by the count of 1's in their logical or.
In [9], the measure is referred to as association. While the authors do not state that these various forms fail to satisfy the triangle inequality, they describe them along with other non-metric measures.
Other sources such as [10] mention explicitly the set function , but again fail to note that it satisfies the triangle inequality.
The Bray-Curtis measure[11], is written:
and is recognized as our but under the norm. Now we have seen that this does not form a metric, but since is a metric we may write:
with the knowledge that this does form a metric.
Also in [11], the reader will recognize the Canberra Metric as in , making a considerable generalization of this apparently well known metric.
In familiar settings such as and , with measure defined as length/ area/ volume, the metric corresponds to the amount of non-common length/ area/ volume, normalized by the total length/ area/ volume.
Here are some examples of metrics derived from corollary 1, that are defined for -tuples of non-negative values. For brevity's sake, we will not show the case for which any metric must evaluate to zero.
This may be viewed as a repaired Bray-Curtis measure in that . I.e. the denominator has an extra term which suffices to form a metric.
which resembles somewhat our first example but behaves quite differently.
An example in which a single component greatly influences the resulting value.
Now corollary 1 may also be applied to function spaces. Let denote the space of non-negative continuous functions on . Given , the metric defined by:
may be thought of as an extension of our first positive -tuple metric above, as , or in a Lebesgue/Measure-Space context.
Extending our second -tuple metric to Riemann integrable functions allows us to write:
Note that in some cases, and with the proper assumptions, the continuity and range of integration assumptions made above may be relaxed.
Now for the sake of brevity, we have focused mainly on examples using / . Similar constructions apply to for both -tuples and function spaces.
It is worthwhile noting (however obvious) that may also be viewed as an alternate metric for the complex plane with norm corresponding to complex absolute value.
We close this section by commenting that used as a measure of relative error in numerical analysis, may prove interesting. Given a sequence of approximate solutions, we might define and the triangle inequality would then give us that: .
We have developed normalized forms for two important metrics and shown each to be an endpoint of a continuum of metric functions ranging to the original unnormalized forms.
In a combinatorial setting, may be viewed as a prototype for constructing a normalized bounded metric for many applications - for every injection of a pattern space into induces a metric on the original pattern space. 8
Furthermore, evaluating this metric is possible given only and the individual orders , and since .
In practice, these values may sometimes be directly computed without explicitly forming injective images in set space. In [12] a VLSI chip is described which in computes these three orders for a particular mapping of finite symbol strings into , and combines them to define a measure of string similarity.
In more Euclidian settings, the family may be used where normalized behavior is required and the metric triangle inequality is of value, e.g. in finding nearest neighbors, or when bounding sequential sums of distances.
Beyond the existence and basic properties established in this paper, one may examine a number of topics including the notion of colinearity in these normalized spaces, the nature of the geodesics corresponding to , and questions of statistical behavior.
This document was generated using the LaTeX2HTML translator Version 2K.1beta (1.47)
Copyright © 1993, 1994, 1995, 1996,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999,
Ross Moore,
Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -split 0 -image_type gif -nonext_page_in_navigation -noprevious_page_in_navigation -up_url ../main.html -up_title 'Publication homepage' -numbered_footnotes nmet.tex
The translation was initiated by Peter N. Yianilos on 2002-07-07