Fourth ACM-SIAM Symposium on Discrete Algorithms 1993

Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces

Peter N. Yianilos

Abstract: We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high.

Also relevant are high-dimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space.

The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log n) time, and search is under certain circumstances and in the limit, O(log n) expected time.

The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kd-tree performance is compared.

Keywords: Metric Space, Nearest Neighbor, Computational Geometry, Associative Memory, Randomized Methods, Pattern Recognition, Clustering.


This manuscript was completed during June 1992 and issued as an NEC Research Institute Technical Report.