Distance between points
The distance between two points can have several different definitions depending on the specific situation, although without any further specification it usually refers to the length of the segment connecting the two points (also called Euclidean distance).
Euclidean Distance
The Euclidean distance is generally defined as the lenght of the segment connecting two points.
Formula
The Euclidean distance can be easily calculated when the coordinates of the two points are known. For example, in the XY plane, the formulas would be the following:
The formula above can be generalized for vector spaces of any dimension, by taking the square root of the sum of the squares of differences in each dimension:
Algorithm
The algorithm for the Euclidean distance can be derived directly from the defining formula above:
/* Assume that P and Q are N-dimensional arrays */
distance(P, Q) {
N = dim(P)
square_sum = 0.0 /* initialize running sum */
for dimension i from 1 to N { /* iterate over all dimensions */
square_sum = square_sum + (P[i] - Q[i])^2 /* sum squares of differences */
}
distance = sqrt(square_sum); /* distance is the square root of the sum of squared differences */
return distance;
}
Manhattan Distance
Among the many possible definitions of distance, another one commonly found in computer science is the Manhattan distance. This is typically a distance used when movements are constrained to happen over a grid:
Formula
The formula below applies when calculating the Manhattan distance over an N-dimensional grid where adjacent nodes are assumed to have distance 1:
Algorithm
The algorithm can be implemented as a direct derivation of the formula:
/* Assume that P and Q are N-dimensional arrays */
manhattan_distance(P, Q) {
N = dim(P)
distance = 0.0 /* initialize running sum */
for dimension i from 1 to N { /* iterate over all dimensions */
distance = distance + abs(P[i] - Q[i]) /* sum squares of differences */
}
return distance;
}
Minkowski Distance
Minkowski Distance is a metric in a normed vector space which can be considered as a generalization of both the Euclidean distance and the Manhattan distance. It is named after the German mathematician Hermann Minkowski.
Formula
The formula below applies when calculating the Minkowski distance over an N-dimensional grid:
$P = (p_1, p_2, ..., p_n)\$ $Q = (q_1, q_2, ..., q_n)\$ $d(P, Q) = \left( \sum_{i=1}^{n} |p_i - q_i|^m \right) ^ {\frac{1}{m}}$
Algorithm
/* Assume that P and Q are N-dimensional arrays */
manhattan_distance(P, Q) {
N = dim(P)
distance = 0.0 /* initialize running sum */
for dimension i from 1 to N { /* iterate over all dimensions */
distance = distance + abs(P[i] - Q[i])^m /* sum the processed differences */
}
return distance^(1/m);
}
Performance
Both Euclidean and Manhattan distance algorithms for each respective distance definition are linear (O(n)) with respect to the number of dimensions with which each point is represented. The Minkowski distance algorithm's complexity is around O(nm) for n dimensional vector and the factor power m.
Implementations
Language | Link | |
---|---|---|
Helpful Links
- Euclidean distance
- Manhattan distance
- Minkowski distance
- Norm a generalization of the concept of distance
- Lp space a generalization of a class of norms