# The triangle inequality

I have a Stata program, `metricp`, which takes a distance matrix (square, symmetric, zero diagonal) and tests for the triangle inequality (that for all `i`, `j`, there is no `k` such that `d[i,j] > d[i,k] + d[j,k]`).

I needed something similar today in R, and found Matthew Vavrek’s fossil package, which includes a `tri.ineq()` function. However, it turned out to be very slow on the large matrix I threw at it, so I decided to speed it up with some of the techniques in `metricp.ado`.

This is the original tri.ineq():

````tri.ineq` <-
function(dist)
{
mat<-as.matrix(dist)
n<-dim(mat)[1]
ineq <- 0
for (i in 1:(n-2)) {
for (j in (i+1):(n-1)) {
for (k in (j+1):n) {
sds<-c(mat[j,i], mat[k,i], mat[k,j])
lng<-max(sds)
if (lng>(sum(sds)-lng)) ineq<-ineq+1
}
}
}
return(ineq==0)
}
```

It’s sensible code, but it uses three nested loops (for each `i` and `j`, look for an infringing `k`) and it doesn’t stop once it has found one. My code instead takes each i and j and adds the vector of distances to all ks and takes its minimum, and stops once it has found a minimum number of infringing ks.

My version:

````tri.ineq2` <-
function(dist)
{
mat<-as.matrix(dist)
n<-dim(mat)[1]
ineq <- 0
for (i in 1:(n-1)) {
for (j in (i+1):n) {
if (mat[i,j] > min(mat[i,]+mat[j,])) {
ineq <- ineq+1
}
if (ineq>=10) break
}
if (ineq>=10) break
}
return(ineq==0)
}
```

The term `mat[i,]+mat[j,]` represents a vector if distances i to k to j for all ks, so taking its minimum gets the distance for the “worst” k, if any infringes the triangle inequality. Breaking on `ineq>=10` avoids the waste of time of checking every infringement; it could make sense to break after a single infringement (my `metricp.ado` command offers options for reporting the infringements).

Testing with a 1151 times 1151 matrix shows its about 20 times as fast (when the matrix is metric):

```> system.time(tri.ineq(jm3))
user  system elapsed
545.120   0.000 545.182
> source("/tmp/tri.ineq2.R")
> system.time(tri.ineq2(jm3))
user  system elapsed
24.248   0.012  24.264
```

I left a note on Matthew’s blog but it didn’t format properly, hence this short note.

`metricp.ado` is part of SADI. Do `ssc install sadi` in Stata to install it.

## One thought on “The triangle inequality”

1. Gonçalo says:

Really interesting. I was searching for ways to speed up a script I am using, and this might do the trick. At least partially. Thank you!

This site uses Akismet to reduce spam. Learn how your comment data is processed.