{"id":456,"date":"2018-01-04T21:44:49","date_gmt":"2018-01-04T21:44:49","guid":{"rendered":"http:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=456"},"modified":"2018-01-04T22:08:49","modified_gmt":"2018-01-04T22:08:49","slug":"the-triangle-inequality","status":"publish","type":"post","link":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=456","title":{"rendered":"The triangle inequality"},"content":{"rendered":"<div id=\"text-1\" class=\"outline-text-2\">\n<p>I have a Stata program, <code>metricp<\/code>, which takes a distance matrix (square, symmetric, zero diagonal) and tests for the triangle inequality (that for all <code>i<\/code>, <code>j<\/code>, there is no <code>k<\/code> such that <code>d[i,j] &gt; d[i,k] + d[j,k]<\/code>).<\/p>\n<p>I needed something similar today in R, and found Matthew Vavrek&#8217;s <a href=\"https:\/\/cran.r-project.org\/web\/packages\/fossil\/index.html\">fossil package<\/a>, which includes a <code>tri.ineq()<\/code> function. However, it turned out to be very slow on the large matrix I threw at it, so I decided to speed it up with some of the techniques in <code>metricp.ado<\/code>.<\/p>\n<p><!--more--><\/p>\n<p>This is the original tri.ineq():<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-R\">`tri.ineq` &lt;-\r\nfunction(dist) \r\n{\r\n    mat&lt;-as.matrix(dist)\r\n    n&lt;-dim(mat)[1]\r\n    ineq &lt;- 0\r\n    for (i in 1:(n-2)) {\r\n        for (j in (i+1):(n-1)) {\r\n            for (k in (j+1):n) {\r\n                sds&lt;-c(mat[j,i], mat[k,i], mat[k,j])\r\n                lng&lt;-max(sds)\r\n                if (lng&gt;(sum(sds)-lng)) ineq&lt;-ineq+1\r\n            }\r\n        }\r\n    }\r\n    return(ineq==0)\r\n}\r\n<\/pre>\n<\/div>\n<p>It&#8217;s sensible code, but it uses three nested loops (for each <code>i<\/code> and <code>j<\/code>, look for an infringing <code>k<\/code>) and it doesn&#8217;t stop once it has found one. My code instead takes each i and j and adds the vector of distances to all ks and takes its minimum, and stops once it has found a minimum number of infringing ks.<\/p>\n<p>My version:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-R\">`tri.ineq2` &lt;-\r\nfunction(dist) \r\n{\r\n    mat&lt;-as.matrix(dist)\r\n    n&lt;-dim(mat)[1]\r\n    ineq &lt;- 0\r\n    for (i in 1:(n-1)) {\r\n        for (j in (i+1):n) {\r\n            if (mat[i,j] &gt; min(mat[i,]+mat[j,])) {\r\n                ineq &lt;- ineq+1\r\n            }\r\n            if (ineq&gt;=10) break\r\n        } \r\n        if (ineq&gt;=10) break\r\n    }\r\n    return(ineq==0)\r\n}\r\n<\/pre>\n<\/div>\n<p>The term <code>mat[i,]+mat[j,]<\/code> represents a vector if distances i to k to j for all ks, so taking its minimum gets the distance for the &#8220;worst&#8221; k, if any infringes the triangle inequality. Breaking on <code>ineq&gt;=10<\/code> avoids the waste of time of checking every infringement; it could make sense to break after a single infringement (my <code>metricp.ado<\/code> command offers options for reporting the infringements).<\/p>\n<p>Testing with a 1151 times 1151 matrix shows its about 20 times as fast (when the matrix is metric):<\/p>\n<pre class=\"example\">&gt; system.time(tri.ineq(jm3))\r\n   user  system elapsed \r\n545.120   0.000 545.182 \r\n&gt; source(\"\/tmp\/tri.ineq2.R\")\r\n&gt; system.time(tri.ineq2(jm3))\r\n   user  system elapsed \r\n 24.248   0.012  24.264\r\n<\/pre>\n<p>I left a note on <a href=\"https:\/\/matthewvavrek.com\/programs-and-code\/fossil\/\">Matthew&#8217;s blog<\/a> but it didn&#8217;t format properly, hence this short note.<\/p>\n<p><code>metricp.ado<\/code> is part of <i>SADI<\/i>. Do <code>ssc install sadi<\/code> in Stata to install it.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I have a Stata program, metricp, which takes a distance matrix (square, symmetric, zero diagonal) and tests for the triangle inequality (that for all i, j, there is no k such that d[i,j] &gt; d[i,k] + d[j,k]). I needed something similar today in R, and found Matthew Vavrek&#8217;s fossil package, which includes a tri.ineq() function. &hellip; <a href=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=456\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">The triangle inequality<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/456"}],"collection":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=456"}],"version-history":[{"count":14,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/456\/revisions"}],"predecessor-version":[{"id":479,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/456\/revisions\/479"}],"wp:attachment":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=456"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=456"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=456"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}