{"id":72,"date":"2011-03-20T15:03:15","date_gmt":"2011-03-20T15:03:15","guid":{"rendered":"http:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=72"},"modified":"2014-04-20T17:05:31","modified_gmt":"2014-04-20T17:05:31","slug":"relative-rates-and-odds-ratios","status":"publish","type":"post","link":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=72","title":{"rendered":"Relative rates and odds ratios"},"content":{"rendered":"<p>A frequent theme in the medical statistics and epidemiological literature is that odds ratios (ORs) as effect measures for binary outcomes are counter-intuitive and an impediment to understanding. <a href=\"#barros03:_alter_logis_regres_cross_studies\">Barros and Hirakata (2003)<\/a>, for instance, refer to the relative rate as the &#8220;measure of choice&#8221; and complain that the OR will &#8220;overestimate&#8221; the RR as the baseline probability rises. Clearly, ORs are less intuitive than relative rates (RRs), but in this note I take issue with the conclusion sometimes drawn, that models with relative-rate interpretations should be used instead of logistic regression and other OR models. This is because RRs are not measures of the size of the statistical association between a variable and an outcome (since they also vary inversely with the baseline probability), and because, under certain assumptions, ORs and related measures are. That is, RRs may feel more real but they are likely to be misleading.<\/p>\n<p><!--more-->While the argument is often cast in terms of rejecting logistic regression in favour of log-binomial regression and other alternatives, let&#8217;s look at some 2*2 tables and hand-calculated ORs and RRs. In the following two tabulations the OR is held at approximately 2.5, but the baseline probability (in class==0) is respectively 2% and 73%.<a href=\"#foot53\" name=\"tex2html2\"><sup><span class=\"arabic\">1<\/span><\/sup><\/a><\/p>\n<table width=\"100%\">\n<tbody>\n<tr>\n<td>\n<pre>. 
tab class outcome, matcell(n)\r\n\r\n           |        outcome\r\n     class |        No        Yes |     Total\r\n-----------+----------------------+----------\r\n  Class 0  |       980         20 |     1,000\r\n  Class 1  |       951         49 |     1,000\r\n-----------+----------------------+----------\r\n     Total |     1,931         69 |     2,000 \r\n\r\n. scalar RR = (n[2,2]\/(n[2,1]+n[2,2]))\/(n[1,2]\/(n[1,1]+n[1,2]))\r\n. scalar OR = (n[2,2]\/n[2,1])\/(n[1,2]\/n[1,1])\r\n. scalar D  = n[2,2] - n[1,2]\r\n. di  \"RR \" %6.3f RR \"; OR \" %6.3f OR \"; N extra outcomes\" %5.0f D\r\nRR  2.450; OR  2.525; N extra outcomes   29<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Here the RR is 2.45, the OR 2.53 and the number of extra outcomes in class 1 is 29, or 2.9%.<\/p>\n<table width=\"100%\">\n<tbody>\n<tr>\n<td>\n<pre>           |        outcome\r\n     class |        No        Yes |     Total\r\n-----------+----------------------+----------\r\n  Class 0  |       270        730 |     1,000\r\n  Class 1  |       129        871 |     1,000\r\n-----------+----------------------+----------\r\n     Total |       399      1,601 |     2,000\r\n...\r\nRR  1.193; OR  2.497; N extra outcomes  141<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>But when the baseline probability is high, the RR plummets (suggesting a 19% increase instead of 145%), despite the approximate substantive measure giving 141 or 14.1% extra cases. In this simple case, it seems RRs track substantive significance rather worse than ORs do.<\/p>\n<p>But how do RRs and ORs compare in terms of estimating the size of the underlying statistical or causal association? There are many underlying causal structures possible, but let&#8217;s use Stata to simulate a simple one.<a href=\"#foot54\" name=\"tex2html4\"><sup><span class=\"arabic\">2<\/span><\/sup><\/a> Let the outcome of interest depend on an unobserved (and perhaps unobservable) interval variable. 
If this propensity is above a certain threshold, the outcome occurs, but let the threshold (and thus the proportion having the outcome) differ from time to time. Let the difference between the two groups be that they have different distributions of the underlying propensity &#8211; normal, with the same variance but different means.<a href=\"#foot21\" name=\"tex2html6\"><sup><span class=\"arabic\">3<\/span><\/sup><\/a> Conceptually, this inter-group difference is the source of the effect we are trying to measure, while variation in the threshold is not related to the causal effect. We run the simulation with a sample size of 10,000 and an inter-group difference of 0.2 standard deviations, and create 2*2 tables for a range of outcome probabilities. Here, for example, is the code for probabilities of 20% and 60%:<\/p>\n<table width=\"100%\">\n<tbody>\n<tr>\n<td>\n<pre>* 10,000 cases, half in each class\r\nset obs 10000\r\ngen class = _n &lt;= 5000\r\n* Latent propensity: standard normal, shifted up 0.2 SD for class 1\r\ngen propensity = invnorm(uniform()) + (class==1)*0.2\r\n* Outcome occurs for the top 20% (or 60%) of the propensity ranking\r\nsort propensity\r\ngen outcome20 = _n &gt; (1 - 0.2)*_N\r\ngen outcome60 = _n &gt; (1 - 0.6)*_N<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This yields the following:<\/p>\n<table width=\"100%\">\n<tbody>\n<tr>\n<td>\n<pre>           |       outcome20                                  |       outcome60\r\n     class |         0          1 |     Total           class |         0          1 |     Total\r\n-----------+----------------------+----------      -----------+----------------------+----------\r\n         0 |     4,144        856 |     5,000               0 |     2,199      2,801 |     5,000\r\n           |     82.88      17.12 |    100.00                 |     43.98      56.02 |    100.00\r\n-----------+----------------------+----------      -----------+----------------------+----------\r\n         1 |     3,856      1,144 |     5,000               1 |     1,801      3,199 |     5,000\r\n           |     77.12      22.88 |    100.00                 |     36.02      63.98 |    
100.00\r\n-----------+----------------------+----------      -----------+----------------------+----------\r\n     Total |     8,000      2,000 |    10,000           Total |     4,000      6,000 |    10,000\r\n           |     80.00      20.00 |    100.00                 |     40.00      60.00 |    100.00 \r\n\r\n20% probability: RR:  1.34; OR:  1.44              60% probability: RR:  1.14; OR:  1.39<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div>\n<p><a name=\"fig:orrr\"><\/a><a name=\"34\"><\/a><\/p>\n<table>\n<caption><strong><a href=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/wp-content\/uploads\/2011\/03\/orrr-norm.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-90\" title=\"orrr-norm\" alt=\"\" src=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/wp-content\/uploads\/2011\/03\/orrr-norm.png\" width=\"550\" height=\"400\" \/><\/a><\/strong><strong>Figure 1:<\/strong><br \/>\n<small class=\"FOOTNOTESIZE\">RRs and ORs: the points are the individual<br \/>\nestimates and the lines the average across 30 replications, with a<br \/>\nnormal propensity distribution. The mean RR starts very close to the<br \/>\nmean OR but drops to no effect (RR=1) in an almost linear fashion.<\/small>\n<\/caption>\n<tbody>\n<tr>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Between 20% and 60% outcome probability, the OR drops slightly (from 1.44 to 1.39) but the RR drops rather more (from 1.34 to 1.14). Figure <a href=\"#fig:orrr\">1<\/a> shows results for probabilities between 1% and 99%, replicated thirty times (lines for the average values, dots for the individual estimates). As can be seen, the ORs vary in a shallow U, but the RRs drop precipitously towards no effect (RR=1) for high baseline probabilities.<\/p>\n<p>Figure <a href=\"#fig:orrrl-logit\">2<\/a> repeats this exercise with logistic rather than normal propensity distributions, with the same variance (logistic distributions resemble normal but have higher kurtosis). Here the average OR is rather more stable. 
In fact, it can be shown mathematically that with logistic propensities the OR depends only on the difference in means, and is completely independent of the threshold.<\/p>\n<div>\n<p><a name=\"fig:orrrl-logit\"><\/a><a name=\"41\"><\/a><\/p>\n<table>\n<caption><a href=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/wp-content\/uploads\/2011\/03\/orrr-logit.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-93\" title=\"orrr-logit\" alt=\"\" src=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/wp-content\/uploads\/2011\/03\/orrr-logit.png\" width=\"550\" height=\"400\" \/><\/a><strong>Figure 2:<\/strong><br \/>\n<small class=\"FOOTNOTESIZE\">RRs and ORs with a logistic propensity<br \/>\ndistribution: the points are the individual estimates and the lines<br \/>\nthe average across 30 replications. Here the OR is much more<br \/>\nstable.<\/small><\/caption>\n<tbody>\n<tr>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Clearly, this causal backstory is simplistic.<a href=\"#foot44\" name=\"tex2html9\"><sup><span class=\"arabic\">4<\/span><\/sup><\/a> The latent propensity may have other distributions, and the inter-class difference may be other than additive (though note that log-normal distributions with a multiplicative difference are equivalent in effect to normal distributions with an additive difference). If one class has greater variance, the causal effect will be non-linear (that class will be over-represented at both high and low propensity). However, insofar as it is approximately realistic, this story suggests that the odds ratio is a reasonably stable measure of an effect, while the relative rate is superficially intuitive but is not an effect measure.<\/p>\n<p>While the OR and RR can be calculated by hand, regression gives exactly the same values once the coefficients are exponentiated: logistic regression (<tt>logit outcome class<\/tt>) reproduces the OR, while Poisson (<tt>poisson outcome class<\/tt>) and log-binomial regression (<tt>glm outcome class, link(log) family(binomial)<\/tt>) reproduce the RR. 
The extension to probit regression is obvious. If the simulated distribution is normal, the mean probit estimate (not shown) is as flat as the OR is for the logistic distribution.<a href=\"#foot57\" name=\"tex2html10\"><sup><span class=\"arabic\">5<\/span><\/sup><\/a><\/p>\n<p>When we are thinking in terms of models, rather than hand-calculated statistics, we can view the propensity distributions as conditional on the other variables. The effect of these variables is analogous to shifting the threshold. In this case, if other variables have large effects on the outcome, the RR will be unreliable even when the average level of the outcome is stable. Thus, unless predicted probabilities in the data are all very low (say, under 10%) it seems unwise to base interpretations on RR models. If it is important for the audience to see effects on probabilities, use <tt>-margins-<\/tt> to report marginal effects for different configurations of covariates. The fact that marginal effects vary with the values of the covariates is a feature, not a bug, reflecting the complexity of reality rather than being a wrong-headed consequence of an awkward model.<\/p>\n<h2><a name=\"SECTIONREF\"><\/a>Bibliography<\/h2>\n<dl>\n<dt><a name=\"barros03:_alter_logis_regres_cross_studies\"><\/a><br \/>\nBarros, A. J. and Hirakata, V. N. 
(2003).<\/dt>\n<dd>Alternatives for logistic regression in cross-sectional studies: An empirical comparison of models that directly estimate the prevalence ratio. <em>BMC Medical Research Methodology<\/em>, 3(21).\n<\/dd>\n<\/dl>\n<hr \/>\n<h4>Footnotes<\/h4>\n<dl>\n<dt><a name=\"foot53\"><\/a>&#8230; 73%.<a href=\"#tex2html2\"><sup><span class=\"arabic\">1<\/span><\/sup><\/a><\/dt>\n<dd>Stata code at <tt><a href=\"http:\/\/teaching.sociology.ul.ie\/catdat\/ortab.do\" name=\"tex2html3\">http:\/\/teaching.sociology.ul.ie\/catdat\/ortab.do<\/a><\/tt>.<\/dd>\n<dt><a name=\"foot54\"><\/a>&#8230; one.<a href=\"#tex2html4\"><sup><span class=\"arabic\">2<\/span><\/sup><\/a><\/dt>\n<dd>Stata code at <tt><a href=\"http:\/\/teaching.sociology.ul.ie\/catdat\/orsim.do\" name=\"tex2html5\">http:\/\/teaching.sociology.ul.ie\/catdat\/orsim.do<\/a><\/tt>.<\/dd>\n<dt><a name=\"foot21\"><\/a>&#8230; means.<a href=\"#tex2html6\"><sup><span class=\"arabic\">3<\/span><\/sup><\/a><\/dt>\n<dd>The attentive reader may recognise this as related to the latent variable justification of the logistic regression model, but for the moment please consider its plausibility as a simple causal model.<\/dd>\n<dt><a name=\"foot44\"><\/a>&#8230; simplistic.<a href=\"#tex2html9\"><sup><span class=\"arabic\">4<\/span><\/sup><\/a><\/dt>\n<dd>It also suits only one-off outcomes &#8211; if the outcome is a result of exposure over time, the OR is as misleading as the RR, and an estimate of the hazard-rate ratio is needed.<\/dd>\n<dt><a name=\"foot57\"><\/a>&#8230; distribution.<a href=\"#tex2html10\"><sup><span class=\"arabic\">5<\/span><\/sup><\/a><\/dt>\n<dd>If you multiply the probit estimate by <span class=\"MATH\">&pi;\/&radic;3<\/span>, it approximates the log of the odds ratio quite closely.<\/dd>\n<\/dl>\n","protected":false},"excerpt":{"rendered":"<p>A frequent theme in the medical statistics and 
epidemiological literature is that odds ratios (ORs) as effect measures for binary outcomes are counter-intuitive and an impediment to understanding. Barros and Hirakata (2003), for instance, refer to the relative rate as the &#8220;measure of choice&#8221; and complain that the OR will &#8220;overestimate&#8221; the RR as &hellip; <a href=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=72\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Relative rates and odds ratios<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/72"}],"collection":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=72"}],"version-history":[{"count":32,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/72\/revisions"}],"predecessor-version":[{"id":304,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/72\/revisions\/304"}],"wp:attachment":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=72"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%
2Fwp%2Fv2%2Fcategories&post=72"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=72"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}