{"id":669,"date":"2020-05-27T12:06:29","date_gmt":"2020-05-27T12:06:29","guid":{"rendered":"http:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=669"},"modified":"2020-05-27T12:06:29","modified_gmt":"2020-05-27T12:06:29","slug":"correlations-smoothed-time-series-and-sewage-sludge","status":"publish","type":"post","link":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=669","title":{"rendered":"Correlations, smoothed time-series and sewage sludge"},"content":{"rendered":"<p>\nA very nice idea: search for evidence of SARS-CoV-2 RNA in municipal wastewater, as a cheap and fast form of public health surveillance. A <a href=\"https:\/\/doi.org\/10.1101\/2020.05.19.20105999\">pre-print<\/a> shows that this works well, in a trial in Connecticut. I think the evidence is in their favour, but they commit two cardinal errors: first, they report a correlation (well, a squared correlation) between time-series, and second, they do it on smoothed data. Autocorrelation means time-series may have vastly inflated and\/or spurious correlations, and smoothing strips the noise out of the variables and hence out of the comparison, making the relationship seem, well, much less noisy than it is.\n<\/p>\n<p>\nThis is one of their key results: the smoothed RNA curve looks just like the smoothed hospital admissions curve, with a lead of about 3 days:\n<\/p>\n<div class=\"figure\">\n<p><img decoding=\"async\" src=\"http:\/\/teaching.sociology.ul.ie\/bhalpin\/sewage.png\" alt=\"sewage.png\" \/>\n<\/p>\n<\/div>\n<p>\nThey report an R<sup>2<\/sup> of 0.99 for this relationship.\n<\/p>\n<p>\nHowever, they also show the data. 
Given there are 2 series for 44 days, we can pick the unsmoothed values off the graph without too much effort:\n<\/p>\n<div class=\"figure\">\n<p><img decoding=\"async\" src=\"http:\/\/teaching.sociology.ul.ie\/bhalpin\/sewagecheck.png\" alt=\"sewagecheck.png\" \/>\n<\/p>\n<\/div>\n<p>\n(This is prompted by @lycraolaoghaire&#8217;s tweets: <a href=\"https:\/\/twitter.com\/lycraolaoghaire\/status\/1265251252239286272?s=20\">https:\/\/twitter.com\/lycraolaoghaire\/status\/1265251252239286272?s=20<\/a>).\n<\/p>\n<p>\nIt turns out that the correlation between the unsmoothed RNA measurement and hospital admissions is 0.357 (R<sup>2<\/sup> = 0.13). If we lag the RNA series by one day, the R<sup>2<\/sup> rises to a very respectable 0.45, but declines again if we lag by 2 (0.22) or 3 (0.22) days. In other words, there is a real signal here, but it is vastly overstated by R<sup>2<\/sup> = 0.99, and the lead it gives is not as big as claimed.\n<\/p>\n<p>\nPredicting hospital admissions using lagged RNA values, with lags of 1 to 5, and then all five lags together (green line) looks like this:\n<\/p>\n<div class=\"figure\">\n<p><img decoding=\"async\" src=\"http:\/\/teaching.sociology.ul.ie\/bhalpin\/lagpred.png\" alt=\"lagpred.png\" \/>\n<\/p>\n<\/div>\n<p>\nThis is a much less impressive graph than the original, but it is picking up something. Most of the work is done by the one-day lag, which has a clear effect, and the combined 5-lag model isn&#8217;t better (by LR-test) than the one-day-lag (L1) model alone. 
However, used widely as a form of passive surveillance, this technique is going to pick up unexpected large shifts in disease RNA, which is much more important than being able to predict moderate changes in hospitalisation from moderate changes in RNA presence in sewage sludge.\n<\/p>\n<p>\nScreen-picked data available <a href=\"http:\/\/teaching.sociology.ul.ie\/bhalpin\/sewage.csv\">here<\/a>, no warranties.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A very nice idea: search for evidence of SARS-CoV-2 RNA in municipal wastewater, as a cheap and fast form of public health surveillance. A pre-print shows that this works well, in a trial in Connecticut. I think the evidence is in their favour, but they commit two cardinal errors: first, they report a correlation (well, &hellip; <a href=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=669\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Correlations, smoothed time-series and sewage sludge<\/span> <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/669"}],"collection":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=669"}],"version-history":[{"count":4,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/669\/revisions"}],"predecessor-version":[{"id":673,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/669\/revisions\/673"}],"wp:attachment":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=669"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=669"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=669"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}