{"id":309,"date":"2014-09-24T08:31:35","date_gmt":"2014-09-24T08:31:35","guid":{"rendered":"http:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=309"},"modified":"2014-09-24T08:31:35","modified_gmt":"2014-09-24T08:31:35","slug":"substitution-costs-from-transition-rates","status":"publish","type":"post","link":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=309","title":{"rendered":"Substitution costs from transition rates"},"content":{"rendered":"<p>Given that determining substitution costs in sequence analysis is such a bone of contention, many researchers look for a way for the data to generate the costs. The typical way to do this is by pooling transition rates and defining the substitution cost to be:<\/p>\n<p><em>2 &#8211; p(ij) &#8211; p(ji)<\/em><\/p>\n<p>where <em>p(ij)<\/em> is the transition rate from state <em>i<\/em> to state <em>j<\/em>. Intuitively, states that are closer to each other will have higher transition rates between them, and hence lower substitution costs, and vice versa.<!--more--><\/p>\n<p>I don&#8217;t recommend this approach in general, for reasons which I will not go into here, but I do have a utility in my Stata package for sequence analysis, SADI, which calculates these quantities, <code>trans2subs<\/code>.<\/p>\n<p>This requires the data in long format, so we reshape first (by default the sequences are in wide format, as variables <code>state1<\/code> to <code>state<\/code><em>N<\/em>).<\/p>\n<p><code>reshape long state, i(id) j(t)<br \/>\ntrans2subs state, id(id) distmat(trpr1)<br \/>\ntrans2subs state, id(id) distmat(trpr2) diagincl<br \/>\nreshape wide<\/code><\/p>\n<p>The transition rates are calculated by default without the diagonal (i.e., ignoring cases where the sequence remains in the same state from <em>t<\/em> to <em>t+1<\/em>), but this can be overridden with the <code>diagincl<\/code> option.<\/p>\n<p>The command works by cross-tabulating <em>state<\/em> with its lag, putting the results in a matrix, and letting Mata do some simple calculations on the result. 
However, the <code>trans2subs<\/code> command as distributed is fragile, and can break down in certain circumstances, for instance where a row or column has values only on the diagonal (i.e., a state that is only exited or is never exited, such as never-married or retired). Thanks to Anna Manzoni for alerting me to this problem.<\/p>\n<p>As a short-term solution, I present an alternative command here, <code>t2s<\/code>, which is more robust. I will replace <code>trans2subs<\/code> with this code when I next update the SADI package, but for now you can access it from <a title=\"t2s.ado\" href=\"http:\/\/teaching.sociology.ul.ie\/seqanal\/t2s.ado\">this link<\/a>, or by cutting and pasting from here:<\/p>\n<p><code><br \/>\nmata:<br \/>\nvoid transition_driven_subsmat2(string matrix tabmat, scalar diagincl) {<br \/>\n\/\/ Read stata matrix into mata<br \/>\nG=st_matrix(tabmat)<\/code><\/p>\n<p>if (rows(G)!=cols(G)) {<br \/>\n_error(&quot;Table isn't square&quot;)<br \/>\n}<\/p>\n<p>if (diagincl==0) {<br \/>\nG = G - diag(G)<br \/>\n}<\/p>\n<p>Gr=G:\/rowsum(G)<br \/>\nsubsmat= trunc(0.5:+(J(rows(G),rows(G),2) - Gr - Gr'):*1000000):\/1000000<br \/>\nsubsmat = subsmat - diag(subsmat)<br \/>\nst_matrix(tabmat,subsmat)<br \/>\n}<br \/>\nend<\/p>\n<p>capture program drop t2s<br \/>\nprogram define t2s<br \/>\nsyntax varlist(min=1 max=1) [if] [in], IDvar(varname) SUBSmat(string) [DIAgincl]<\/p>\n<p>if (&quot;`diagincl'&quot;==&quot;&quot;) {<br \/>\nlocal diagincl 0<br \/>\n}<br \/>\nelse {<br \/>\nlocal diagincl 1<br \/>\n}<\/p>\n<p>marksample touse<\/p>\n<p>local colvar `varlist'<br \/>\ntempvar rowvar<\/p>\n<p>by `idvar': gen `rowvar'=`colvar'[_n-1] if _n&gt;1<\/p>\n<p>di &quot;Generating transition-driven substitution matrix&quot;<\/p>\n<p>qui tab `rowvar' `colvar' if `touse', matcell(`subsmat')<\/p>\n<p>mata: transition_driven_subsmat2(&quot;`subsmat'&quot;,`diagincl')<br 
\/>\nend<br \/>\n<code><\/code><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Given that determining substitution costs in sequence analysis is such a bone of contention, many researchers look for a way for the data to generate the costs. The typical way to do this is by pooling transition rates and defining the substitution cost to be: 2 &#8211; p(ij) &#8211; p(ji) where p(ij) is the &hellip; <a href=\"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/?p=309\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Substitution costs from transition rates<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/309"}],"collection":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=309"}],"version-history":[{"count":5,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/309\/revisions"}],"predecessor-version":[{"id":316,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/309\/revisions\/316"}],"wp:attachment":[{"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=309"}],"wp:term":[{"taxonomy":"category","embeddable":true,"
href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=309"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/teaching.sociology.ul.ie\/bhalpin\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=309"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}