Models for ordered categories (ii)

True ordinality simply means a ranking of categories. The ranking need not correspond to locations on a continuous dimension, as it does in the linear-by-linear, log-multiplicative and other models treated above.

We can fit a number of models which use logits to take account of such ordinality:

the adjacent categories model
the continuation ratio model
the proportional odds model

All these models can be considered as similar to the baseline-category logit model: they fit multiple simultaneous binary logits to restructured versions of the original table.

One restriction is that the ordinal variable needs to be considered as the dependent variable (this is not the case with the previously discussed models).

**Figure:** Dataset rearrangements for fitting logit models to multicategory data: (a) baseline category, (b) adjacent category, (c) continuation ratio, (d) proportional odds.

Figure 2 relates these models visually: the top row represents a table containing a variable with five categories (in the table only one row is shown: imagine there are many). Below the table are shown the arrangements of subtables implicit in the various models:

The baseline-category logit compares each of categories 2 to 5 with category 1, in four simultaneous logit models.
The adjacent-category logit fits logit models to 4 adjacent pairs.
The continuation-ratio model fits 4 logits, one at each successive step: the 'zero' category is the accumulation of all earlier categories, and the 'one' category is the next category.
The proportional odds approach fits 4 models on the entire data, simply changing the cut-point.
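
The four arrangements can be made concrete by listing, for each binary sub-model, which original categories form the 'zero' group and which form the 'one' group. The following is a minimal Python sketch of that bookkeeping only; the function names are illustrative and do not come from any package.

```python
def baseline(k):
    """Each of categories 2..k against category 1."""
    return [({1}, {j}) for j in range(2, k + 1)]


def adjacent(k):
    """Each category against the next one up."""
    return [({j}, {j + 1}) for j in range(1, k)]


def continuation(k):
    """All categories so far (cumulated) against the next category."""
    return [(set(range(1, j + 1)), {j + 1}) for j in range(1, k)]


def proportional_odds(k):
    """Every category, split at each of the k-1 cut-points."""
    return [(set(range(1, j + 1)), set(range(j + 1, k + 1)))
            for j in range(1, k)]


# Print the 'zero' versus 'one' groups for a five-category variable.
for name, scheme in [("baseline", baseline), ("adjacent", adjacent),
                     ("continuation", continuation),
                     ("proportional odds", proportional_odds)]:
    print(name)
    for zero, one in scheme(5):
        print("  ", sorted(zero), "vs", sorted(one))
```

Only the proportional odds scheme uses every category in every sub-model; in the other three, each sub-model involves a different subset of categories.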

The adjacent-category model and the continuation-ratio model can be implemented simply in SPSS, but the proportional odds model has its own module. This is because the four sub-models are independent in the former two (each one contains at least some new data) but not in the proportional odds model, where the same observations are simply re-partitioned.

To fit the adjacent-category and continuation-ratio models in SPSS we need to restructure the table.

For the adjacent category approach we need to create $n-1$ subtables, each consisting of an adjacent pair. Thus we end up with a table with one binary variable, and an extra dimension with $n-1$ categories. That is, an $n\times m$ table becomes a $2\times m \times (n-1)$ table.

   c1 c2 c3
r1 11 12 13
r2 21 22 23
r3 31 32 33
r4 41 42 43

becomes

      c1 c2 c3
l1 r1 11 12 13
   r2 21 22 23
l2 r2 21 22 23
   r3 31 32 33
l3 r3 31 32 33
   r4 41 42 43

For the mouse-foetus data the transformation is like this:

1 0    281     becomes    1 0 0    281
1 62.5 225                1 0 62.5 225
1 125  283                1 0 125  283
1 250  202                1 0 250  202
1 500  9                  1 0 500  9  
2 0    1                  1 1 0    1  
2 62.5 0                  1 1 62.5 0  
2 125  7                  1 1 125  7  
2 250  59                 1 1 250  59 
2 500  132                1 1 500  132
3 0    15                 2 0 0    1  
3 62.5 17                 2 0 62.5 0  
3 125  22                 2 0 125  7  
3 250  38                 2 0 250  59 
3 500  144                2 0 500  132
                          2 1 0    15 
                          2 1 62.5 17 
                          2 1 125  22 
                          2 1 250  38 
                          2 1 500  144

treating outcome (normal, abnormal, dead) as ordinal. The first column is the new variable with $n-1$ categories, indexing the new pairs. This can usually be done 'by hand' in a text editor or a spreadsheet, since tables are usually small enough to be convenient.
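
For larger tables the same rearrangement can be scripted. The following Python sketch rebuilds the right-hand columns of the mouse-foetus table; the names `DOSES`, `COUNTS` and `adjacent_pairs` are illustrative, and the data layout (a dictionary of counts per outcome) is an assumption made for the example.

```python
# Mouse-foetus counts from the table above: outcome -> count at each dose.
DOSES = [0, 62.5, 125, 250, 500]
COUNTS = {
    1: [281, 225, 283, 202, 9],
    2: [1, 0, 7, 59, 132],
    3: [15, 17, 22, 38, 144],
}


def adjacent_pairs(counts, doses):
    """Return rows (subtab, binary outcome, dose, count), one block per adjacent pair."""
    levels = sorted(counts)
    rows = []
    # Pair each outcome level with the next one up: (1, 2), (2, 3), ...
    for subtab, (lower, upper) in enumerate(zip(levels, levels[1:]), start=1):
        for binary, level in ((0, lower), (1, upper)):
            for dose, count in zip(doses, counts[level]):
                rows.append((subtab, binary, dose, count))
    return rows


rows = adjacent_pairs(COUNTS, DOSES)
for row in rows:
    print(*row)
```

Each middle category appears twice in the output, once as the upper member of one pair and once as the lower member of the next.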

The model can then be fitted by

GENLOG
  outcome  BY  subtab with dose2
  /MODEL=MULTINOMIAL
  /PRINT estim   /PLOT none
  /DESIGN outcome outcome*dose2 
          outcome*subtab.

where subtab is the index of the subtable.

This is analogous to the baseline-category logit, but the 'real' variable (e.g., vote) is replaced by the indicator of the subtable. The interpretation of a parameter is the effect of its independent variable on the log-odds of being in the higher rather than the lower outcome of any adjacent pair. If we feel that this effect differs across adjacent pairs, a design such as:

/DESIGN = outcome outcome*dose2 
          outcome*subtab
          outcome*dose2*subtab.

will allow separate effects.

The continuation ratio data manipulation is very similar, except we cumulatively collapse categories.

    DOSE  OUTCOME    COUNT   SUBTAB

    0        1      281        1   * row 1
   62.5      1      225        1   
  125        1      283        1   
  250        1      202        1   
  500        1        9        1   
    0        2        1        1   * row 2
   62.5      2        0        1   
  125        2        7        1   
  250        2       59        1   
  500        2      132        1   
    0        1      282        2   * row 1 + row 2
   62.5      1      225        2   
  125        1      290        2   
  250        1      261        2   
  500        1      141        2   
    0        2       15        2   * row 3
   62.5      2       17        2   
  125        2       22        2   
  250        2       38        2   
  500        2      144        2

The same model statement as for the adjacent-category logit applies here. The interpretation is a little different: each estimate relates to the effect on the log odds of being in category $i$ versus categories 1 to $i-1$ combined. The file mice-auto.sps gives an example of doing this automatically within SPSS.
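
The cumulative collapsing can also be sketched outside SPSS. The Python below reproduces the DOSE, OUTCOME, COUNT, SUBTAB rows of the table above. It is an illustration only and is not the method used in mice-auto.sps; the names `DOSES`, `COUNTS` and `continuation_ratio` are hypothetical.

```python
# Mouse-foetus counts: outcome -> count at each dose.
DOSES = [0, 62.5, 125, 250, 500]
COUNTS = {
    1: [281, 225, 283, 202, 9],
    2: [1, 0, 7, 59, 132],
    3: [15, 17, 22, 38, 144],
}


def continuation_ratio(counts, doses):
    """Return rows (dose, outcome, count, subtab) with earlier categories cumulated."""
    levels = sorted(counts)
    rows = []
    cumulated = list(counts[levels[0]])  # starts as the first category alone
    for subtab, upper in enumerate(levels[1:], start=1):
        # Outcome 1: every category below `upper`, collapsed together.
        for dose, count in zip(doses, cumulated):
            rows.append((dose, 1, count, subtab))
        # Outcome 2: the next category on its own.
        for dose, count in zip(doses, counts[upper]):
            rows.append((dose, 2, count, subtab))
        # Fold the current category into the running total for the next step.
        cumulated = [a + b for a, b in zip(cumulated, counts[upper])]
    return rows


rows = continuation_ratio(COUNTS, DOSES)
for row in rows:
    print(*row)
```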

The proportional odds model cannot be fitted directly by this means, as the submodels implied are not independent. Various means are available for fitting it iteratively (using GLIM macros, etc.), but it can be fitted as an individual-level ordered logit model in SPSS, Stata and SAS: the "Ordinal Regression" option in SPSS (PLUM), the ologit command in Stata, or proc logistic in SAS (see Agresti, p276). The interpretation of the parameters generated is the effect on the log-odds of being in the higher band versus the lower band.
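
To see why the sub-models are not independent, it helps to write out the re-partitioned data. The Python sketch below splits the mouse-foetus counts at each cut-point and prints the sample log-odds of the higher band versus the lower band. It does not fit the proportional odds model; it only shows the data the model works from, and the names used are illustrative.

```python
import math

# Mouse-foetus counts: outcome -> count at each dose.
DOSES = [0, 62.5, 125, 250, 500]
COUNTS = {
    1: [281, 225, 283, 202, 9],
    2: [1, 0, 7, 59, 132],
    3: [15, 17, 22, 38, 144],
}


def cumulative_splits(counts, doses):
    """Return rows (cut, dose, lower band count, higher band count)."""
    levels = sorted(counts)
    rows = []
    for cut in range(1, len(levels)):
        for i, dose in enumerate(doses):
            lower = sum(counts[level][i] for level in levels[:cut])
            higher = sum(counts[level][i] for level in levels[cut:])
            rows.append((cut, dose, lower, higher))
    return rows


rows = cumulative_splits(COUNTS, DOSES)
for cut, dose, lower, higher in rows:
    # Sample log-odds of the higher band versus the lower band.
    logit = math.log(higher / lower)
    print(cut, dose, lower, higher, round(logit, 3))
```

Every split uses all the observations at a given dose, so the row totals are identical across cut-points; this is the re-partitioning of the same data that rules out fitting the sub-models as if they were separate tables.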

Loglinear Analysis Unit 8