- True ordinality means simply a ranking of categories, with no
necessity that their ranking should be related to locations on a
continuous dimension as with linear-by-linear, log-multiplicative
or other models treated above.
- We can fit a number of models that use logits to take account of
such ordinality:
- the adjacent categories model
- the continuation ratio model
- the proportional odds model
- All these models can be considered as similar to the
baseline-category logit model: they fit multiple
simultaneous binary logits to restructured versions of the
original table.
- One restriction is that the ordinal variable needs to be
considered as the dependent variable (this is not the case with
the previously discussed models).
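In symbols, the three ordinal models can be sketched as follows for an outcome $Y$ with $k$ ordered categories and a single covariate $x$. The notation ($\pi_j$ for the probability of category $j$, $\alpha_j$ for the intercept of the $j$-th logit, $\beta$ for a slope common to all logits) is assumed here for illustration and is not taken from the notes themselves. The continuation-ratio form shown is the one used in these notes, where the next category is compared with all lower categories combined.

```latex
\begin{align*}
\text{adjacent categories:} \quad
  & \log\frac{\pi_{j+1}}{\pi_j} = \alpha_j + \beta x \\
\text{continuation ratio:} \quad
  & \log\frac{\pi_{j+1}}{\pi_1 + \dots + \pi_j} = \alpha_j + \beta x \\
\text{proportional odds:} \quad
  & \log\frac{P(Y > j)}{P(Y \le j)} = \alpha_j + \beta x
\end{align*}
% in each case j = 1, \dots, k-1
```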
Figure 2: Dataset re-arrangements to use logit models to model
multicategory data: (a) baseline category, (b) adjacent category,
(c) continuation ratio, (d) proportional odds.
- Figure 2 relates these models visually: the
top row represents a table containing a variable with five
categories (in the table only one row is shown: imagine there are
many). Below the table are shown the arrangements of subtables
implicit in the various models:
- The baseline-category logit compares each of categories 2
to 5 with category 1, in four simultaneous logit models.
- The adjacent-category logit fits logit models to the 4 adjacent
pairs.
- The continuation-ratio model fits 4 logits, one at each successive
step, with the 'zero' category accumulating all earlier categories
and the 'one' category being the next category.
- The proportional odds approach fits 4 models on the entire
data, simply changing the cut-point.
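The four arrangements can be written out as pairs of category sets, which may make the differences easier to see. This is only an illustrative sketch: the function name `restructure` and the dictionary keys are hypothetical, and each pair lists the categories treated as 'zero' and as 'one' in that binary logit.

```python
def restructure(k):
    """For each model, list the (zero_group, one_group) category sets
    of the k - 1 binary logits implied for k ordered categories."""
    cats = list(range(1, k + 1))
    return {
        # each of categories 2..k against category 1
        "baseline": [([1], [j]) for j in cats[1:]],
        # each category against the one immediately above it
        "adjacent": [([j], [j + 1]) for j in cats[:-1]],
        # all categories so far (cumulated) against the next category
        "continuation": [(cats[:j], [j + 1]) for j in range(1, k)],
        # the whole table, split at each cut-point in turn
        "proportional": [(cats[:j], cats[j:]) for j in range(1, k)],
    }


schemes = restructure(5)
for name, pairs in schemes.items():
    print(name)
    for zero, one in pairs:
        print("  ", zero, "vs", one)
```

Only the proportional odds arrangement uses every category in every logit, which is why its sub-models are not independent.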
- The adjacent-category model and the continuation-ratio model can
be implemented simply in SPSS, but the proportional odds model
has its own module. This is because the four sub-models are independent in
the former two (each one contains at least some new data) but not
in the proportional odds model, where the same observations are
simply re-partitioned.
- To fit the adjacent-category and continuation-ratio models in
SPSS we need to restructure the table.
- For the adjacent category approach we need to create
subtables, each consisting of an adjacent pair. Thus we end up
with a table with one binary variable, and an extra dimension
with $r-1$ categories. That is, an $r \times c$ table (with the
ordinal variable in the rows) becomes a $2 \times c \times (r-1)$
table.
        c1  c2  c3
r1      11  12  13
r2      21  22  23
r3      31  32  33
r4      41  42  43
becomes
        c1  c2  c3
l1  r1  11  12  13
    r2  21  22  23
l2  r2  21  22  23
    r3  31  32  33
l3  r3  31  32  33
    r4  41  42  43
- For the mouse-foetus data the transformation is like this:
outcome  dose  count             subtab  outcome  dose  count
1        0     281     becomes   1       0        0     281
1        62.5  225               1       0        62.5  225
1        125   283               1       0        125   283
1        250   202               1       0        250   202
1        500   9                 1       0        500   9
2        0     1                 1       1        0     1
2        62.5  0                 1       1        62.5  0
2        125   7                 1       1        125   7
2        250   59                1       1        250   59
2        500   132               1       1        500   132
3        0     15                2       0        0     1
3        62.5  17                2       0        62.5  0
3        125   22                2       0        125   7
3        250   38                2       0        250   59
3        500   144               2       0        500   132
                                 2       1        0     15
                                 2       1        62.5  17
                                 2       1        125   22
                                 2       1        250   38
                                 2       1        500   144
treating outcome (normal, abnormal, dead) as ordinal. The first
column of the restructured data is the new variable, with one
category fewer than the outcome (here 2), indexing the new pairs.
This can usually be done 'by hand' in a text editor or a
spreadsheet, since such tables are usually small enough for this
to be convenient.
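For larger tables the same restructuring can be scripted. The following is a minimal sketch in Python rather than SPSS, using the mouse-foetus counts shown above; the names `DOSES`, `COUNTS` and `adjacent_pairs` are hypothetical. Within each subtable the lower outcome of the pair is coded 0 and the higher is coded 1.

```python
DOSES = [0, 62.5, 125, 250, 500]
# Counts by outcome (1 = normal, 2 = abnormal, 3 = dead), one per dose
COUNTS = {
    1: [281, 225, 283, 202, 9],
    2: [1, 0, 7, 59, 132],
    3: [15, 17, 22, 38, 144],
}


def adjacent_pairs(counts, doses):
    """Return rows (subtab, binary outcome, dose, count), one subtable
    per adjacent pair of outcome levels."""
    rows = []
    levels = sorted(counts)
    pairs = zip(levels, levels[1:])
    for subtab, (lower, upper) in enumerate(pairs, start=1):
        for binary, level in ((0, lower), (1, upper)):
            for dose, n in zip(doses, counts[level]):
                rows.append((subtab, binary, dose, n))
    return rows


adjacent = adjacent_pairs(COUNTS, DOSES)
for row in adjacent:
    print(*row)
```

The middle outcome appears twice, once as the higher category of subtable 1 and once as the lower category of subtable 2.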
- The model can then be fitted by
GENLOG
outcome BY subtab with dose2
/MODEL=MULTINOMIAL
/PRINT estim /PLOT none
/DESIGN outcome outcome*dose2
outcome*subtab.
where subtab is the index of the subtable.
- This is analogous to the baseline-category logit, but the
`real' variable (e.g., vote) is replaced by the indicator of the
subtable. The interpretation of a parameter is the effect of its
independent variable on the log-odds of being in a higher versus
a lower outcome in any adjacent pair. If we feel that this effect
differs across adjacent pairs, a design term like:
/DESIGN = outcome outcome*dose2
outcome*subtab
outcome*dose2*subtab.
will allow separate effects.
- The continuation ratio data manipulation is very similar,
except we cumulatively collapse categories.
DOSE OUTCOME COUNT SUBTAB
0 1 281 1 * row 1
62.5 1 225 1
125 1 283 1
250 1 202 1
500 1 9 1
0 2 1 1 * row 2
62.5 2 0 1
125 2 7 1
250 2 59 1
500 2 132 1
0 1 282 2 * row 1 + row 2
62.5 1 225 2
125 1 290 2
250 1 261 2
500 1 141 2
0 2 15 2 * row 3
62.5 2 17 2
125 2 22 2
250 2 38 2
500 2 144 2
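The cumulative collapsing can be sketched in Python as well; this is an illustration only, and the function name `continuation_ratio` is hypothetical. It follows the coding of the table above: within each subtable, outcome 1 holds the cumulated lower categories and outcome 2 the next category.

```python
DOSES = [0, 62.5, 125, 250, 500]
# Counts by outcome (1 = normal, 2 = abnormal, 3 = dead), one per dose
COUNTS = {
    1: [281, 225, 283, 202, 9],
    2: [1, 0, 7, 59, 132],
    3: [15, 17, 22, 38, 144],
}


def continuation_ratio(counts, doses):
    """Return rows (dose, outcome, count, subtab) in which outcome 1 is
    the running total of all lower categories and outcome 2 is the
    next category."""
    rows = []
    levels = sorted(counts)
    cumulated = [0] * len(doses)
    pairs = zip(levels, levels[1:])
    for subtab, (lower, upper) in enumerate(pairs, start=1):
        # add the newly absorbed category to the running totals
        cumulated = [c + n for c, n in zip(cumulated, counts[lower])]
        for dose, n in zip(doses, cumulated):
            rows.append((dose, 1, n, subtab))
        for dose, n in zip(doses, counts[upper]):
            rows.append((dose, 2, n, subtab))
    return rows


cr_rows = continuation_ratio(COUNTS, DOSES)
for row in cr_rows:
    print(*row)
```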
- The same model statement as for the adjacent-category logit
applies here. The interpretation is a little different: each
estimate relates to the effect on the log odds of being in
category $j+1$ versus categories 1 to $j$ combined. The file
mice-auto.sps gives an example of doing this automatically
within SPSS.
- The proportional odds model cannot be fitted directly by this
means, as the submodels implied are not independent. Various
means are available for fitting it iteratively (using GLIM
macros, etc.) but it can be fitted as an individual-level ordered
logit model in SPSS, Stata and SAS. The "Ordinal Regression"
option in SPSS (PLUM) and the ologit command in Stata fit this
model, as does proc logistic in SAS (see Agresti, p. 276). The
interpretation of the parameters generated is the effect on the
log-odds of being in the higher band versus the lower band,
whichever cut-point separates the two bands.
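To see what 'higher band versus lower band' means at each cut-point, the raw sample log-odds can be computed directly from the mouse-foetus counts. This is a descriptive sketch only, not a fit of the proportional odds model (which needs PLUM, ologit or similar); the name `cumulative_logits` is hypothetical.

```python
import math

DOSES = [0, 62.5, 125, 250, 500]
# Counts by outcome (1 = normal, 2 = abnormal, 3 = dead), one per dose
COUNTS = {
    1: [281, 225, 283, 202, 9],
    2: [1, 0, 7, 59, 132],
    3: [15, 17, 22, 38, 144],
}


def cumulative_logits(counts, doses):
    """For each dose, return the sample log-odds of falling above
    versus at-or-below each cut-point between adjacent outcomes."""
    levels = sorted(counts)
    result = {}
    for i, dose in enumerate(doses):
        per_level = [counts[level][i] for level in levels]
        logits = []
        for cut in range(1, len(levels)):
            lower = sum(per_level[:cut])   # at or below the cut-point
            higher = sum(per_level[cut:])  # above the cut-point
            logits.append(math.log(higher / lower))
        result[dose] = logits
    return result


logits = cumulative_logits(COUNTS, DOSES)
for dose, values in logits.items():
    print(dose, [round(v, 2) for v in values])
```

Every observation at a given dose enters both logits, only on different sides of the cut. If the proportional odds assumption held, the change in log-odds between two doses would be roughly the same at either cut-point.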