As the number of dimensions increases, the number of possible
models increases rapidly: with k variables there are 2^k - 1
possible terms alone, and far more hierarchical models that
can be built from them. It is no longer practical to examine
them all, so we need a way of searching efficiently.
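To get a feel for how fast the space grows, here is a minimal
brute-force sketch in Python. It identifies each hierarchical model
with its generating class (a set of terms, none contained in another)
and simply enumerates them; the function names and the cut-off at four
variables are purely illustrative.

  from itertools import combinations

  def candidate_terms(k):
      """All non-empty subsets of k factors, i.e. all possible terms."""
      return [frozenset(c) for r in range(1, k + 1)
              for c in combinations(range(k), r)]

  def is_generating_class(family):
      """True if no term in the family is contained in another."""
      return all(not (a < b) for a in family for b in family)

  def count_hierarchical_models(k):
      """Brute-force count of generating classes; feasible only for small k."""
      terms = candidate_terms(k)
      return sum(1 for r in range(len(terms) + 1)
                 for family in combinations(terms, r)
                 if is_generating_class(family))

  for k in range(2, 5):
      print(f"{k} variables: {2 ** k - 1:3d} possible terms, "
            f"{count_hierarchical_models(k):4d} hierarchical models")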
We could consider automatic stepwise selection, as SPSS will
do by default if you go Statistics → Loglinear → Model Search
(this invokes the HILOGLINEAR command). This is not a good
idea; to quote from the Stata mailing list (for fuller detail see
http://teaching.sociology.ul.ie/~brendan/CDA/stepwise.text):
It yields R-squared values that are badly biased high.
The F and chi-squared tests quoted next to each variable on
the printout do not have the claimed distribution.
The method yields confidence intervals for effects and
predicted values that are falsely narrow (See Altman and
Anderson Stat in Med).
It yields P-values that do not have the proper meaning and
the proper correction for them is a very difficult problem.
It gives biased regression coefficients that need shrinkage
(the coefficients for remaining variables are too large; see
Tibshirani, 1996).
It has severe problems in the presence of collinearity.
It is based on methods (e.g. F tests for nested models)
that were intended to be used to test pre-specified hypotheses.
Increasing the sample size doesn't help very much (see
Derksen and Keselman).
It allows us to not think about the problem.
It uses a lot of paper.
Note that "all possible subsets" regression does not solve any
of these problems.
On the other hand, non-automatic stepwise model building is
sometimes a good idea, e.g.:
Forward (a code sketch follows this list):
Begin with the first-order terms, adding them one at a
time, and include all that seem important.
Add two-way interactions, first one at a time and then
cumulatively, including all that significantly reduce the
deviance. Pay close attention to what seems clearly
significant and what is marginal or non-significant.
If you have enough dimensions, continue from your
favoured two-way model, adding whatever three-way terms
are possible (i.e., those whose lower-order terms are
already in the model), and so on.
Pay attention to the possibility that a higher-order
interaction may be significant even though its lower-order
interactions are not.
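Here is a minimal sketch of the forward strategy in Python, fitting
the loglinear models as Poisson GLMs with statsmodels (rather than via
SPSS). The three factors A, B, C and the cell counts are invented
purely for illustration.

  from itertools import product
  import pandas as pd
  import statsmodels.api as sm
  import statsmodels.formula.api as smf
  from scipy.stats import chi2

  # A made-up 2x2x2 frequency table, purely for illustration.
  tab = pd.DataFrame(list(product("ab", repeat=3)), columns=["A", "B", "C"])
  tab["count"] = [30, 10, 15, 25, 20, 40, 10, 35]

  def lr_test(smaller, larger, data):
      """Deviance (likelihood-ratio) comparison of two nested models."""
      m0 = smf.glm(smaller, data=data, family=sm.families.Poisson()).fit()
      m1 = smf.glm(larger, data=data, family=sm.families.Poisson()).fit()
      drop = m0.deviance - m1.deviance
      ddf = m0.df_resid - m1.df_resid
      return drop, ddf, chi2.sf(drop, ddf)

  # Start from the first-order (main-effects) model ...
  base = "count ~ A + B + C"
  # ... and try each two-way interaction, one at a time.
  for term in ["A:B", "A:C", "B:C"]:
      drop, ddf, p = lr_test(base, f"{base} + {term}", tab)
      print(f"+{term}: deviance drop {drop:.2f} on {ddf:.0f} df, p = {p:.3f}")
  # Keep the clearly useful terms, re-test the remainder cumulatively, and
  # only consider A:B:C once all of its two-way terms are in the model.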
Backward (a code sketch follows this list):
Pick a model with all, say, 3-way interactions.
See what terms you can eliminate without significantly
raising the deviance.
But always pay attention to what the addition or removal
suggests about the model.
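A matching sketch of the backward direction, continuing with the same
hypothetical table `tab` and imports as in the forward sketch above;
for brevity it starts from the model with all two-way interactions and
tries deleting each in turn.

  full = "count ~ A + B + C + A:B + A:C + B:C"
  m_full = smf.glm(full, data=tab, family=sm.families.Poisson()).fit()
  for term in ["A:B", "A:C", "B:C"]:
      reduced = full.replace(f" + {term}", "")
      m_red = smf.glm(reduced, data=tab, family=sm.families.Poisson()).fit()
      rise = m_red.deviance - m_full.deviance
      ddf = m_red.df_resid - m_full.df_resid
      print(f"-{term}: deviance rises {rise:.2f} on {ddf:.0f} df, "
            f"p = {chi2.sf(rise, ddf):.3f}")
  # Drop the term whose removal raises the deviance least (and
  # non-significantly), refit the smaller model, and repeat until every
  # remaining term matters.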
Another approach is to start with a `minimum-interest' model,
which contains terms to take account of everything you are
not interested in: for instance, if you really want to know
whether sentencing policy differs according to the
defendant's race (a code sketch follows these steps):
Fit the model drace + vrace + verdict + vrace*verdict +
vrace*drace.
Add the term of interest, drace*verdict, and test
for improvement.
Then refine the model, deleting unnecessary terms and
adding others that improve the fit.
If you arrive at a different model this way, retest your term
of interest.
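A sketch of this strategy in the same Poisson-GLM setting. The column
names follow the text (drace for defendant's race, vrace for victim's
race), but the table and its counts are invented purely to show the
mechanics, not real data.

  from itertools import product
  import pandas as pd
  import statsmodels.api as sm
  import statsmodels.formula.api as smf
  from scipy.stats import chi2

  # Invented 2x2x2 table: defendant's race, victim's race, verdict.
  tab = pd.DataFrame(list(product(["white", "black"],
                                  ["white", "black"],
                                  ["guilty", "not guilty"])),
                     columns=["drace", "vrace", "verdict"])
  tab["count"] = [30, 10, 15, 5, 20, 40, 10, 25]

  # The minimum-interest model: everything except the term of interest.
  base = "count ~ drace + vrace + verdict + vrace:verdict + vrace:drace"
  m0 = smf.glm(base, data=tab, family=sm.families.Poisson()).fit()

  # Add the term of interest and test the improvement in fit.
  m1 = smf.glm(base + " + drace:verdict", data=tab,
               family=sm.families.Poisson()).fit()
  drop = m0.deviance - m1.deviance
  ddf = m0.df_resid - m1.df_resid
  print(f"drace:verdict: deviance drop {drop:.2f} on {ddf:.0f} df, "
        f"p = {chi2.sf(drop, ddf):.3f}")
  # After refining the model further, repeat this comparison so the
  # conclusion about drace:verdict does not depend on the surrounding terms.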