As the number of dimensions increases, the number of possible
models increases rapidly: with k variables there are 2^k - 1
possible terms alone, and far more hierarchical models that
can be built from them. It is no longer practical to examine
them all, so we need a way of searching efficiently.
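To get a feel for how fast the space grows, here is a minimal
brute-force sketch in Python. It identifies each hierarchical model
with its generating class (a set of terms, none contained in another)
and simply enumerates them; the function names and the cut-off at four
variables are purely illustrative.

  from itertools import combinations

  def candidate_terms(k):
      """All non-empty subsets of k factors, i.e. all possible terms."""
      return [frozenset(c) for r in range(1, k + 1)
              for c in combinations(range(k), r)]

  def is_generating_class(family):
      """True if no term in the family is contained in another."""
      return all(not (a < b) for a in family for b in family)

  def count_hierarchical_models(k):
      """Brute-force count of generating classes; feasible only for small k."""
      terms = candidate_terms(k)
      return sum(1 for r in range(len(terms) + 1)
                 for family in combinations(terms, r)
                 if is_generating_class(family))

  for k in range(2, 5):
      print(f"{k} variables: {2 ** k - 1:3d} possible terms, "
            f"{count_hierarchical_models(k):4d} hierarchical models")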
We could consider automatic stepwise selection, as SPSS will
do by default if you go Statistics → Loglinear → Model Search
(this invokes the HILOGLINEAR command). This is not a good
idea; to quote from the Stata mailing list (for fuller detail see
http://teaching.sociology.ul.ie/~brendan/CDA/stepwise.text):
It yields R-squared values that are badly biased high.
The F and chi-squared tests quoted next to each variable on
the printout do not have the claimed distribution.
The method yields confidence intervals for effects and
predicted values that are falsely narrow (See Altman and
Anderson Stat in Med).
It yields P-values that do not have the proper meaning and
the proper correction for them is a very difficult problem.
It gives biased regression coefficients that need shrinkage
(the coefficients for remaining variables are too large; see
Tibshirani, 1996).
It has severe problems in the presence of collinearity.
It is based on methods (e.g. F tests for nested models)
that were intended to be used to test pre-specified hypotheses.
Increasing the sample size doesn't help very much (see
Derksen and Keselman).
It allows us to not think about the problem.
It uses a lot of paper.
Note that "all possible subsets" regression does not solve any
of these problems.
On the other hand, non-automatic stepwise model building is
sometimes a good idea, e.g.:
Forward (a code sketch follows this list):
Begin with the first-order terms, adding them one at a
time, and include all that seem important.
Add two-way interactions, first one at a time and then
cumulatively, including all that significantly reduce the
deviance. Pay close attention to what seems clearly
significant and what is marginal or non-significant.
If you have enough dimensions, continue from your
favoured two-way model, adding whatever three-way terms
are possible (i.e., those whose lower-order terms are
already in the model), and so on.
Pay attention to the possibility that a higher-order
interaction may be significant even though its lower-order
interactions are not.
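Here is a minimal sketch of the forward strategy in Python, fitting
the loglinear models as Poisson GLMs with statsmodels (rather than via
SPSS). The three factors A, B, C and the cell counts are invented
purely for illustration.

  from itertools import product
  import pandas as pd
  import statsmodels.api as sm
  import statsmodels.formula.api as smf
  from scipy.stats import chi2

  # A made-up 2x2x2 frequency table, purely for illustration.
  tab = pd.DataFrame(list(product("ab", repeat=3)), columns=["A", "B", "C"])
  tab["count"] = [30, 10, 15, 25, 20, 40, 10, 35]

  def lr_test(smaller, larger, data):
      """Deviance (likelihood-ratio) comparison of two nested models."""
      m0 = smf.glm(smaller, data=data, family=sm.families.Poisson()).fit()
      m1 = smf.glm(larger, data=data, family=sm.families.Poisson()).fit()
      drop = m0.deviance - m1.deviance
      ddf = m0.df_resid - m1.df_resid
      return drop, ddf, chi2.sf(drop, ddf)

  # Start from the first-order (main-effects) model ...
  base = "count ~ A + B + C"
  # ... and try each two-way interaction, one at a time.
  for term in ["A:B", "A:C", "B:C"]:
      drop, ddf, p = lr_test(base, f"{base} + {term}", tab)
      print(f"+{term}: deviance drop {drop:.2f} on {ddf:.0f} df, p = {p:.3f}")
  # Keep the clearly useful terms, re-test the remainder cumulatively, and
  # only consider A:B:C once all of its two-way terms are in the model.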
Backward (a code sketch follows this list):
Pick a model with all, say, 3-way interactions.
See what terms you can eliminate without significantly
raising the deviance.
But always pay attention to what the addition or removal
suggests about the model.
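A matching sketch of the backward direction, continuing with the same
hypothetical table `tab` and imports as in the forward sketch above;
for brevity it starts from the model with all two-way interactions and
tries deleting each in turn.

  full = "count ~ A + B + C + A:B + A:C + B:C"
  m_full = smf.glm(full, data=tab, family=sm.families.Poisson()).fit()
  for term in ["A:B", "A:C", "B:C"]:
      reduced = full.replace(f" + {term}", "")
      m_red = smf.glm(reduced, data=tab, family=sm.families.Poisson()).fit()
      rise = m_red.deviance - m_full.deviance
      ddf = m_red.df_resid - m_full.df_resid
      print(f"-{term}: deviance rises {rise:.2f} on {ddf:.0f} df, "
            f"p = {chi2.sf(rise, ddf):.3f}")
  # Drop the term whose removal raises the deviance least (and
  # non-significantly), refit the smaller model, and repeat until every
  # remaining term matters.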
Another approach is to start with a `minimum-interest' model,
which contains terms to take account of everything you are
not interested in: for instance, if you really want to know
whether sentencing policy differs according to the
defendant's race (a code sketch follows these steps):
Fit the model drace + vrace + verdict + vrace*verdict +
vrace*drace.
Add the term of interest, drace*verdict, and test
for improvement.
Then refine the model, deleting unnecessary terms and
adding others that improve the fit.
If you arrive at a different model this way, retest your term
of interest.
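A sketch of this strategy in the same Poisson-GLM setting. The column
names follow the text (drace for defendant's race, vrace for victim's
race), but the table and its counts are invented purely to show the
mechanics, not real data.

  from itertools import product
  import pandas as pd
  import statsmodels.api as sm
  import statsmodels.formula.api as smf
  from scipy.stats import chi2

  # Invented 2x2x2 table: defendant's race, victim's race, verdict.
  tab = pd.DataFrame(list(product(["white", "black"],
                                  ["white", "black"],
                                  ["guilty", "not guilty"])),
                     columns=["drace", "vrace", "verdict"])
  tab["count"] = [30, 10, 15, 5, 20, 40, 10, 25]

  # The minimum-interest model: everything except the term of interest.
  base = "count ~ drace + vrace + verdict + vrace:verdict + vrace:drace"
  m0 = smf.glm(base, data=tab, family=sm.families.Poisson()).fit()

  # Add the term of interest and test the improvement in fit.
  m1 = smf.glm(base + " + drace:verdict", data=tab,
               family=sm.families.Poisson()).fit()
  drop = m0.deviance - m1.deviance
  ddf = m0.df_resid - m1.df_resid
  print(f"drace:verdict: deviance drop {drop:.2f} on {ddf:.0f} df, "
        f"p = {chi2.sf(drop, ddf):.3f}")
  # After refining the model further, repeat this comparison so the
  # conclusion about drace:verdict does not depend on the surrounding terms.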