When modelling a binary dependent variable we want to predict
a probability in the 0-1 range, with
a binomial distribution
Under the generalised linear model framework, a number of
transformations are possible with a binomially distributed error:
Logit: $\log\left(\frac{p}{1-p}\right)$
Probit: $\Phi^{-1}(p)$, the inverse of the standard normal CDF
Complementary log-log: $\log\left(-\log(1-p)\right)$
Log-log: $-\log\left(-\log(p)\right)$
[Figure: the logit and probit mappings]
The logit and probit mappings are symmetric and give similar
results
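As a rough illustration (not part of the original notes), the sketch below defines the four link functions in Python and checks that the logit is approximately a constant multiple (about 1.6 to 1.8) of the probit over most of the probability range, which is why the two links usually lead to very similar fitted models:

import numpy as np
from scipy.stats import norm

# The four link functions, applied to a probability p in (0, 1)
def logit(p):
    return np.log(p / (1 - p))

def probit(p):
    return norm.ppf(p)                      # inverse of the standard normal CDF

def cloglog(p):
    return np.log(-np.log(1 - p))           # complementary log-log

def loglog(p):
    return -np.log(-np.log(p))              # log-log

p = np.linspace(0.05, 0.95, 10)             # avoids p = 0.5, where logit and probit are both 0
print(np.round(logit(p) / probit(p), 2))    # ratios all fall roughly in the 1.6-1.8 range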
The logit transformation is used more often as it is
mathematically more tractable
Under GLMs, a transformation of the (expected value of the) dependent
variable is a linear function of the independent variables, with a
specified error structure (binomial in the case of binary dependent
variables):
$g\big(E(Y)\big) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$
which for the logit link is
$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$
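A minimal sketch of fitting such a model, using statsmodels on simulated data (the data and the coefficient values are illustrative, not taken from the notes):

import numpy as np
import statsmodels.api as sm

# Simulated data: one continuous predictor and a binary outcome
rng = np.random.default_rng(0)
x = rng.normal(size=500)
p_true = 1 / (1 + np.exp(-(-0.5 + 0.8 * x)))   # true probabilities, defined on the logit scale
y = rng.binomial(1, p_true)

# Binomial GLM with the default logit link: logit(p) = b0 + b1 * x
X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
print(fit.summary())   # b1 estimates the log odds ratio for a one-unit change in x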
In all linear models we interpret the effect of a parameter
in the same way: the change in the linear component due to a
one-unit change in the independent variable
Under logistic regression, a change in the linear component is
a change in the log of the odds, and thus the parameter estimates
can be interpreted as log odds ratios for cases that
differ only by one unit in an explanatory variable:
$\beta_j = \log\left(\frac{p_1/(1-p_1)}{p_0/(1-p_0)}\right)$
where $p_1$ and $p_0$ are the probabilities for two cases that differ
by one unit in $x_j$; equivalently, $e^{\beta_j}$ is the odds ratio
This is as true for dummy variables as for continuous variables
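For example (an illustrative figure, not from the notes), a coefficient of 0.7 on the log-odds scale corresponds to an odds ratio of about 2:

import numpy as np

# Hypothetical logistic regression coefficient: a one-unit increase in the
# explanatory variable adds 0.7 to the log odds of the outcome
beta = 0.7
print(np.exp(beta))   # ~2.01: the odds roughly double per one-unit increase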
But what of the link between parameters and probability?
The effect of a given change in the linear component on the
probability depends strongly on the starting point: for
probabilities in the mid range it is much bigger than for
probabilities closer to 0 or 1
We can present predicted changes in probability for a given
starting point - this is sometimes useful in presentation, but
generally it is better to get used to (log) odds-ratios.
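A small numerical illustration of this point, assuming a logit link and an arbitrary one-unit change in the linear component (the baseline probabilities are chosen purely for illustration):

import numpy as np

def inv_logit(z):
    return 1 / (1 + np.exp(-z))

# Add the same amount (here 1.0) to the linear component, starting from
# different baseline probabilities: the resulting change in probability is
# much larger in the mid range than near 0 or 1
for p0 in (0.05, 0.5, 0.95):
    z0 = np.log(p0 / (1 - p0))       # baseline value of the linear component
    p1 = inv_logit(z0 + 1.0)
    print(p0, round(p1 - p0, 3))     # changes of ~0.075, ~0.231 and ~0.031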
The logit and probit transformations are symmetric, so the
effect of a one-unit increase of a variable at p=0.9 is the same
as that of a one-unit decrease at p=0.1.
In most applications this is acceptable, but for situations
where it is not, the third and fourth links (the complementary
log-log and log-log) provide asymmetric alternatives.
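The sketch below (with illustrative starting probabilities, not from the notes) contrasts the symmetric logit with the asymmetric complementary log-log:

import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(z):
    return 1 / (1 + np.exp(-z))

def cloglog(p):
    return np.log(-np.log(1 - p))

def inv_cloglog(z):
    return 1 - np.exp(-np.exp(z))

# Logit: a one-unit increase starting at p = 0.9 and a one-unit decrease
# starting at p = 0.1 change the probability by the same amount
print(inv_logit(logit(0.9) + 1) - 0.9)       # ~ +0.061
print(inv_logit(logit(0.1) - 1) - 0.1)       # ~ -0.061

# Complementary log-log: the same two moves are no longer mirror images
print(inv_cloglog(cloglog(0.9) + 1) - 0.9)   # ~ +0.098
print(inv_cloglog(cloglog(0.1) - 1) - 0.1)   # ~ -0.062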