On 11-Mar-08 08:58:55, Werner Wernersen wrote:
Hi,
could anyone explain to me what this warning exactly means and
what the consequences are? Is it due to the fact that there are
very extreme observations / outliers included, or what is the
reason for it?
Thanks so much,
Werner
What it means is exactly what it says. How it arises will probably
be some variant of the following kind of data (I'm guessing that
your application of glm() was to data with 0/1 responses, as in a
logistic regression):
  X = 0.0  0.5  1.0  1.5  2.0  2.5  3.0  ...
  Y =  0    0    0    1    1    1    1   ...
i.e. all the 0's occur on one side of a value (say 1.25) of X, and
all the 1's occur on the other side.
If you take a model (e.g. logistic):
  P(Y=1 | X) = exp((X-a)*b)/(1 + exp((X-a)*b))
then, for any finite values of a and b, the formula will give a
value >0 for P(Y=1 | X) where X < 1.25 (i.e. where Y=0), so
P(Y=0 | X) < 1; and a value <1 for P(Y=1 | X) where X > 1.25
(i.e. where Y=1). So no finite a and b reproduce the observed
data exactly.
However, if you take say a=1.25 (a value which separates the 0's
from the 1's), and then let b -> infinity, then you will find that
P(Y=0 | X) -> 1, P(Y=1 | X) -> 0, for X < 1.25
P(Y=0 | X) -> 0, P(Y=1 | X) -> 1, for X > 1.25
so the limit as b -> inf perfectly predicts the observed outcome.
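To see the limit numerically, here is a minimal sketch in Python (not R's glm(); just the formula above evaluated directly). It assumes the data are only the seven values shown before the "..." and uses a hypothetical helper name p_y1:

```python
import math

def p_y1(x, a, b):
    """P(Y=1 | X) = exp((X-a)*b) / (1 + exp((X-a)*b)), evaluated stably."""
    z = (x - a) * b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Only the values shown in the post; the "..." part is not filled in.
X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
Y = [0, 0, 0, 1, 1, 1, 1]
a = 1.25  # any value between 1.0 and 1.5 separates the 0's from the 1's

# As b grows, the probabilities move towards 0 (X < a) and 1 (X > a).
for b in (1, 10, 100):
    print(b, [round(p_y1(x, a, b), 4) for x in X])
```

For every finite b the printed values stay strictly between 0 and 1; they only approach the observed 0/1 pattern as b increases.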
However, the value of a is indeterminate so long as it is between
the largest X for the Y=0 observations, and the smallest X for the
Y=1 observations.
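The same point can be made in terms of the log-likelihood. The following Python sketch (again not glm(), and again assuming only the seven data values shown above; log_lik is a hypothetical helper) evaluates the log-likelihood for several values of a inside the gap and for increasing b:

```python
import math

X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
Y = [0, 0, 0, 1, 1, 1, 1]

def log_lik(a, b):
    """Bernoulli log-likelihood of the logistic model with parameters a, b."""
    total = 0.0
    for x, y in zip(X, Y):
        z = (x - a) * b
        s = z if y == 1 else -z  # log P(observed y) = log(sigmoid(s))
        if s >= 0:
            total += -math.log1p(math.exp(-s))
        else:
            total += s - math.log1p(math.exp(s))
    return total

# Any a strictly between 1.0 and 1.5 behaves the same way:
# the log-likelihood keeps rising towards 0 as b grows.
for a in (1.1, 1.25, 1.4):
    print(a, [log_lik(a, b) for b in (1, 10, 100)])
```

Since the log-likelihood keeps increasing with b for each such a, there is no finite maximum-likelihood estimate, and the data do not single out one value of a.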
This situation cannot arise with data where the largest X for
which Y=0 is greater than the smallest X for which Y=1, e.g.
  X = 0.0  0.5  1.0  1.5  2.0  2.5  3.0  ...
  Y =  0    0    1    0    1    1    1   ...
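For a single covariate the condition is easy to check directly. A small Python sketch, with a hypothetical function name and using only the values shown in the two examples:

```python
def separated_1d(xs, ys):
    """True if all X with Y=0 lie strictly on one side of all X with Y=1."""
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    if not x0 or not x1:
        return True  # only one outcome observed: trivially separated
    return max(x0) < min(x1) or max(x1) < min(x0)

X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
first = separated_1d(X, [0, 0, 0, 1, 1, 1, 1])   # first example
second = separated_1d(X, [0, 0, 1, 0, 1, 1, 1])  # second example
print(first, second)
```

The first data set is separated (the 0's stop at X=1.0 and the 1's start at X=1.5); the second is not, because a 0 at X=1.5 lies above a 1 at X=1.0.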
The above example is a very simple example of what is called
"linear separation". It arises more generally when there are
several covariates X1, X2, ... , Xk and there is a linear function
L = a1*X1 + a2*X2 + ... + ak*Xk
for which (with the data as observed) there is a value L0 such that
Y = 0 for all the data such that L < L0
Y = 1 for all the data such that L > L0
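As an illustration of this definition, the Python sketch below checks whether a given set of coefficients a1,...,ak and threshold L0 separates a data set. The function name and the four data points are made up for the example; they are not from your data:

```python
def separates(coefs, L0, rows, ys):
    """True if L = sum(a_i * X_i) is < L0 wherever Y=0 and > L0 wherever Y=1."""
    for row, y in zip(rows, ys):
        L = sum(a * x for a, x in zip(coefs, row))
        if y == 0 and not L < L0:
            return False
        if y == 1 and not L > L0:
            return False
    return True

# Hypothetical data with k = 2 covariates (X1, X2).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [0, 0, 1, 1]

print(separates((0.0, 1.0), 0.5, rows, ys))  # L = X2 splits the outcomes
print(separates((1.0, 0.0), 0.5, rows, ys))  # L = X1 does not
```

This only verifies a proposed L and L0; it does not search for one.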
In particular, if ever the number of covariates (k) is greater than
(n-2), where n is the number of cases in your data, then you have
(k+1) or fewer points in k dimensions, and (provided the points are
in general position) there will be a hyperplane L = L0, with L as
given above, which will separate the (X1,...,Xk)-points where Y=0
from the (X1,...,Xk)-points where Y=1. Regardless of how you assign
labels "Y=0" and "Y=1" to (k+1) or fewer such points, they will be
linearly separable.
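A quick check of this for k = 2 and n = 3, as a Python sketch. It uses three made-up, non-collinear points and a plain perceptron search (my choice for the illustration, nothing to do with how glm() fits) to look for a separating L and L0 under every possible labelling:

```python
from itertools import product

def find_separator(points, labels, max_epochs=1000):
    """Perceptron search for (coefs, L0) with L > L0 where the label is 1
    and L < L0 where it is 0. Returns None if none is found in time."""
    k = len(points[0])
    w = [0.0] * (k + 1)  # last entry is the bias, i.e. -L0
    for _ in range(max_epochs):
        mistakes = 0
        for p, lab in zip(points, labels):
            s = 1 if lab == 1 else -1
            xa = list(p) + [1.0]
            if s * sum(wi * xi for wi, xi in zip(w, xa)) <= 0:
                w = [wi + s * xi for wi, xi in zip(w, xa)]
                mistakes += 1
        if mistakes == 0:
            return w[:k], -w[k]
    return None

# Hypothetical points: k = 2 covariates, n = k + 1 = 3 cases, not collinear.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
results = {labels: find_separator(points, labels)
           for labels in product((0, 1), repeat=3)}
print(all(r is not None for r in results.values()))
```

With three collinear points the argument fails (label the middle point differently from the two outer ones), which is why the points are chosen in general position here.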
Even if k < n-1, so that they are not *in general* linearly
separated, there is still a positive probability that you can get
data for which they are linearly separated; and then the same
situation arises. This probability increases as the number of
covariates (k) increases.
What the warning message is telling you is that a perfect fit is
possible within the parametrisation of the model: a probability
P(Y=1)=0 is fitted to cases where the observed Y = 0; and a
probability P(Y=1)=1 is fitted to cases where the observed Y = 1.
Best wishes,
Ted.