Peter Dalgaard writes (in response to a question about 2-way ANOVA
with imbalance):
... There are various
boneheaded ways in which people try to assign some kind of
SumSq to main effects in the presence of interaction, and they are all
wrong - although maybe not very wrong if the unbalance is slight.
People keep saying this --- very vehemently --- and it is NOT TRUE.
Point 1 --- imbalance is really irrelevant here, a fact which
is usually (always?) overlooked. If the design is balanced,
all ``types'' of sums of squares are the same. Even then, the
sequential sums of squares which R will happily produce might well
contain ``significant'' values for SSA and/or SSB ***and*** a
significant value for the interaction sum of squares, SSAB.
Point 2 --- What does such ``significance'' ***mean***? It is not
correct to say that it means nothing at all. The significance
of say, SSA, reports on the result of the test of a hypothesis.
This hypothesis is a ***meaningful*** hypothesis. It may well not be
an important hypothesis, or a particularly interesting hypothesis,
or a hypothesis that the experimenter actually cares about.
It is substantially different from the hypothesis which is tested
by SSA when there is no interaction. (Different, but related.)
Bill Venables fulminates that consideration of such a hypothesis is
contrary to the fundamental philosophy of statistical modelling, and
thereby an abomination in the sight of God, and probably Politically
Incorrect to boot. This may well be so. Nonetheless it ***is***
a well-defined and meaningful hypothesis.
Rather than dismissing the testing of such a hypothesis as being
``bone-headed'', the guru should point out to the disciple
  (a) just what hypothesis is being tested,
  (b) that this hypothesis packs a substantially different
      load of freight than that which is tested when there is
      no interaction, and
  (c) that the disciple should carefully search his or her
      soul as to whether the hypothesis which is being tested
      is of any actual interest.
This would go much further toward bringing the disciple to true
enlightenment.
Point 3 --- what hypothesis is being tested by SSA?
Let factor A correspond to index i, and B to index j.
Let the cell means be mu_ij. (In the overparameterized
notation, mu_ij = mu + alpha_i + beta_j + gamma_ij.)
The hypothesis being tested is
H_0: mu_1.-bar = mu_2.-bar = ... = mu_a.-bar
where factor A has a levels, and ``mu_i.-bar'' means
the average (arithmetic mean) of mu_i1, mu_i2, ..., mu_ib.
(Note --- factor B has b levels.)
I.e. the hypothesis is that there is no difference, on average,
between the levels of A, the average being taken over the levels
of B.
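In conventional notation the hypothesis reads as follows. This is only a restatement of the definitions already given (cell means mu_ij, a levels of A, b levels of B); the second line substitutes the overparameterized form and introduces nothing new.

```latex
\bar{\mu}_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b}\mu_{ij}
                   = \mu + \alpha_i + \bar{\beta}_{\cdot} + \bar{\gamma}_{i\cdot},
\qquad
H_0:\ \bar{\mu}_{1\cdot} = \bar{\mu}_{2\cdot} = \cdots = \bar{\mu}_{a\cdot}
```

Since mu and the average of the beta_j do not depend on i, H_0 is equivalent to saying that alpha_i plus the average over j of gamma_ij takes the same value for every i.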
Now taking such an average may not be a sensible thing to do,
but it is perfectly well-defined, and thus a ***meaningful***
hypothesis is being tested. (The meaning of which the hypothesis
is full might not be very exciting, but that is more of a practical
than a statistical issue.)
Note that the hypothesis being tested, while possibly of dubious
import, is perfectly comprehensible to the human mind.
(Remark: In real life, if we were really interested in averaging
over the levels of B at all, we would probably want a ***weighted***
average, with the weights corresponding to the preponderance of
the levels of B in the population.)
Note that if there is no interaction (if the gamma_ij are all zero)
then the hypothesis being tested is that for each fixed j, the mu_ij
are all ***identical*** (say mu_ij = tau_j) and hence the averages
over j are equal (mu_i.-bar = tau.-bar, independent of i.)
This is all easier to think about graphically. For each j, plot the
mu_ij against the index i, giving a ``profile''. ``No interaction''
means that all profiles are parallel. No interaction and no A
effect means that all profiles are horizontal.
If the profiles are parallel, then all profiles will be horizontal
if and only if their mean is horizontal.
However if the profiles are ***not*** parallel (i.e. if there is
interaction) their means may be horizontal anyhow.
Let me repeat: This horizontality may not be of much interest if
the profiles are not parallel, but it is a perfectly well-defined
concept, and testing for it makes perfect sense in the abstract.
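The profile picture can be checked numerically. The following is a minimal sketch in Python (chosen only for illustration; nothing in it is specific to R), and the cell means are invented numbers, not data from any experiment. It builds one profile per level of B, tests whether the profiles are parallel, and tests whether their unweighted mean is horizontal.

```python
# Hypothetical cell means mu[i][j]: rows are levels of A (index i),
# columns are levels of B (index j). Values are invented for illustration.
mu = [
    [1.0, 3.0],
    [2.0, 2.0],
    [3.0, 1.0],
]
a = len(mu)      # number of levels of A
b = len(mu[0])   # number of levels of B

# Profile for level j of B: the cell means mu_ij read off against i.
profiles = [[mu[i][j] for i in range(a)] for j in range(b)]

# Mean profile: unweighted average over the levels of B, for each i.
mean_profile = [sum(mu[i]) / b for i in range(a)]


def is_parallel(p, q, tol=1e-12):
    """Two profiles are parallel when their difference is constant in i."""
    diffs = [x - y for x, y in zip(p, q)]
    return all(abs(d - diffs[0]) < tol for d in diffs)


parallel = all(is_parallel(profiles[0], p) for p in profiles[1:])
horizontal_mean = all(abs(m - mean_profile[0]) < 1e-12 for m in mean_profile)

print("profiles:", profiles)
print("mean profile:", mean_profile)
print("parallel (no interaction):", parallel)
print("mean profile horizontal:", horizontal_mean)
```

With these made-up values the two profiles cross, so there is interaction, yet the mean profile is flat: the null hypothesis of Point 3 holds even though A clearly matters within each level of B.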
Point 4 --- on the (remote?) chance that we really are interested in
the above horizontality, and if the design is in fact NOT BALANCED,
then the much maligned type III sums of squares are ***definitely***
called for. Type III sums of squares will test the null hypothesis
stated in Point 3, irrespective of balance. Sequential sums of
squares will test another, different, and totally bizarre hypothesis.
(Again a perfectly ``meaningful'' hypothesis, but one such that the
meaning is really too convoluted to admit any sort of comprehension
by the human mind. Moreover this hypothesis is dependent on the
design structure, rendering it even more unlikely to be of any
interest, even if one could understand what it is saying.)
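A small sketch of the design dependence, in Python with invented cell means and cell counts. It relies on one standard fact: for the factor fitted first (A, ignoring B), the sequential sum of squares compares the count-weighted row means, sum_j n_ij mu_ij / n_i., rather than the unweighted averages of Point 3. The hypotheses for terms fitted later are more involved and are not reproduced here.

```python
# Hypothetical cell means mu[i][j] and cell counts n[i][j] (invented numbers).
mu = [[10.0, 20.0],
      [20.0, 10.0]]
n_unbalanced = [[8, 2],
                [5, 5]]
n_balanced = [[5, 5],
              [5, 5]]


def unweighted_row_means(mu):
    """Averages over j with equal weights: the quantities in the Point 3 hypothesis."""
    return [sum(row) / len(row) for row in mu]


def weighted_row_means(mu, n):
    """Averages over j weighted by cell counts: what 'A fitted first' compares."""
    return [
        sum(count * mean for count, mean in zip(n_row, mu_row)) / sum(n_row)
        for mu_row, n_row in zip(mu, n)
    ]


unweighted = unweighted_row_means(mu)
weighted = weighted_row_means(mu, n_unbalanced)
weighted_if_balanced = weighted_row_means(mu, n_balanced)

print("unweighted row means:", unweighted)
print("count-weighted row means (unbalanced):", weighted)
print("count-weighted row means (balanced):", weighted_if_balanced)
```

Here the unweighted row means are equal, so the Point 3 null hypothesis is true, while the count-weighted means differ purely because of how the observations happen to be spread over the cells. With equal cell counts the two sets of means coincide, which is the balanced case of Point 1.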
cheers,
Rolf Turner
rolf at maths.unb.ca
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Allow me to make this discussion a little more concrete for my mortal mind.
Let Factor A = grams of nitrogen fertilizer (levels i=1,2)
Let Factor B = watering regime (levels j=1,2)
Let response Y = Yield of soybeans (g / m^2)
Suppose we measure yield in different treatments and find means of:
        A1    A2
B1      50   100
B2      60   150
Suppose further that we have sufficiently small error to detect differences
among all means and (of course) "significant" main effects and significant
interaction. I would argue strongly that adding nitrogen, regardless of
other factors, increases yield. I would also argue that adding water,
regardless of other factors, increases yield. I would also conclude that
adding both together increases yield more than you might expect based on
adding each factor separately. In a messy ecological setting, we frequently
don't know such basic information.
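The arithmetic behind these three conclusions, as a minimal Python sketch using only the four cell means from the table above (the variable names are made up for the illustration):

```python
# Cell means from the table above (soybean yield, g / m^2),
# keyed by (nitrogen level, watering level).
cell_means = {
    ("A1", "B1"): 50, ("A2", "B1"): 100,
    ("A1", "B2"): 60, ("A2", "B2"): 150,
}
a_levels = ["A1", "A2"]
b_levels = ["B1", "B2"]

# Marginal means: unweighted averages over the levels of the other factor.
mean_by_a = {a: sum(cell_means[(a, b)] for b in b_levels) / len(b_levels)
             for a in a_levels}
mean_by_b = {b: sum(cell_means[(a, b)] for a in a_levels) / len(a_levels)
             for b in b_levels}

# Simple effects: the nitrogen effect within each watering regime,
# and the watering effect within each nitrogen level.
nitrogen_effect = {b: cell_means[("A2", b)] - cell_means[("A1", b)] for b in b_levels}
water_effect = {a: cell_means[(a, "B2")] - cell_means[(a, "B1")] for a in a_levels}

# Interaction contrast: how much the nitrogen effect changes between regimes.
interaction = nitrogen_effect["B2"] - nitrogen_effect["B1"]

print("mean by nitrogen level:", mean_by_a)
print("mean by watering regime:", mean_by_b)
print("nitrogen effect within each regime:", nitrogen_effect)
print("water effect within each nitrogen level:", water_effect)
print("interaction contrast:", interaction)
```

Both simple effects of nitrogen are positive, as are both simple effects of water, so in this table the averaged main effects and the within-level effects point the same way; the non-zero interaction contrast only says that the sizes differ.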
A more important example arises when factor B is a random effect (spatially
arrayed blocks outside the control of the experimenter?). In such a case, if
levels of B provide a representative sample of the relevant universe, then a
significant main effect demonstrates an overall trend REGARDLESS of what is
going on within each block. Significant interaction may provide interesting
details in the system, greater insight etc., but in the end, we might be
interested primarily in the overall trend across a landscape.
Comments?
Regards,
Henry Stevens
Hstevens at muohio.edu
----- Original Message -----
From: "Rolf Turner" <rolf at maths.uwa.edu.au>
To: <r-help at stat.math.ethz.ch>
Sent: Wednesday, October 17, 2001 12:16 AM
Subject: [R] Type III sums of squares.
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear All,
I often come across observational data sets, where the interest is in predicting
the class membership (often more than 2 classes) as a function of several
variables; generally, the number of predictors is very large, and it is also of
interest to make that number as small as possible (for instance, to minimize length
of future questionnaires). I thought that a possible approach would be to use some
kind of stepwise model selection; as criterion for variable selection I would use
the prediction error from models fitted with "multinom" (package nnet), where the
prediction error would be obtained using k-fold cross-validation.
I have seen somewhat similar approaches, but not this one in particular, and since
I'd think the general situation is fairly common to many people, I am wondering
whether the idea makes sense, or if it is a completely misguided and boneheaded
one.
(I think this is relatively easy to implement, comparing the results of
predict.multinom with the true class membership of the hold-out sets; that would be
the value returned by the function "extractAIC.mycvmultinom", and then I would be
able to just call stepAIC on objects of class "mycvmultinom").
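The selection loop described above can be sketched as follows. This is a Python illustration under loud assumptions, not the proposed R implementation: a nearest-centroid rule stands in for "multinom", a plain forward search stands in for stepAIC, the data are simulated, and every function name here is hypothetical.

```python
import random


def nearest_centroid_error(train, test, features):
    """Misclassification rate on `test` of a nearest-centroid rule fitted on `train`.

    Stand-in classifier only; the post proposes multinom from R's nnet package.
    """
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += x[f]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((x[f] - c) ** 2 for f, c in zip(features, centroids[y])),
        )

    return sum(predict(x) != y for x, y in test) / len(test)


def cv_error(data, features, k=5):
    """Mean hold-out error over k folds (fold i takes every k-th row starting at i)."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        errors.append(nearest_centroid_error(train, folds[i], features))
    return sum(errors) / k


def forward_select(data, n_features, k=5):
    """Add the variable that most reduces CV error; stop when nothing improves."""
    selected, best = [], float("inf")
    remaining = list(range(n_features))
    while remaining:
        scores = {f: cv_error(data, selected + [f], k) for f in remaining}
        f_best = min(scores, key=scores.get)
        if scores[f_best] >= best:
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best = scores[f_best]
    return selected, best


# Simulated data: 3 classes; variable 0 is informative, variables 1 and 2 are noise.
rng = random.Random(0)
data = []
for label in range(3):
    for _ in range(40):
        x = [3.0 * label + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
        data.append((x, label))
rng.shuffle(data)

selected, err = forward_select(data, n_features=3)
print("selected variables:", selected)
print("cross-validated error of the selected model:", err)
```

One caution on the design: the cross-validated error that drives the selection is optimistic as an estimate for the final model, because the same folds were used to choose the variables. An honest error estimate would repeat the whole selection inside an outer cross-validation loop.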
Thanks,
Ramón Díaz
Inner Research
Velázquez 109
28006 Madrid
Spain