Split-plot Design

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20080320/fa5530a1/attachment.pl>
Your question is not very clear, but if you are trying to match the
results in Kuehl, you need a fixed-effects model:

dat <- read.table("expl14-1.txt", header=TRUE)
dat$block <- factor(dat$block)
dat$nitro <- factor(dat$nitro)
dat$thatch <- factor(dat$thatch)

colnames(dat) <- c("block","nitro","thatch","chlor")
m1 <- aov(chlor~nitro*thatch+Error(block/nitro), data=dat)
summary(m1)

Mixed-effects models and degrees of freedom have been discussed many
times on this list....search the archives.

K Wright

 Hi listers,

 I've been studying anova and at the book of Kuehl at the chapter
 about split-plot there is a experiment with the results... I am trying to
 understand the experiments and make the code in order to obtain the
 results... But there is something that I didn't understand yet...
 I have a split-plot design (2 blocks) with two facteurs, one
 facteur has 4 treatments and the other facteur is a measure
 taken in three years...
 I organize my data set as:

 Nitro Bloc Year Measure
 a
 x
 1         3.8
 a
  x
 2         3.9
 a         x         3         2.0
 a         y         1         3.7
 a         y         2
 2.4
 a         y         3
 1.2
 b         x
  1         4.0
 b         x
 2         2.5
 and so on...

 So, I am trying this code, because I want to test each factor and the
 interaction...
 lme=lme(measure ~ bloc + nitro + bloc*nitro, random= ~ 1|year,
 data=lme)
 summary(lme)
 The results that I am obtaining are not correct, because
 I calculated the degrees of fredom and they are not
 correct... According to this design I will get two errors one for the
 whole plot and other for the subplot....

 Well, as I told you, I am still learning... Any suggestions...

 Thanks in advance,

 Ribeiro

        [[alternative HTML version deleted]]

_______________________________________________
 R-sig-mixed-models at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

I do not think it quite true that the aov model that has an
Error() term is a fixed effects model.  The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere.  The F-tests surely assume such a model.

Some little time ago, Doug Bates invested me, along with
Peter Dalgaard, a member of the degrees of freedom police.
Problem is, I am unsure of the responsibilities, but maybe
they include commenting on a case such as this.

lme() makes a stab at an appropriate choice of degrees
of freedom, but does not always get it right, to the extent
that there is a right answer.  [lmer() has for the time being
given up on giving degrees of freedom and p-values for
fixed effects estimates.]  This part of the output from lme()
should, accordingly, be used with discretion.  In case of
doubt, check against a likelihood ratio test.  In a simple
enough experimental design, users who understand how
to calculate degrees of freedom will reason them out for
themselves.
John Maindonald.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

Your question is not very clear, but if you are trying to match the
results in Kuehl, you need a fixed-effects model:

dat <- read.table("expl14-1.txt", header=TRUE)
dat$block <- factor(dat$block)
dat$nitro <- factor(dat$nitro)
dat$thatch <- factor(dat$thatch)

colnames(dat) <- c("block","nitro","thatch","chlor")
m1 <- aov(chlor~nitro*thatch+Error(block/nitro), data=dat)
summary(m1)

Mixed-effects models and degrees of freedom have been discussed many
times on this list....search the archives.

K Wright

On Thu, Mar 20, 2008 at 12:39 PM,  <marcioestat at pop.com.br> wrote:

Hi listers,

I've been studying anova and at the book of Kuehl at the chapter
about split-plot there is a experiment with the results... I am  
trying to
understand the experiments and make the code in order to obtain the
results... But there is something that I didn't understand yet...
I have a split-plot design (2 blocks) with two facteurs, one
facteur has 4 treatments and the other facteur is a measure
taken in three years...
I organize my data set as:

Nitro Bloc Year Measure
a
x
1         3.8
a
 x
2         3.9
a         x         3         2.0
a         y         1         3.7
a         y         2
2.4
a         y         3
1.2
b         x
 1         4.0
b         x
2         2.5
and so on...

So, I am trying this code, because I want to test each factor and the
interaction...
lme=lme(measure ~ bloc + nitro + bloc*nitro, random= ~ 1|year,
data=lme)
summary(lme)
The results that I am obtaining are not correct, because
I calculated the degrees of fredom and they are not
correct... According to this design I will get two errors one for the
whole plot and other for the subplot....

Well, as I told you, I am still learning... Any suggestions...

Thanks in advance,

Ribeiro

       [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
I do not think it quite true that the aov model that has an
Error() term is a fixed effects model.  The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere.  The F-tests surely assume such a model.
This is something that has been bothering me for a while.  Often it's 
argued that ANOVA is just regression; clearly this is not true when it's 
a repeated measures ANOVA, unless "regression" is interpreted broadly. 
I think Andrew Gelman argues this somewhere.  I don't see how to get aov 
to give me a formula, and lm doesn't fit stuff with an Error() term, but 
if it could, logically I would expect the formula to resemble closely 
the sort of thing you get with a mixed effects models.

The closest analogy I can find is that doing this...

 > aov1 = aov(yield ~  N+P+K + Error(block), npk)
 > summary(aov1)

Error: block
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5    343      69

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
N          1  189.3   189.3   11.82 0.0037 **
P          1    8.4     8.4    0.52 0.4800
K          1   95.2    95.2    5.95 0.0277 *
Residuals 15  240.2    16.0
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

[I believe the F's and p's here come from a linear construction of 
nested models (i.e., N versus intercept only; N+P versus N; plus N+P+K 
versus N+P).]

... is a bit like doing this with lmer...

 > lmer0 = lmer(yield ~  1 + (1|block), npk)
 > lmer1 = lmer(yield ~  N + (1|block), npk)
 > lmer2 = lmer(yield ~  N+P + (1|block), npk)
 > lmer3 = lmer(yield ~  N+P+K + (1|block), npk)
 > anova(lmer0,lmer1)
Data: npk
Models:
lmer0: yield ~ 1 + (1 | block)
lmer1: yield ~ N + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer0  2 157.5 159.8  -76.7
lmer1  3 151.5 155.1  -72.8  7.93      1     0.0049 **
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 > anova(lmer1,lmer2)
Data: npk
Models:
lmer1: yield ~ N + (1 | block)
lmer2: yield ~ N + P + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1  3 151.5 155.1  -72.8
lmer2  4 153.0 157.8  -72.5  0.47      1       0.49
 > anova(lmer2,lmer3)
Data: npk
Models:
lmer2: yield ~ N + P + (1 | block)
lmer3: yield ~ N + P + K + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer2  4 153.0 157.8  -72.5
lmer3  5 149.0 154.9  -69.5  6.02      1      0.014 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Namely, fitting the models, and making comparisons using likelihood 
ratio tests.  Okay, the LLR is asymptotically chisq distributed with df 
the difference in parameters whereas... well F is different - I always 
get confused with the df and what bits of the residuals plug in where.

But, is this vaguely right?  (I'm aware that order matters when the 
design isn't balanced, and have read much about the type I errors versus 
type III errors disagreements.)

Actually this does lead to one question: following through the analogy, 
are repeated measures ANOVAs (those fitted with aov) always random 
intercept models, i.e. with no random slopes?

Cheers,

Andy
Andy Fugard, Postgraduate Research Student
Psychology (Room F3), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk
The following (which uses the variable names in the original query)
shows that "lme" and "aov" produce the same results:

(mod <- lme(measure ~ nitro*year,random= ~ 1|bloc/nitro))
anova(mod)
summary(aov(measure ~ nitro*year + Error(bloc/nitro)))

(In the text, Kuehl rounded off the mean squares before computing
the F statistics, so they don't match.)

This is the classical "split plot" analysis, and for balanced data
there is no controversy over the F statistics.  Of course as always
we must assume (I prefer "pretend") that the random effects, including
the "errors", have normal distributions with constant variances.

The trouble arises with unbalanced data and/or more complex models.
Hopefully, reliable and convenient inference methods for those cases
will be developed, but until then those of us who continue to use
"nlme" (or SAS, or ...) must understand that we do so at our own risk.

Regards,   Rob Kushler
John Maindonald wrote:
I do not think it quite true that the aov model that has an
Error() term is a fixed effects model.  The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere.  The F-tests surely assume such a model.
This is something that has been bothering me for a while.  Often it's 
argued that ANOVA is just regression; clearly this is not true when it's 
a repeated measures ANOVA, unless "regression" is interpreted broadly. 
I think Andrew Gelman argues this somewhere.  I don't see how to get aov 
to give me a formula, and lm doesn't fit stuff with an Error() term, but 
if it could, logically I would expect the formula to resemble closely 
the sort of thing you get with a mixed effects models.

The closest analogy I can find is that doing this...

 > aov1 = aov(yield ~  N+P+K + Error(block), npk)
 > summary(aov1)
Error: block
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5    343      69

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
N          1  189.3   189.3   11.82 0.0037 **
P          1    8.4     8.4    0.52 0.4800
K          1   95.2    95.2    5.95 0.0277 *
Residuals 15  240.2    16.0
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

[I believe the F's and p's here come from a linear construction of 
nested models (i.e., N versus intercept only; N+P versus N; plus N+P+K 
versus N+P).]

... is a bit like doing this with lmer...

 > lmer0 = lmer(yield ~  1 + (1|block), npk)
 > lmer1 = lmer(yield ~  N + (1|block), npk)
 > lmer2 = lmer(yield ~  N+P + (1|block), npk)
 > lmer3 = lmer(yield ~  N+P+K + (1|block), npk)
 > anova(lmer0,lmer1)
Data: npk
Models:
lmer0: yield ~ 1 + (1 | block)
lmer1: yield ~ N + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer0  2 157.5 159.8  -76.7
lmer1  3 151.5 155.1  -72.8  7.93      1     0.0049 **
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 > anova(lmer1,lmer2)
Data: npk
Models:
lmer1: yield ~ N + (1 | block)
lmer2: yield ~ N + P + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1  3 151.5 155.1  -72.8
lmer2  4 153.0 157.8  -72.5  0.47      1       0.49
 > anova(lmer2,lmer3)
Data: npk
Models:
lmer2: yield ~ N + P + (1 | block)
lmer3: yield ~ N + P + K + (1 | block)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer2  4 153.0 157.8  -72.5
lmer3  5 149.0 154.9  -69.5  6.02      1      0.014 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Namely, fitting the models, and making comparisons using likelihood 
ratio tests.  Okay, the LLR is asymptotically chisq distributed with df 
the difference in parameters whereas... well F is different - I always 
get confused with the df and what bits of the residuals plug in where.

But, is this vaguely right?  (I'm aware that order matters when the 
design isn't balanced, and have read much about the type I errors versus 
type III errors disagreements.)

Actually this does lead to one question: following through the analogy, 
are repeated measures ANOVAs (those fitted with aov) always random 
intercept models, i.e. with no random slopes?

Cheers,

Andy

On Thu, Mar 20, 2008 at 7:04 PM, John Maindonald
I do not think it quite true that the aov model that has an
 Error() term is a fixed effects model.  The use of the word
 "stratum" implies that a mixed effects model is lurking
 somewhere.  The F-tests surely assume such a model.
 Some little time ago, Doug Bates invested me, along with
 Peter Dalgaard, a member of the degrees of freedom police.
 Problem is, I am unsure of the responsibilities, but maybe
 they include commenting on a case such as this.
Indeed, I should have been more explicit regarding the duties of a
member of the degrees of freedom police when I appointed you :-)  They
are exactly what you are doing - to weigh in on discussions of degrees
of freedom.

Perhaps I can expand a bit on why I think that the degrees of freedom
issue is not a good basis for approaching mixed models in general.

Duncan Temple Lang used to have a quote about "Language shapes the way
we think." in his email signature and I think similar ideas apply to
how we think about models.  Certainly the concept of error strata and
the associated degrees of freedom are one way of thinking about
mixed-effects models and they are effective in the case of completely
balanced designs.  If one has a completely balanced design and the
number of units at various strata is small then decomposition of the
variability into error strata is possible and the calculation of
degrees of freedom is important (because the degrees of freedom are
small - exact values of degrees of freedom are unimportant when they
are large).  Such data occur frequently - in text books.  In fact,
they are the only type of data that occur in most text books and hence
they have shaped the way we think about the models.  Generations of
statisticians have looked at techniques for small, balanced data sets
and decided that these define "the" approach to the model.  So when
confronted with large observational (and, hence, unbalanced or
"messy") data sets they approach the modeling with this mind set.

Certainly such data did occur in some agricultural trials (I'm not
sure if this is the state of the art in current agricultural trials)
and this form of analysis was important but it depends strongly on
balanced data and balance is a very fragile characteristic in most
data.  The one exception is data in text books - most often data for
examples in older texts were added as an afterthought and, for space
reasons and to illustrate the calculations, were small data sets which
just happened to be completely balanced so that all the calculations
worked out.   Fortunately this is no longer the case in books like
your book with Braun where the data sets are real data and available
separately from the book in R packages but it will take many years to
flush the way of thinking about models induced by those small,
balanced data sets from statistical "knowledge".  (There is an
enormous amount of inertia built into the system.  Most introductory
statistics textbooks published today, and probably those published for
the next couple of decades, will include "statistical tables" as
appendices because, as everyone knows, all of the statistical analysis
we do in practice involves doing a few calculations on a hand
calculator and looking up tabulated values of distributions.)

Getting back to the "language shapes the way we think" issue, I would
compare it to strategy versus tactics.  I spent a sabbatical in
Australia visiting Bill Venables.  I needed to adjust to driving on
the left hand side of the road and was able to do that after a short
initial period of confusion and discomfort.  If I think of places in
Adelaide now it is natural for me to think of myself sitting on the
right hand side of the front seat and driving on the left hand side of
the road.  That's tactics.  However, even after living in Adelaide for
several months I remember leaving the parking lot of a shopping mall
and choosing an indirect route out the back instead of the direct
route ahead because the direct route would involve a left hand turn
onto a busy road.  Instead I chose to make two right hand turns to get
to an intersection with traffic lights where I could make the left
turn.  At the level of strategy I still "knew" that left hand turns
are difficult and right hand turns are easy - exactly the wrong
approach in Australia.

This semester I am attending a seminar on hierarchical linear models
organized in our social sciences school by a statistician whose
background is in agricultural statistics.  I make myself unpopular
because I keep challenging the ideas of the social scientists and of
the organizer.  The typical kinds of studies we discuss are
longitudinal studies where the "observational units" (i.e. people) are
grouped within some social contexts.  These are usually large, highly
unbalanced data  sets. Because they are longitudinal they inevitably
involve some migration of people from one group to another.

The migration means that random effects for person and for the various
types of groups are not nested.  The models are not hierarchical or at
least they should not be.  Trying to jam the analysis of such data
into the hierarchical/multilevel framework of software like HLM or
MLwin doesn't work well but that certainly won't stop people from
trying.  The hierarchical language in which they learned the model
affects their strategy.

The agricultural statistician, by contrast, doesn't worry about the
nesting etc.,  For him the sole issue of important is to determine the
number of degrees of freedom associated with the various error strata.
 These studies involve tens of thousands of subjects and are nowhere
close to being balanced but his strategy is still based on having a
certain number of blocks and plots within the blocks and subplots
within the plots and everything is exactly balanced.

So I am not opposed to approaches based on error strata and computing
degrees of freedom where appropriate and where important.  But try not
to let the particular case of completely balanced small data sets
determine the strategy of the approach to all models involving random
effects.  It's fragile and does not generalize well, just as assuming
a strict hierarchy of factors associated with random effects is
fragile.  Balance is the special case.  Nesting is the special case.
It is a bad idea to base strategy on what happens in very special
cases.

they are always balanced and small data sets

Another very fragile
 lme() makes a stab at an appropriate choice of degrees
 of freedom, but does not always get it right, to the extent
 that there is a right answer.  [lmer() has for the time being
 given up on giving degrees of freedom and p-values for
 fixed effects estimates.]  This part of the output from lme()
 should, accordingly, be used with discretion.  In case of
 doubt, check against a likelihood ratio test.  In a simple
 enough experimental design, users who understand how
 to calculate degrees of freedom will reason them out for
 themselves.
 John Maindonald.

 John Maindonald             email: john.maindonald at anu.edu.au
 phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
 Centre for Mathematics & Its Applications, Room 1194,
 John Dedman Mathematical Sciences Building (Building 27)
 Australian National University, Canberra ACT 0200.

 On 21 Mar 2008, at 9:23 AM, Kevin Wright wrote:

 > Your question is not very clear, but if you are trying to match the
 > results in Kuehl, you need a fixed-effects model:
 >
 > dat <- read.table("expl14-1.txt", header=TRUE)
 > dat$block <- factor(dat$block)
 > dat$nitro <- factor(dat$nitro)
 > dat$thatch <- factor(dat$thatch)
 >
 > colnames(dat) <- c("block","nitro","thatch","chlor")
 > m1 <- aov(chlor~nitro*thatch+Error(block/nitro), data=dat)
 > summary(m1)
 >
 > Mixed-effects models and degrees of freedom have been discussed many
 > times on this list....search the archives.
 >
 > K Wright
 >
 >
 > On Thu, Mar 20, 2008 at 12:39 PM,  <marcioestat at pop.com.br> wrote:
 >>
 >>
 >>
 >> Hi listers,
 >>
 >> I've been studying anova and at the book of Kuehl at the chapter
 >> about split-plot there is a experiment with the results... I am
 >> trying to
 >> understand the experiments and make the code in order to obtain the
 >> results... But there is something that I didn't understand yet...
 >> I have a split-plot design (2 blocks) with two facteurs, one
 >> facteur has 4 treatments and the other facteur is a measure
 >> taken in three years...
 >> I organize my data set as:
 >>
 >> Nitro Bloc Year Measure
 >> a
 >> x
 >> 1         3.8
 >> a
 >>  x
 >> 2         3.9
 >> a         x         3         2.0
 >> a         y         1         3.7
 >> a         y         2
 >> 2.4
 >> a         y         3
 >> 1.2
 >> b         x
 >>  1         4.0
 >> b         x
 >> 2         2.5
 >> and so on...
 >>
 >>
 >> So, I am trying this code, because I want to test each factor and the
 >> interaction...
 >> lme=lme(measure ~ bloc + nitro + bloc*nitro, random= ~ 1|year,
 >> data=lme)
 >> summary(lme)
 >> The results that I am obtaining are not correct, because
 >> I calculated the degrees of fredom and they are not
 >> correct... According to this design I will get two errors one for the
 >> whole plot and other for the subplot....
 >>
 >> Well, as I told you, I am still learning... Any suggestions...
 >>
 >> Thanks in advance,
 >>
 >> Ribeiro
 >>
 >>
 >>        [[alternative HTML version deleted]]
 >>
 >>
 >> _______________________________________________
 >> R-sig-mixed-models at r-project.org mailing list
 >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
 >>
 >>
 >
 > _______________________________________________
 > R-sig-mixed-models at r-project.org mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

 _______________________________________________
 R-sig-mixed-models at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

I like this - a good robust defence!  Let the debate go on.

Designs that are in some sense balanced are, as far as I can judge,  
still the staple of agricultural trials.  The sorts of issues that  
arise in time series modeling must also be increasingly important.   
The thinking behind those designs is important, and ought to penetrate  
more widely.

That general style of design is not however limited to agricultural  
experimentation.  It is used widely, with different wrinkles, in  
psychology in industrial experimentation, in medicine, in laboratory  
experimentation, and with internet-based experimentation a new major  
area of application.  It ought to be used much more widely outside of  
agriculture.  Agriculture has been somewhat unique in having  
professional statisticians working alongside agricultural scientists  
for many decades now.

The wrinkles can be important.  Fisher famously commented on the  
frequent need to ask Nature more than one question at a time:
"No aphorism is more frequently repeated in connection with field  
trials, than that we must ask Nature few questions, or, ideally, one  
question, at a time. The writer is convinced that this view is wholly  
mistaken. Nature, he suggests, will best respond to a logical and  
carefully thought out questionnaire"

For each fixed effect and interaction, it is however necessary to be  
clear what is the relevant error term.  Some time ago, someone posted  
a question about the analysis of a psychology experiment where the  
number of possible error terms was very large, where some pooling was  
desirable, and where it was not at all clear what terms in the anova  
table to pool, and where d.f, were so small that the mean squares did  
not provide much clue.  (I looked for the email exchange, but could  
not immediately find it.) That is not a good design.  There may be  
much more potential for this sort of difficultly in psychology as  
compared to agriculture.  In agriculture the sites/blocks/plots/ 
subplots hierarchy is the order of the day, albeit often crossed with  
seasons to ensure that the analysis is not totally obvious.

The data sets are often larger than formerly. (But not always, again  
note some psychology experiments. Or they may not be large in the  
sense of allowing many degrees of freedom for error) Large datasets  
readily arise when, as for the internet-based experimentation,  
randomization can be done and data collected automatically.  (Some  
advertisers are randomizing their pop-up ads, to see which gives the  
best response.) With largish degrees of freedom for error, it is no  
longer necessary to worry about exact balance.  I consider Doug that  
you are too tough on the textbooks.  Maybe they ought to be branching  
out from agricultural experimentation more than they do; that is as  
much as I'd want to say.  There are any number of examples from  
psychology, some of them very interesting, in the psychology books on  
experimental design.  (Data from published papers ought nowadays as a  
matter of course go into web-based archives - aside from other  
considerations this would provide useful teaching resources. In some  
areas, this is already happening.)

Degrees of freedom make sense in crossed designs also; it is the F- 
statistics for SEs for fixed effects that can be problematic.  It may  
happen that one or two sources of error (maybe treatments by years)  
will dominate to such an extent that other sources can pretty much be  
ignored.  The conceptual simplification is worth having; its utility  
may not be evident from a general multi-level modeling perspective.

Maybe one does not want statisticians to be too dyed in the wool  
agricultural.  Still, a bit of that thinking goes a long way, not  
least among social scientists.  The arguments in Rosenbaum's  
"Observational Data" are much easier to follow if one comes to them  
with some knowledge and practical experience of old-fashioned  
experimental design.  That seems to me a good indication of the quite  
fundamental role of those ideas, even if one will never do a  
randomized experiment.

I'd have every novice statistician do apprenticeship's that include  
experience in horticultural science (they do not have a long tradition  
of working alongside professional statisticians), medicine and  
epidemiology, business statistics, and somewhere between health social  
science and psychology.  It is unfortunate that I did not myself have  
this broad training, which may go some way to explaining why I am not  
in complete agreement with Doug's sentiments!!

This is not to defend, in any large variety of places and  
circumstances, inference that relies on degrees of freedom.  Not that  
my point relate directly to degrees of freedom.  On degrees of  
freedom, I judge them to have a somewhat wider usefulness than Doug  
will allow.  It would be nice if lmer() were to provide Kenward &  
Rogers style degrees of freedom ( as the commercial ASReml-R software  
does), but Doug is right to press the dangers of giving information  
that can for very unbalanced designs be misleading.  Given a choice  
between mcmcsamp() and degrees of freedom, maybe I would choose  
mcmcsamp().

There's enlightening discussion of the opportunities that internet- 
based business offers for automated data collection and for  
experimentation on website visitors, in Ian Ayres "SuperCrunchers. Why  
thinking by numbers is the new way to be smart" (Bantam).   
Fortunately, the hype of the title pretty much goes once on gets into  
the text.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On Thu, Mar 20, 2008 at 7:04 PM, John Maindonald
<john.maindonald at anu.edu.au> wrote:
I do not think it quite true that the aov model that has an
Error() term is a fixed effects model.  The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere.  The F-tests surely assume such a model.

Some little time ago, Doug Bates invested me, along with
Peter Dalgaard, a member of the degrees of freedom police.
Problem is, I am unsure of the responsibilities, but maybe
they include commenting on a case such as this.
Indeed, I should have been more explicit regarding the duties of a
member of the degrees of freedom police when I appointed you :-)  They
are exactly what you are doing - to weigh in on discussions of degrees
of freedom.

Perhaps I can expand a bit on why I think that the degrees of freedom
issue is not a good basis for approaching mixed models in general.

Duncan Temple Lang used to have a quote about "Language shapes the way
we think." in his email signature and I think similar ideas apply to
how we think about models.  Certainly the concept of error strata and
the associated degrees of freedom are one way of thinking about
mixed-effects models and they are effective in the case of completely
balanced designs.  If one has a completely balanced design and the
number of units at various strata is small then decomposition of the
variability into error strata is possible and the calculation of
degrees of freedom is important (because the degrees of freedom are
small - exact values of degrees of freedom are unimportant when they
are large).  Such data occur frequently - in text books.  In fact,
they are the only type of data that occur in most text books and hence
they have shaped the way we think about the models.  Generations of
statisticians have looked at techniques for small, balanced data sets
and decided that these define "the" approach to the model.  So when
confronted with large observational (and, hence, unbalanced or
"messy") data sets they approach the modeling with this mind set.

Certainly such data did occur in some agricultural trials (I'm not
sure if this is the state of the art in current agricultural trials)
and this form of analysis was important but it depends strongly on
balanced data and balance is a very fragile characteristic in most
data.  The one exception is data in text books - most often data for
examples in older texts were added as an afterthought and, for space
reasons and to illustrate the calculations, were small data sets which
just happened to be completely balanced so that all the calculations
worked out.   Fortunately this is no longer the case in books like
your book with Braun where the data sets are real data and available
separately from the book in R packages but it will take many years to
flush the way of thinking about models induced by those small,
balanced data sets from statistical "knowledge".  (There is an
enormous amount of inertia built into the system.  Most introductory
statistics textbooks published today, and probably those published for
the next couple of decades, will include "statistical tables" as
appendices because, as everyone knows, all of the statistical analysis
we do in practice involves doing a few calculations on a hand
calculator and looking up tabulated values of distributions.)

Getting back to the "language shapes the way we think" issue, I would
compare it to strategy versus tactics.  I spent a sabbatical in
Australia visiting Bill Venables.  I needed to adjust to driving on
the left hand side of the road and was able to do that after a short
initial period of confusion and discomfort.  If I think of places in
Adelaide now it is natural for me to think of myself sitting on the
right hand side of the front seat and driving on the left hand side of
the road.  That's tactics.  However, even after living in Adelaide for
several months I remember leaving the parking lot of a shopping mall
and choosing an indirect route out the back instead of the direct
route ahead because the direct route would involve a left hand turn
onto a busy road.  Instead I chose to make two right hand turns to get
to an intersection with traffic lights where I could make the left
turn.  At the level of strategy I still "knew" that left hand turns
are difficult and right hand turns are easy - exactly the wrong
approach in Australia.

This semester I am attending a seminar on hierarchical linear models
organized in our social sciences school by a statistician whose
background is in agricultural statistics.  I make myself unpopular
because I keep challenging the ideas of the social scientists and of
the organizer.  The typical kinds of studies we discuss are
longitudinal studies where the "observational units" (i.e. people) are
grouped within some social contexts.  These are usually large, highly
unbalanced data  sets. Because they are longitudinal they inevitably
involve some migration of people from one group to another.

The migration means that random effects for person and for the various
types of groups are not nested.  The models are not hierarchical or at
least they should not be.  Trying to jam the analysis of such data
into the hierarchical/multilevel framework of software like HLM or
MLwin doesn't work well but that certainly won't stop people from
trying.  The hierarchical language in which they learned the model
affects their strategy.

The agricultural statistician, by contrast, doesn't worry about the
nesting etc.,  For him the sole issue of important is to determine the
number of degrees of freedom associated with the various error strata.
These studies involve tens of thousands of subjects and are nowhere
close to being balanced but his strategy is still based on having a
certain number of blocks and plots within the blocks and subplots
within the plots and everything is exactly balanced.

So I am not opposed to approaches based on error strata and computing
degrees of freedom where appropriate and where important.  But try not
to let the particular case of completely balanced small data sets
determine the strategy of the approach to all models involving random
effects.  It's fragile and does not generalize well, just as assuming
a strict hierarchy of factors associated with random effects is
fragile.  Balance is the special case.  Nesting is the special case.
It is a bad idea to base strategy on what happens in very special
cases.

they are always balanced and small data sets

Another very fragile

lme() makes a stab at an appropriate choice of degrees
of freedom, but does not always get it right, to the extent
that there is a right answer.  [lmer() has for the time being
given up on giving degrees of freedom and p-values for
fixed effects estimates.]  This part of the output from lme()
should, accordingly, be used with discretion.  In case of
doubt, check against a likelihood ratio test.  In a simple
enough experimental design, users who understand how
to calculate degrees of freedom will reason them out for
themselves.
John Maindonald.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 21 Mar 2008, at 9:23 AM, Kevin Wright wrote:

Your question is not very clear, but if you are trying to match the
results in Kuehl, you need a fixed-effects model:

dat <- read.table("expl14-1.txt", header=TRUE)
dat$block <- factor(dat$block)
dat$nitro <- factor(dat$nitro)
dat$thatch <- factor(dat$thatch)

colnames(dat) <- c("block","nitro","thatch","chlor")
m1 <- aov(chlor~nitro*thatch+Error(block/nitro), data=dat)
summary(m1)

Mixed-effects models and degrees of freedom have been discussed many
times on this list....search the archives.

K Wright

On Thu, Mar 20, 2008 at 12:39 PM,  <marcioestat at pop.com.br> wrote:

Hi listers,

I've been studying anova and at the book of Kuehl at the chapter
about split-plot there is a experiment with the results... I am
trying to
understand the experiments and make the code in order to obtain the
results... But there is something that I didn't understand yet...
I have a split-plot design (2 blocks) with two facteurs, one
facteur has 4 treatments and the other facteur is a measure
taken in three years...
I organize my data set as:

Nitro Bloc Year Measure
a
x
1         3.8
a
x
2         3.9
a         x         3         2.0
a         y         1         3.7
a         y         2
2.4
a         y         3
1.2
b         x
1         4.0
b         x
2         2.5
and so on...

So, I am trying this code, because I want to test each factor and  
the
interaction...
lme=lme(measure ~ bloc + nitro + bloc*nitro, random= ~ 1|year,
data=lme)
summary(lme)
The results that I am obtaining are not correct, because
I calculated the degrees of fredom and they are not
correct... According to this design I will get two errors one for  
the
whole plot and other for the subplot....

Well, as I told you, I am still learning... Any suggestions...

Thanks in advance,

Ribeiro

     [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

A response to your final paragraph.  There's no reason in principle
why aov() should not allow for random slopes.  In fact, one can
put continuous variables into the Error() term, but without taking
time to study the output with some care, it is not obvious to me what
the results mean.

[For reasons that are not clear to me, you mentioned Type III
"errors".  The controversy over Type III versus Type I sums of
squares relates to models with fixed effect interactions.

With models that have a single error term, drop1() will example
the effect of single term deletions, often in unbalanced designs
more relevant than the sequential information in the anova table.
Importantly, drop1() respects hierarchy, whereas Type III sums of
squares do not.]

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

John Maindonald wrote:
I do not think it quite true that the aov model that has an
Error() term is a fixed effects model.  The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere.  The F-tests surely assume such a model.
This is something that has been bothering me for a while.  Often  
it's argued that ANOVA is just regression; clearly this is not true  
when it's a repeated measures ANOVA, unless "regression" is  
interpreted broadly. I think Andrew Gelman argues this somewhere.  I  
don't see how to get aov to give me a formula, and lm doesn't fit  
stuff with an Error() term, but if it could, logically I would  
expect the formula to resemble closely the sort of thing you get  
with a mixed effects models.

The closest analogy I can find is that doing this...

aov1 = aov(yield ~  N+P+K + Error(block), npk)
summary(aov1)
Error: block
         Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5    343      69

Error: Within
         Df Sum Sq Mean Sq F value Pr(>F)
N          1  189.3   189.3   11.82 0.0037 **
P          1    8.4     8.4    0.52 0.4800
K          1   95.2    95.2    5.95 0.0277 *
Residuals 15  240.2    16.0
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

[I believe the F's and p's here come from a linear construction of  
nested models (i.e., N versus intercept only; N+P versus N; plus N+P 
+K versus N+P).]

... is a bit like doing this with lmer...

lmer0 = lmer(yield ~  1 + (1|block), npk)
lmer1 = lmer(yield ~  N + (1|block), npk)
lmer2 = lmer(yield ~  N+P + (1|block), npk)
lmer3 = lmer(yield ~  N+P+K + (1|block), npk)
anova(lmer0,lmer1)
Data: npk
Models:
lmer0: yield ~ 1 + (1 | block)
lmer1: yield ~ N + (1 | block)
     Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer0  2 157.5 159.8  -76.7
lmer1  3 151.5 155.1  -72.8  7.93      1     0.0049 **
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
anova(lmer1,lmer2)
Data: npk
Models:
lmer1: yield ~ N + (1 | block)
lmer2: yield ~ N + P + (1 | block)
     Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1  3 151.5 155.1  -72.8
lmer2  4 153.0 157.8  -72.5  0.47      1       0.49
anova(lmer2,lmer3)
Data: npk
Models:
lmer2: yield ~ N + P + (1 | block)
lmer3: yield ~ N + P + K + (1 | block)
     Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer2  4 153.0 157.8  -72.5
lmer3  5 149.0 154.9  -69.5  6.02      1      0.014 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Namely, fitting the models, and making comparisons using likelihood  
ratio tests.  Okay, the LLR is asymptotically chisq distributed with  
df the difference in parameters whereas... well F is different - I  
always get confused with the df and what bits of the residuals plug  
in where.

But, is this vaguely right?  (I'm aware that order matters when the  
design isn't balanced, and have read much about the type I errors  
versus type III errors disagreements.)

Actually this does lead to one question: following through the  
analogy, are repeated measures ANOVAs (those fitted with aov) always  
random intercept models, i.e. with no random slopes?

Cheers,

Andy

-- 
Andy Fugard, Postgraduate Research Student
Psychology (Room F3), The University of Edinburgh,
 7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk
Quoting John Maindonald <John.Maindonald at anu.edu.au>:
A response to your final paragraph.  There's no reason in principle
why aov() should not allow for random slopes.  In fact, one can
put continuous variables into the Error() term, but without taking
time to study the output with some care, it is not obvious to me what
the results mean.
Aha, thanks.
[For reasons that are not clear to me, you mentioned Type III
"errors".  The controversy over Type III versus Type I sums of
squares relates to models with fixed effect interactions.
Interference effect; I meant to say "sums of squares", not "errors".   
I brought this up as I wanted to flag awareness that it's not always  
appropriate (with respect to one's guiding hypotheses/questions) to  
compare nested models where terms are added sequentialy in some  
arbitrary order.

Thanks,

Andy
John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 22 Mar 2008, at 12:51 AM, Andy Fugard wrote:

John Maindonald wrote:
I do not think it quite true that the aov model that has an
Error() term is a fixed effects model.  The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere.  The F-tests surely assume such a model.
This is something that has been bothering me for a while.  Often   
it's argued that ANOVA is just regression; clearly this is not true  
 when it's a repeated measures ANOVA, unless "regression" is   
interpreted broadly. I think Andrew Gelman argues this somewhere.    
I don't see how to get aov to give me a formula, and lm doesn't fit  
 stuff with an Error() term, but if it could, logically I would   
expect the formula to resemble closely the sort of thing you get   
with a mixed effects models.

The closest analogy I can find is that doing this...

aov1 = aov(yield ~  N+P+K + Error(block), npk)
summary(aov1)
Error: block
        Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5    343      69

Error: Within
        Df Sum Sq Mean Sq F value Pr(>F)
N          1  189.3   189.3   11.82 0.0037 **
P          1    8.4     8.4    0.52 0.4800
K          1   95.2    95.2    5.95 0.0277 *
Residuals 15  240.2    16.0
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

[I believe the F's and p's here come from a linear construction of   
nested models (i.e., N versus intercept only; N+P versus N; plus   
N+P+K versus N+P).]

... is a bit like doing this with lmer...

lmer0 = lmer(yield ~  1 + (1|block), npk)
lmer1 = lmer(yield ~  N + (1|block), npk)
lmer2 = lmer(yield ~  N+P + (1|block), npk)
lmer3 = lmer(yield ~  N+P+K + (1|block), npk)
anova(lmer0,lmer1)
Data: npk
Models:
lmer0: yield ~ 1 + (1 | block)
lmer1: yield ~ N + (1 | block)
    Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer0  2 157.5 159.8  -76.7
lmer1  3 151.5 155.1  -72.8  7.93      1     0.0049 **
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
anova(lmer1,lmer2)
Data: npk
Models:
lmer1: yield ~ N + (1 | block)
lmer2: yield ~ N + P + (1 | block)
    Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1  3 151.5 155.1  -72.8
lmer2  4 153.0 157.8  -72.5  0.47      1       0.49
anova(lmer2,lmer3)
Data: npk
Models:
lmer2: yield ~ N + P + (1 | block)
lmer3: yield ~ N + P + K + (1 | block)
    Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
lmer2  4 153.0 157.8  -72.5
lmer3  5 149.0 154.9  -69.5  6.02      1      0.014 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Namely, fitting the models, and making comparisons using likelihood  
 ratio tests.  Okay, the LLR is asymptotically chisq distributed   
with df the difference in parameters whereas... well F is different  
 - I always get confused with the df and what bits of the residuals  
 plug in where.

But, is this vaguely right?  (I'm aware that order matters when the  
 design isn't balanced, and have read much about the type I errors   
versus type III errors disagreements.)

Actually this does lead to one question: following through the   
analogy, are repeated measures ANOVAs (those fitted with aov)   
always random intercept models, i.e. with no random slopes?

Cheers,

Andy

-- 
Andy Fugard, Postgraduate Research Student
Psychology (Room F3), The University of Edinburgh,
7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk
-- 
Andy Fugard, Postgraduate Research Student
Psychology (Room F3), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Thanks for your response John.  I just have one quick comment about
the Kenward-Roger degrees of freedom calculation.  It has been some
time since I looked at that paper but my impression at the time was
that the equations would not easily translate into the formulation
used in lmer.  The approach used in lmer is like that of Henderson's
mixed model equations (with many modifications).  That is, it is based
on a penalized least squares problem, not a generalized least squares
problem.  My recollection is that Kenward and Roger wrote their
equations in terms of the generalized least squares problem.

You are not the first person to suggest that incorporating the
Kenward-Roger calculation would enhance lmer.  The reason I haven't
done so is that I believe it is far from trivial to do so and I have
many, many other enhancements I would prefer to spend my time on.
However, this is open source software and any enterprising person who
wants to implement it is more than welcome to do so.  It may be
sufficiently involved to be a thesis topic - I don't know because I
haven't studied it carefully enough.

Any person considering doing that should read or reread Bill Venables'
"Exegeses on Linear Models" before embarking on it.  As he points out
in his discussion of modifications made in S-PLUS to emulate some
calculations in SAS, the "brute force" approach of taking a set of
equations and implementing them literally is rarely a good approach.
So I am not talking about a "pidgin R" implementation here where a
linear least squares calculation is written

XpX <- t(X) %*% X
XpXinv <- solve(XpX)
Xpy <- t(X) %*% y
betahat <- XpXinv %*% Xpy

An mer object includes slots L, RZX and RX that define the Cholesky
decomposition of the crossproduct matrix that is more-or-less like
that in the Henderson mixed model equations.  The L slot itself has a
Perm slot that gives the fill-reducing permutation P for the random
effects.  That may not be relevant - I'm not sure.  The original model
matrices are available as the X and Zt (transpose of Z) slots.  The
(transpose of the) derived model matrix for the orthogonal random
effects is available as A.  The terms attribute of the model matrix X
and the assign attribute of the model frame (in the slot named
"frame") should be used to associate terms with columns of the model
matrix.

It's possible that the calculation would be straightforward.  As i
said, I don't know.  My gut feeling is that it is not, which is why I
haven't embarked on it.
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20080323/15c5464c/attachment.pl>
Hi Ribeiro,

try section 1.6 of Pinheiro and Bates, as a starting point, and/or
section 10.2 of Venables and Ripley.

Andrew

Hey
listers,

?

?

It&rsquo;s good to know that I still
have a lot of search to do&hellip;

?

According to the two procedures AOV
and LME, I got two different results and I didn&rsquo;t understand at all
the results of LME&hellip; There is a coefficient estimate of each level
and I just pretend to test if the effects of the factors and interaction
are significant or not&hellip;So I would like to learn more about this
function, because it's adequate for split-plot design...
test.anova=
aov(mes ~ nitro*thatch + Error(block/nitro), data=test)

?

summary(test.anova)

?

test.lme
<- lme(mes ~ nitro*thatch,random= ~ 1|block/nitro, data=test)

?

summary(test.lme)

?

Does anybody has any document or
reference that explain this results in details of the lme&hellip;
I&rsquo;ve looked for, but I didn&rsquo;t find any good
explication&hellip; I notice that there is a lot of information about the
lmer, but I am going step by step&hellip;

?

?

Thanks,

?

?

Ribeiro

?

?
Thanks for your response John. I just have one quick comment
about 
the Kenward-Roger degrees of freedom calculation. It has
been some 
time since I looked at that paper but my impression
at the time was 
that the equations would not easily translate
into the formulation 
used in lmer. The approach used in lmer is
like that of Henderson's 
mixed model equations (with many
modifications). That is, it is based 
on a penalized least
squares problem, not a generalized least squares 
problem. My
recollection is that Kenward and Roger wrote their 
equations in
terms of the generalized least squares problem. 
You
are not the first person to suggest that incorporating the 

Kenward-Roger calculation would enhance lmer. The reason I haven't 
done so is that I believe it is far from trivial to do so and I
have 
many, many other enhancements I would prefer to spend my
time on. 
However, this is open source software and any
enterprising person who 
wants to implement it is more than
welcome to do so. It may be 
sufficiently involved to be a
thesis topic - I don't know because I 
haven't studied it
carefully enough. 
Any person considering doing that
should read or reread Bill Venables' 
"Exegeses on Linear
Models" before embarking on it. As he points out 
in his
discussion of modifications made in S-PLUS to emulate some 

calculations in SAS, the "brute force" approach of taking a set
of 
equations and implementing them literally is rarely a good
approach. 
So I am not talking about a "pidgin R"
implementation here where a 
linear least squares calculation is
written 
XpX <- t(X) %*% X 
XpXinv <-
solve(XpX) 
Xpy <- t(X) %*% y 
betahat <- XpXinv
%*% Xpy 
An mer object includes slots L, RZX and RX
that define the Cholesky 
decomposition of the crossproduct
matrix that is more-or-less like 
that in the Henderson mixed
model equations. The L slot itself has a 
Perm slot that gives
the fill-reducing permutation P for the random 
effects. That
may not be relevant - I'm not sure. The original model 
matrices
are available as the X and Zt (transpose of Z) slots. The 

(transpose of the) derived model matrix for the orthogonal random 
effects is available as A. The terms attribute of the model matrix
X 
and the assign attribute of the model frame (in the slot
named 
"frame") should be used to associate terms with
columns of the model 
matrix. 

It's
possible that the calculation would be straightforward. As i 

said, I don't know. My gut feeling is that it is not, which is why I 
haven't embarked on it. 

_______________________________________________ 

R-sig-mixed-models at r-project.org mailing list 

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models 

	[[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/