Skip to content

using weights=varIdent in lme: how to change reference level of grouping factor?

4 messages · Ben Bolker, Karl Ove Hufthammer, White, Emma

1 day later
#
Emma Knight <emma_knight at ...> writes:
Good question. I would have guessed that re-ordering the factor
would do the trick, but I guess not.

library(nlme)
fm1 <- lme(distance ~ age, random =~age|Subject,
     weights=varIdent(form=~1|Sex),data=Orthodont)
fm1$modelStruct$varStruct
## Variance function structure of class varIdent representing
##    Male   Female 
## 1.000000 0.404098 

Orth2 <- transform(Orthodont,
      Sex=factor(Sex,levels=c("Female","Male")))
fm2 <- update(fm1,data=Orth2)
summary(model.frame(fm2))
fm2$modelStruct$varStruct
## Variance function structure of class varIdent representing
##    Male   Female 
## 1.000000 0.404098 

  I have spent some time delving through the guts of lme()
but haven't figured it out yet.  I'm somewhat baffled,
as the default order of factor() should be Female, Male in
any case, so I don't see where the order could be getting
messed up.
3 days later
#
Ben Bolker wrote:

            
Is the following problem related? When using *initalized* variance 
functions, strange things happen:

  $ library(nlme)
  $ Orth2 <- transform(Orthodont,
  +                    Sex=factor(Sex,levels=c("Female","Male")))
  $ Orth3=Orthodont[nrow(Orthodont):1,]
  $ var1=Initialize(varIdent(form=~1|Sex), data=Orthodont)
  $ var2=Initialize(varIdent(form=~1|Sex), data=Orth2)
  $ var3=Initialize(varIdent(form=~1|Sex), data=Orth3)
  $ var1
  Variance function structure of class varIdent representing
    Male Female 
       1      1 
  $ var2
  Variance function structure of class varIdent representing
    Male Female 
       1      1 
  $ var3
  Variance function structure of class varIdent representing
  Female   Male 
       1      1 

So reordering the levels doesn?t change the reference level (var2), but 
reordering the data seems to (var3). That?s nice. But when we actually try 
to fit the models, we get nonsensical results:

  $ fm1=lme(distance ~ age, random =~age|Subject,
  +          weights=var1, data=Orthodont)
  $ fm2=lme(distance ~ age, random =~age|Subject, weights=var2, data=Orth2)
  $ fm3=lme(distance ~ age, random =~age|Subject, weights=var3, data=Orth3)
  $ fm1$modelStruct$varStruct
  Variance function structure of class varIdent representing
      Male   Female 
  1.000000 0.404098 
  $ fm2$modelStruct$varStruct
  Variance function structure of class varIdent representing
      Male   Female 
  1.000000 0.404098 
  $ fm3$modelStruct$varStruct
  Variance function structure of class varIdent representing
     Female      Male 
  1.0000000 0.4315485 

This result seems completely wrong. Shouldn?t the result for Male be ~2.4?

The estimates are identical for the first two models, but *slightly* 
different for the third one:

  $ fixef(fm1)
  (Intercept)         age 
   17.2983171   0.6068598 
  $ fixef(fm2)
  (Intercept)         age 
   17.2983171   0.6068598 
  $ fixef(fm3)
  (Intercept)         age 
   17.3879588   0.6046554

But the most surprising thing (for me) is that the model fitting doesn?t 
seem to care about the value for the stratifying variable at all. For 
example, running

  Orth4=transform(Orthodont, Sex="test")

And fitting the model gives the same results as for fm3. I thought maybe the 
reason was the lme used the values from the variance function instead of 
from the data frame, but even if I completely reorder the data frame, 
breaking any implicit match, I get the same results as for fm3:

  $ Orth4=transform(Orthodont, Sex="test")
  $ Orth4=Orth4[sample(nrow(Orth4)),]
  $ fm4=lme(distance ~ age, random =~age|Subject, weights=var3,
    + data=Orth4)
  $ fm4$modelStruct$varStruct
  Variance function structure of class varIdent representing
     Female      Male 
  1.0000000 0.4315485 
  $ fixef(fm4)
  (Intercept)         age 
   17.3879588   0.6046554

Perhaps the reason could be that the differences in variances between the 
strata are so small that they don?t have any effect on estimates? However, 
fitting a model with *equal* within-group variances does give somewhat 
different estimates:

  $ fm5=lme(distance ~ age, random =~age|Subject, data=Orth4)
  $ fixef(fm5)
  (Intercept)         age 
   16.7611111   0.6601852 

Version info:

  $ sessionInfo()
  R version 3.0.1 (2013-05-16)
  Platform: x86_64-suse-linux-gnu (64-bit)
  
  locale:
   [1] LC_CTYPE=nn_NO.UTF-8       LC_NUMERIC=C
       LC_TIME=nn_NO.UTF-8        LC_COLLATE=nn_NO.UTF-8      
       LC_MONETARY=nn_NO.UTF-8   
   [6] LC_MESSAGES=nn_NO.UTF-8    LC_PAPER=C                
       LC_NAME=C                    LC_ADDRESS=C              
       LC_TELEPHONE=C            
  [11] LC_MEASUREMENT=nn_NO.UTF-8 LC_IDENTIFICATION=C       

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base     

  other attached packages:
  [1] nlme_3.1-110

  loaded via a namespace (and not attached):
  [1] grid_3.0.1      lattice_0.20-15 tools_3.0.1
#
Ben Bolker <bbolker at ...> writes:
Hi Ben,
Thanks for having a look at this.
I think lme is using the category with the largest number of observations as
the reference level.
I have no idea how to change this.
Emma