why NA coefficients

8 messages · array chip, David Winsemius

#
Hi, I am trying to run an ANOVA with an interaction term on 2 factors (treat has 7 levels, group has 2 levels). I found that the coefficient for the last interaction term is always NA; see the attached dataset and the output below:
Call:
lm(formula = y ~ factor(treat) * factor(group), data = test)

Coefficients:
                   (Intercept)                factor(treat)2                factor(treat)3
                      0.429244                      0.499982                      0.352971
                factor(treat)4                factor(treat)5                factor(treat)6
                     -0.204752                      0.142042                      0.044155
                factor(treat)7                factor(group)2 factor(treat)2:factor(group)2
                     -0.007775                     -0.337907                     -0.208734
 factor(treat)3:factor(group)2 factor(treat)4:factor(group)2 factor(treat)5:factor(group)2
                     -0.195138                      0.800029                      0.227514
 factor(treat)6:factor(group)2 factor(treat)7:factor(group)2
                      0.331548                            NA


I guess this is due to the model matrix being singular, or to collinearity among its columns, but I can't figure out how the matrix is singular in this case. Can someone show me why?

Thanks

John
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20111107/430e48a8/attachment.txt>
#
On Nov 7, 2011, at 7:33 PM, array chip wrote:

            
Because you have no cases in one of the crossed categories.
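A minimal sketch of how one might check this. The attached test.txt is not reproduced here, so the data frame `sim` below is a simulated, hypothetical stand-in in which the treat = 1, group = 2 cell is left empty on purpose; the names `sim`, `cell_counts`, `X` and `fit` are illustrative only.

```r
# Hypothetical stand-in for test.txt: 7 treatments x 2 groups, 3 replicates,
# with the treat = 1, group = 2 cell deliberately removed (an assumption).
set.seed(1)
sim <- expand.grid(treat = 1:7, group = 1:2, rep = 1:3)
sim <- sim[!(sim$treat == 1 & sim$group == 2), ]
sim$y <- rnorm(nrow(sim))

# Cross-tabulate the two factors; a zero count marks an empty crossed category.
cell_counts <- table(treat = sim$treat, group = sim$group)
print(cell_counts)

# The interaction model asks for 14 coefficients, but only 13 cells have data,
# so the design matrix cannot have full column rank.
X <- model.matrix(~ factor(treat) * factor(group), data = sim)
fit <- lm(y ~ factor(treat) * factor(group), data = sim)
print(c(columns = ncol(X), rank = fit$rank))

# The coefficient that lm() could not estimate is reported as NA.
print(names(which(is.na(coef(fit)))))
```

On the real data, `table(test$treat, test$group)` should show directly which cell has no cases.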
#
On Nov 7, 2011, at 10:07 PM, array chip wrote:

            
Well, it had to omit one of them, didn't it?

(But I don't know why that level was chosen.)
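One plausible explanation, sketched on simulated data rather than the poster's file: the QR routine behind lm() uses limited column pivoting (see ?qr), so a column that is a linear combination of columns to its left is moved to the end and its coefficient is reported as NA. If the empty cell is treat = 1, group = 2 (an assumption here), the group-2 indicator equals the sum of the six interaction indicators, and the last of those columns is the one found to be redundant. The data frame `sim` and the helper names are hypothetical.

```r
# Hypothetical data with the treat = 1, group = 2 cell empty (an assumption).
set.seed(1)
sim <- expand.grid(treat = 1:7, group = 1:2, rep = 1:3)
sim <- sim[!(sim$treat == 1 & sim$group == 2), ]
sim$y <- rnorm(nrow(sim))

X <- model.matrix(~ factor(treat) * factor(group), data = sim)
int_cols <- grep(":", colnames(X), fixed = TRUE)   # the six interaction columns

# Every group-2 row belongs to one of treat 2..7, so the group-2 indicator
# is exactly the row sum of the interaction indicators.
dep_check <- all(X[, "factor(group)2"] == rowSums(X[, int_cols]))
print(dep_check)

# Columns listed in the pivot vector beyond the numerical rank were dropped.
fit <- lm(y ~ factor(treat) * factor(group), data = sim)
dropped <- colnames(X)[fit$qr$pivot[-seq_len(fit$rank)]]
print(dropped)
```

Under this reading, the choice of level is a side effect of column order: whichever dependent column comes last in the model matrix is the one reported as NA.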
#
On Nov 8, 2011, at 1:19 AM, David Winsemius wrote:

            
But this output suggests there may be alligators in the swamp:

 > predict(lmod, newdata=data.frame(treat=1, group=2))
          1
0.09133691
Warning message:
In predict.lm(lmod, newdata = data.frame(treat = 1, group = 2)) :
   prediction from a rank-deficient fit may be misleading
#
On Nov 8, 2011, at 12:36 PM, array chip wrote:

            
Have you considered redefining the implicit base level for "treat" so  
it does not create the missing crossed-category?

 > test$treat2_ <- factor(test$treat, levels=c(2:7, 1) )
 > lm(y~treat2_*factor(group),test)

Call:
lm(formula = y ~ treat2_ * factor(group), data = test)

Coefficients:
             (Intercept)                treat2_3                treat2_4
               0.9292256              -0.1470106              -0.7047343
                treat2_5                treat2_6                treat2_7
              -0.3579398              -0.4558269              -0.5077571
                treat2_1          factor(group)2 treat2_3:factor(group)2
              -0.4999820              -0.5466405               0.0135963
 treat2_4:factor(group)2 treat2_5:factor(group)2 treat2_6:factor(group)2
               1.0087628               0.4362479               0.5402821
 treat2_7:factor(group)2 treat2_1:factor(group)2
               0.2087338                      NA

All the "group-less" coefficients are for group 1, so you now get a
prediction for group=1:treat=2 (the "Intercept"), group=1:treat=3, ...,
for a total of 7 values.

And there are 6 predictions for group 2, one for each treat level that
actually has cases in that group.

The onus is obviously on you to check the predictions against the  
data. 'aggregate' is a good function for that purpose.
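A sketch of such a check, again on simulated data because the attachment is not reproduced here. The data frame `sim`, the empty treat = 1, group = 2 cell, and the column name `fitted` are assumptions for illustration; `treat2_` follows the releveling shown above.

```r
# Hypothetical data with the treat = 1, group = 2 cell empty (an assumption).
set.seed(1)
sim <- expand.grid(treat = 1:7, group = 1:2, rep = 1:3)
sim <- sim[!(sim$treat == 1 & sim$group == 2), ]
sim$y <- rnorm(nrow(sim))

# Same releveling as above: treat = 2 becomes the base level.
sim$treat2_ <- factor(sim$treat, levels = c(2:7, 1))
fit <- lm(y ~ treat2_ * factor(group), data = sim)

# Observed mean of y in each occupied cell (the empty cell simply has no row).
cell_means <- aggregate(y ~ treat + group, data = sim, FUN = mean)

# Fitted values, averaged over the same cells.
sim$fitted <- fitted(fit)
cell_fits <- aggregate(fitted ~ treat + group, data = sim, FUN = mean)

# Side-by-side comparison; for an interaction model the two should agree.
check <- merge(cell_means, cell_fits)
check$diff <- check$y - check$fitted
print(check)
```

With a full interaction model each occupied cell gets its own fitted value, so the `diff` column should be zero up to rounding; a non-trivial difference would suggest the coefficients are being combined incorrectly.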