
error in predict glm (new levels cause problems)

Have you considered replacing the offending factor with explicit 
coding of your own choosing?  See "?contr.helmert", "contr.sdif" in 
library(MASS), and the "contrasts" setting in "options", which you can 
inspect via 'options("contrasts")'.  By default, every k-level factor 
is converted into a set of (k-1) columns of numeric codes.  The exact 
codes do not affect the p values obtained from anova, though they do 
affect the estimated coefficients.
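
To make the (k-1)-column point concrete, here is a minimal sketch using only base R.  The factor "f" is made up for illustration and is not from the original question:

```r
# Hypothetical 3-level factor, for illustration only
f <- factor(c("a", "b", "c"))

# Each coding maps k = 3 levels to k - 1 = 2 numeric columns
ct <- contr.treatment(3)   # 0/1 dummy coding, first level as baseline
ch <- contr.helmert(3)     # Helmert coding, each column sums to zero
print(ct)
print(ch)

# With an intercept, the design matrix has 1 + (k - 1) = 3 columns
X <- model.matrix(~ f, contrasts.arg = list(f = "contr.helmert"))
print(dim(X))
```

Passing the coding through "contrasts.arg" keeps the choice local to one call, so the global 'options("contrasts")' setting is left alone.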

	  Have you tried something like the following (shown here with "lm") using "glm.nb"?

 > set.seed(1)
 > DF <- data.frame(a=sample(letters[1:3], 30, replace=TRUE),
+                  b=sample(LETTERS[1:3], 30, replace=TRUE),
+                  y=rnorm(30))
 > with(DF, table(a, b))
   b
a   A B C
  a 3 3 3
  b 2 6 3
  c 2 4 4
 > fit1 <- lm(y~a+b, DF[DF$a!="a",])
 > fit1$contrasts
$a
[1] "contr.treatment"

$b
[1] "contr.treatment"

 > options(contrasts=c(unordered="contr.helmert", ordered="contr.poly"))
 > fit2 <- lm(y~a+b, DF[DF$a!="a",])
 > fit2$contrasts
$a
[1] "contr.helmert"

$b
[1] "contr.helmert"

 > anova(fit1)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
a          1 0.0008  0.0008  0.0015 0.9698
b          2 0.6186  0.3093  0.5775 0.5719
Residuals 17 9.1050  0.5356
 > anova(fit2)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
a          1 0.0008  0.0008  0.0015 0.9698
b          2 0.6186  0.3093  0.5775 0.5719
Residuals 17 9.1050  0.5356
 >
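
The same comparison can probably be made without touching the global option, by passing the coding through the "contrasts" argument of lm.  This is only a sketch: the names "sub", "fitT" and "fitH" are made up for this example, and since the default sampling method behind sample() changed in R 3.6.0, the numbers will likely not match the transcript above.

```r
set.seed(1)
DF <- data.frame(a = factor(sample(letters[1:3], 30, replace = TRUE)),
                 b = factor(sample(LETTERS[1:3], 30, replace = TRUE)),
                 y = rnorm(30))

# Drop level "a" of factor a, and discard the now-empty level
sub <- droplevels(DF[DF$a != "a", ])

# Same model, two codings, set per call instead of via options()
fitT <- lm(y ~ a + b, sub,
           contrasts = list(a = "contr.treatment", b = "contr.treatment"))
fitH <- lm(y ~ a + b, sub,
           contrasts = list(a = "contr.helmert", b = "contr.helmert"))

# p values from anova agree between the two codings
pT <- anova(fitT)[["Pr(>F)"]]
pH <- anova(fitH)[["Pr(>F)"]]
print(all.equal(pT, pH))

# Fitted values agree too; only the coefficients are parameterised differently
print(all.equal(fitted(fitT), fitted(fitH)))
print(coef(fitT))
print(coef(fitH))
```

Setting the coding per call avoids surprising later fits in the same session, which would otherwise inherit the changed 'options("contrasts")'.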
	  spencer graves
K. Steinmann wrote: