Error with nlme and fixed effects defined with a minus 1
Thanks for the quick reply. lm() doesn't have the same issue (at least using what I think is the equivalent construction). See the first test below for that. And, when I try to manually add the contrasts, nlme drops the contrasts due to missing levels; see the second example with simdata4. Other than regenerating the model or using a contrast setup that I don't want (due to the ease of interpreting the contrasts with -1), I don't see a way around it other than augmenting the data with some data that I don't want to use. Also, I am apparently a consistent person because I had the same issue about 4 years ago (https://bugs.r-project.org/show_bug.cgi?id=17226), and in that bug, I found that someone on stackoverflow had the same issue (https://stats.stackexchange.com/questions/29513/error-in-getting-predictions-from-a-lme-object). In the stackoverflow answer, there is a patch that may help. (I hesitate to make that fix into a real fix myself because I don't fully understand the internals of nlme.) ``` r simdata <- merge( merge( data.frame(treatment=factor(c("A", "B"))), data.frame(ID=factor(1:10)) ), data.frame(time=1:10) ) set.seed(5) simdata$obs <- rnorm(nrow(simdata)) model_minus1 <- lm(obs~treatment-1, data=simdata) model <- lm(obs~treatment, data=simdata) # Generate new data with the correct treatment factor levels, but only one of # the levels represented within the data. simdata2 <- merge( data.frame(treatment=factor("A", levels=c("A", "B"))), data.frame(time=1:10) ) # Generate new data with all factor levels represented simdata3 <- merge( data.frame(treatment=factor(c("A", "B"))), data.frame(time=1:10) ) predict(model, newdata=simdata2) #> 1 2 3 4 5 6 7 #> 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 #> 8 9 10 #> 0.02690558 0.02690558 0.02690558 predict(model_minus1, newdata=simdata2) #> 1 2 3 4 5 6 7 #> 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 #> 8 9 10 #> 0.02690558 0.02690558 0.02690558 predict(model_minus1, newdata=simdata3) #> 1 2 3 4 5 6 7 #> 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 #> 8 9 10 11 12 13 14 #> 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 #> 15 16 17 18 19 20 #> 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 ``` ``` r library(nlme) simdata <- merge( merge( data.frame(treatment=factor(c("A", "B"))), data.frame(ID=factor(1:10)) ), data.frame(time=1:10) ) set.seed(5) simdata$obs <- rnorm(nrow(simdata)) model_minus1 <- nlme(obs~e0, data=simdata, fixed=e0~treatment - 1, random=e0~1|ID, start=c(e0=c(0, 0))) model <- nlme(obs~e0, data=simdata, fixed=e0~treatment, random=e0~1|ID, start=c(e0=c(0, 0))) # Generate new data with the correct treatment factor levels, but only one of # the levels represented within the data. simdata2 <- merge( data.frame(treatment=factor("A", levels=c("A", "B"))), data.frame(time=1:10) ) # Generate new data with all factor levels represented simdata3 <- merge( data.frame(treatment=factor(c("A", "B"))), data.frame(time=1:10) ) simdata4 <- simdata2 contrasts(simdata4$treatment) <- contrasts(simdata$treatment) predict(model, newdata=simdata2, level=0) #> [1] 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 #> [7] 0.02690558 0.02690558 0.02690558 0.02690558 #> attr(,"label") #> [1] "Predicted values" # Without all levels in newdata$treatment, it fails predict(model_minus1, newdata=simdata2, level=0) #> Error in pars[, nm] <- f %*% beta[fmap[[nm]]]: number of items to replace is not a multiple of replacement length # With all levels in newdata$treatment, it works predict(model_minus1, newdata=simdata3, level=0) #> [1] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 #> [7] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 #> [13] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 #> [19] 0.02690558 0.02123989 #> attr(,"label") #> [1] "Predicted values" # Same error as simdata2, but now a warning that factor levels are dropped (when # I don't think they should be) predict(model_minus1, newdata=simdata4, level=0) #> Warning: contrasts dropped from factor treatment due to missing levels #> Error in pars[, nm] <- f %*% beta[fmap[[nm]]]: number of items to replace is not a multiple of replacement length ``` <sup>Created on 2021-08-12 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup> -----Original Message----- From: Phillip Alday <me at phillipalday.com> Sent: Thursday, August 12, 2021 10:08 PM To: bill at denney.ws; r-sig-mixed-models at r-project.org Subject: Re: [R-sig-ME] Error with nlme and fixed effects defined with a minus 1 This has more to do with how R expands formulae and contrast coding than anything specific to nlme, I think. I suspect that if you did this with just a normal (non mixed) linear model, the same problem would arise. When you have -1, the intercept is suppressed, which in the presence of a categorical variable means that all k levels of that variable are represented as coefficients instead of k-1 (with the final contrast thrown in with the intercept). This is of course not ideal behavior and could probably be addressed or worked around in nlme, but it seems like a bit of an edge case. Maybe setting the contrasts manually will help? contrasts(simdata2$treatment) <- contrasts(simdata$treatment) (Good call on setting the factor levels in simdata2!)
On 12/8/21 6:59 pm, bill at denney.ws wrote:
Hello,
I think that I have found a bug in nlme. When I generate a model with
fixed effects defined with a factor - 1 (see model_minus1 below) and I
simulate with level=0 in a data.frame that does not have all the
levels (see predict into simdata2 with model_minus1 below), there is
an error. But, if the model is defined without the minus 1 or all
factor levels are represented within the simulation data, there is no error.
Did I find a bug in nlme?
Thanks,
Bill
``` r
library(nlme)
simdata <-
merge(
merge(
data.frame(treatment=factor(c("A", "B"))),
data.frame(ID=factor(1:10))
),
data.frame(time=1:10)
)
set.seed(5)
simdata$obs <- rnorm(nrow(simdata))
model_minus1 <- nlme(obs~e0, data=simdata, fixed=e0~treatment - 1,
random=e0~1|ID, start=c(e0=c(0, 0)))
model <- nlme(obs~e0, data=simdata, fixed=e0~treatment,
random=e0~1|ID, start=c(e0=c(0, 0)))
# Generate new data with the correct treatment factor levels, but only
one of
# the levels represented within the data.
simdata2 <-
merge(
data.frame(treatment=factor("A", levels=c("A", "B"))),
data.frame(time=1:10)
)
# Generate new data with all factor levels represented
simdata3 <-
merge(
data.frame(treatment=factor(c("A", "B"))),
data.frame(time=1:10)
)
predict(model, newdata=simdata2, level=0)
#> [1] 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558
0.02690558
#> [7] 0.02690558 0.02690558 0.02690558 0.02690558
#> attr(,"label")
#> [1] "Predicted values"
# Without all levels in newdata$treatment, it fails
predict(model_minus1, newdata=simdata2, level=0)
#> Error in pars[, nm] <- f %*% beta[fmap[[nm]]]: number of items to
replace is not a multiple of replacement length
# With all levels in newdata$treatment, it works
predict(model_minus1, newdata=simdata3, level=0)
#> [1] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558
0.02123989
#> [7] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558
0.02123989
#> [13] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558
0.02123989
#> [19] 0.02690558 0.02123989
#> attr(,"label")
#> [1] "Predicted values"
```
<sup>Created on 2021-08-12 by the [reprex
package](https://reprex.tidyverse.org) (v2.0.0)</sup>
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models