Dear all,?a quick question regarding AIC & quasi-GAMM. I'm investigating age-dependent variation in body mass in 2 different populations, and decided to go for a GAM approach. As my data are grouped within years & areas, these have been fitted as random intercepts. In the attempt to fix heterogeneity issues in residual variance, I fitted the model with a "quasi" family, so that it looks like: mod1 <- gamm(mass ~ s(age, by= population) + population,? ? ? ? ? ? ? ? ? ? ? ? ? data = my.data,? ? ? ? ? ? ? ? ? ? ? ? ? random = list(year = ~ 1, area = ~ 1),?? ? ? ? ? ? ? ? ? ? ? ? ? family = quasi(link = "identity", variance = "mu")) Now, if I try to extract the AIC from this model, I actually get a value (16620.34), and a seemingly reasonable one (if compared to a corresponding full-likelihood Tweedie GAMM, which returns the same AIC). My question is, how is it possible that I get an AIC from a quasi-family? Re-fitting the same model without random terms:? mod2 <- gam(mass ~ s(age, by= population) + population,? ? ? ? ? ? ? ? ? ? ? ?data = my.data,? ? ? ? ? ? ? ? ? ? ? ?family=quasi(link="identity", variance = "mu")) AIC(mod2) gives, as expected, a "NA". What allows GAMM to return an AIC value even when using a quasi-family? Thanks in advance for your help!Luca
Quasi-GAMM AIC?
3 messages · Luca Corlatti, Ben Bolker, Boku mail
On 12/4/20 5:33 PM, luca corlatti via R-sig-mixed-models wrote:
Dear all,?a quick question regarding AIC & quasi-GAMM. I'm investigating age-dependent variation in body mass in 2 different populations, and decided to go for a GAM approach. As my data are grouped within years & areas, these have been fitted as random intercepts. In the attempt to fix heterogeneity issues in residual variance, I fitted the model with a "quasi" family, so that it looks like:
mod1 <- gamm(mass ~ s(age, by= population) + population,? ? ? ? ? ? ? ? ? ? ? ? ? data = my.data,? ? ? ? ? ? ? ? ? ? ? ? ? random = list(year = ~ 1, area = ~ 1),?? ? ? ? ? ? ? ? ? ? ? ? ? family = quasi(link = "identity", variance = "mu")) Now, if I try to extract the AIC from this model, I actually get a value (16620.34), and a seemingly reasonable one (if compared to a corresponding full-likelihood Tweedie GAMM, which returns the same AIC). My question is, how is it possible that I get an AIC from a quasi-family? Re-fitting the same model without random terms: mod2 <- gam(mass ~ s(age, by= population) + population,? ? ? ? ? ? ? ? ? ? ? ?data = my.data,? ? ? ? ? ? ? ? ? ? ? ?family=quasi(link="identity", variance = "mu")) AIC(mod2) gives, as expected, a "NA". What allows GAMM to return an AIC value even when using a quasi-family? Thanks in advance for your help!
Luca
tl;dr I wouldn't trust it !
It took me a while, but I think I found the answer.
Your AIC calculation only 'works' (for some value of 'works') because
you have the MuMIn package loaded.
library(mgcv)
library(MuMIn)
data(sleepstudy,package="lme4")
mod1 <- gamm(Reaction ~s(Days), random=list(Subject = ~1),
data=sleepstudy,
family=quasi(link="identity", variance="mu"))
The mystery of why MuMIn provides a logLik method for gamm objects is
explained in a document called "Model selection with MuMIn and GAMM"
which can be found here ...
https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/inst/doc/gamm.pdf?revision=91&root=mumin&pathrev=92
"In the case of gamm and gamm4, the returned object has no special
class, it is a list with two items: lme or mer, and gam (with some
information stripped from it). Therefore no specific methods can be
applied.The solution is to provide a wrapper function for gamm that
evaluates the model and adds a class attribute onto it ...
<technical details>
It should be noted here that the issue of what the log-likelihood for
GAMM should be is not entirely clear. The documentation for gamm states
that the log-likelihood of lme is not the one of the fitted GAMM.
However, comparing alternative models shows some evidence that it may be
still appropriate for gamm. Namely the log-likelihood of fitted lme, and
one of the lme part of gamm (including only linear terms to make the
comparison adequate), have identical value ..."
?mgcv::gamm says:
?gamm? assumes that you know what you are doing! For example, unlike
?glmmPQL? from ?MASS? it will return the complete ?lme? object from the
working model at convergence of the PQL iteration, including the `log
likelihood', even though this is not the likelihood of the fitted GAMM.
THere's an argument for using "quasi-AIC" in model selection problems
<https://cran.r-project.org/web/packages/bbmle/vignettes/quasi.pdf> ,
but it seems mostly confined to wildlife ecologists ...
https://stat.ethz.ch/pipermail/r-help/2003-July/035898.html
Thank you Ben, that explains a lot! Just to add up to the discussion, interestingly, when comparing: library(mgcv) library(MuMIn) data(sleepstudy,package="lme4") mod1 <- gamm(Reaction ~s(Days), random=list(Subject = ~1), ? data=sleepstudy, family=quasi(link="identity", variance="mu")) to: mod2 <- gamm(Reaction ~s(Days), random=list(Subject = ~1), ? data=sleepstudy, ? family=Tweedie(p=1.0001, link=power(1))) AIC(mod1, mod2) mod1 1797.943 mod2 1797.943 the AIC values are indeed identical. I suppose it would be interesting to know under which circumstances Barton?s heuristic works or not. (qAIC is a minor pain, but from a practitioner?s perspective, AIC is handier!) Luca Il 5 dic 2020, 01:13 +0100, Ben Bolker <bbolker at gmail.com>, ha scritto:
On 12/4/20 5:33 PM, luca corlatti via R-sig-mixed-models wrote:
Dear all,?a quick question regarding AIC & quasi-GAMM. I'm investigating age-dependent variation in body mass in 2 different populations, and decided to go for a GAM approach. As my data are grouped within years & areas, these have been fitted as random intercepts. In the attempt to fix heterogeneity issues in residual variance, I fitted the model with a "quasi" family, so that it looks like:
mod1 <- gamm(mass ~ s(age, by= population) + population,? ? ? ? ? ? ? ? ? ? ? ? ? data = my.data,? ? ? ? ? ? ? ? ? ? ? ? ? random = list(year = ~ 1, area = ~ 1),?? ? ? ? ? ? ? ? ? ? ? ? ? family = quasi(link = "identity", variance = "mu")) Now, if I try to extract the AIC from this model, I actually get a value (16620.34), and a seemingly reasonable one (if compared to a corresponding full-likelihood Tweedie GAMM, which returns the same AIC). My question is, how is it possible that I get an AIC from a quasi-family? Re-fitting the same model without random terms: mod2 <- gam(mass ~ s(age, by= population) + population,? ? ? ? ? ? ? ? ? ? ? ?data = my.data,? ? ? ? ? ? ? ? ? ? ? ?family=quasi(link="identity", variance = "mu")) AIC(mod2) gives, as expected, a "NA". What allows GAMM to return an AIC value even when using a quasi-family? Thanks in advance for your help!
Luca tl;dr I wouldn't trust it ! It took me a while, but I think I found the answer. Your AIC calculation only 'works' (for some value of 'works') because you have the MuMIn package loaded. library(mgcv) library(MuMIn) data(sleepstudy,package="lme4") mod1 <- gamm(Reaction ~s(Days), random=list(Subject = ~1), data=sleepstudy, family=quasi(link="identity", variance="mu")) The mystery of why MuMIn provides a logLik method for gamm objects is explained in a document called "Model selection with MuMIn and GAMM" which can be found here ... https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/inst/doc/gamm.pdf?revision=91&root=mumin&pathrev=92 "In the case of gamm and gamm4, the returned object has no special class, it is a list with two items: lme or mer, and gam (with some information stripped from it). Therefore no specific methods can be applied.The solution is to provide a wrapper function for gamm that evaluates the model and adds a class attribute onto it ... <technical details> It should be noted here that the issue of what the log-likelihood for GAMM should be is not entirely clear. The documentation for gamm states that the log-likelihood of lme is not the one of the fitted GAMM. However, comparing alternative models shows some evidence that it may be still appropriate for gamm. Namely the log-likelihood of fitted lme, and one of the lme part of gamm (including only linear terms to make the comparison adequate), have identical value ..." ?mgcv::gamm says: ?gamm? assumes that you know what you are doing! For example, unlike ?glmmPQL? from ?MASS? it will return the complete ?lme? object from the working model at convergence of the PQL iteration, including the `log likelihood', even though this is not the likelihood of the fitted GAMM. THere's an argument for using "quasi-AIC" in model selection problems <https://cran.r-project.org/web/packages/bbmle/vignettes/quasi.pdf> , but it seems mostly confined to wildlife ecologists ... https://stat.ethz.ch/pipermail/r-help/2003-July/035898.html
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models