[FORGED] Re: Cross validated log likelihood, redux.
Hi Ben. Thanks for getting back to me on this.
On 17/08/19 1:43 PM, Ben Bolker wrote:
Hmmm. The proximal problem is that the updated deviance function thinks it needs coefficients for all of the parameters (theta = 3 + beta=12), not just the theta parameters. It's not obvious to me (I know this might be code I wrote in the first place!) why you're trying to pass the full theta+beta coefficient vector to a model defined with nAGQ=0 (which estimates the beta coefficients as part of the penalized weighted resid sum of squares (pwrss) computation). Can you explain/remind us of the reason for the mismatch here?
The short answer is, I haven't got a clue. (Or perhaps, to put it differently, that I'm clueless. :-) ) Basically I have just been (mindlessly) following instructions from your very good self. Initially (20 April 2019) you said:
Here is an **inefficient** method for computing the likelihood
coefs <- unlist(getME(fit,c("theta","beta"))
newdev <- update(fit, data=VS, devFunOnly=TRUE)
newdev(coefs)
I tried that, and it didn't work. (Errors were thrown.) Then you updated the recipe (on 26 April 2019):
coefs <- unlist(getME(f.trn,"theta"))
newdev <- update(f.trn, data=VS, devFunOnly=TRUE,
control=glmerControl(check.nobs.vs.nRE="ignore"))
newdev(coefs)
having remarked that there were *two* issues. (One issue involved the specification of "coefs", the other involved having to "override some checking that glmer does".) In my implementation of the code to do a cross-check with the results from mixed_model() I neglected to take cognizance of the first issue. I.e. I left the assignment of "coefs" as:
coefs <- unlist(getME(fit,c("theta","beta"))
whereas I should have changed it to
coefs <- unlist(getME(f.trn,"theta"))
i.e. leaving out the "beta" term. Duhhhh!!! Having corrected my error I find that the code now works as before. Not surprisingly. Thank you for asking for an explanation of what I was doing, which prompted me to go back over the story with the result that I spotted the botch-up that I had made. (It's very difficult to spot errors when you are flying intellectually blind.) Bottom line: the harmony of the universe now seems to have been essentially restored. :-) Thanks again. cheers, Rolf P. S. In case anyone is interested, the log likelihood of the "validation" data set that I get from mixed_model() in the example that I provided is -35.26048. That from glmer() is -21.37535. I guess that one can say that they are "in the same ball park". Since I have no real understanding of the underlying intricacies, I have no idea what the implications of the difference between the two answers might be. R. T.
Honorary Research Fellow Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276