I'm using lme4 0.999999-0 to fit some 0/1 response data with a logistic regression and I get glm.fit: "fitted probabilities numerically 0 or 1 occurred". I've read Ted Harding's explanation <http://r.789695.n4.nabble.com/glm-fit-quot-fitted-probabilities-numerically-0-or-1-occurred-quot-td849242.html>, but I still don't know how to work around this.

More specifically, this is a Rasch-style model. When I fit a smaller model that has only effects (intercepts), not covariates, glmer() converges fine. When I add a matrix of covariates, which are mainly low-frequency counts (lots of zeros), I get "fitted probabilities numerically 0 or 1 occurred".

I can fit the model including covariates under JAGS, but I was hoping to be able to fit it under lmer because MCMC can be very slow. Perhaps there are some ways I could reparametrize the model?

Another idea is that calling glmer(..., verbose=TRUE) gives the error on the very first iteration, i.e.,

    0: nan: 0.180462 0.0723839 ...
    Warning messages:
    1: glm.fit: fitted probabilities numerically 0 or 1 occurred
    2: In mer_finalize(ans) : gr cannot be computed at initial par (65)

Would it be worth trying to specify good starting values? If so, is there a way to extract estimates from an object of class mer that can be easily passed to glmer(..., start=...)?
working around glm.fit: "fitted probabilities numerically 0 or 1 occurred"
4 messages · Chris Howden, S Ellison, Jack Tanner
Jack,

This could be occurring due to separation, which is when a predictor perfectly separates your response into 1's or 0's. You can check for this by running some tables, perhaps guided by removing predictors from your model until you no longer receive the message. The last predictor removed is then perhaps the one causing separation.

Next, think about what this means. We do this analysis to understand which predictors are related to an event. If the event either always occurs or never occurs for some values of a predictor, then maybe I don't need an analysis to tell me what's going on. When I get this, I often find I can simply remove the predictor from the model but comment on its effect in my discussion. Or, if I am doing predictive modelling, then I use a two-stage model that predicts using a predictor × response table, and then a logistic regression for those cells that don't have probability 0 or 1.

Chris Howden
Founding Partner, Tricky Solutions
chris at trickysolutions.com.au
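The "run some tables" check suggested above can be sketched in base R. This is only an illustration on simulated data: the data frame `d`, the predictors `x1` and `x2`, and the helper name `has_zero_cell` are all hypothetical, and `x2` is deliberately constructed so that it is nonzero only when the response is 1.

```r
# Hypothetical simulated data: x1 is an ordinary sparse count,
# x2 is a sparse count that is nonzero only when y == 1 (separation).
set.seed(42)
n  <- 200
x1 <- rpois(n, 0.5)
y  <- rbinom(n, 1, plogis(-0.5 + 0.3 * x1))
x2 <- ifelse(y == 1, rpois(n, 0.2), 0)
d  <- data.frame(y, x1, x2)

# Cross-tabulate "predictor is nonzero" against the response.
# An empty cell means the response never (or always) occurs for
# that level of the predictor, which points to separation.
has_zero_cell <- sapply(d[c("x1", "x2")], function(x) {
  tab <- table(nonzero = x > 0, y = d$y)
  any(tab == 0)
})
print(has_zero_cell)

# Inspect the offending table directly.
print(table(x2_nonzero = d$x2 > 0, y = d$y))
```

With many covariates, the same loop over columns gives a quick shortlist of predictors to examine before dropping terms from the model one at a time.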
On 24/08/2012, at 1:40, Jack Tanner <ihok at hotmail.com> wrote:
> [snip]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
This does not necessarily mean the fit has not converged; it is usually just a warning that some predicted probabilities at some point during the process of fitting were so close to 0 or 1 that they cannot be properly represented in finite precision arithmetic. That does not, of itself, prevent convergence, and you have already nicely demonstrated that the problem occurred on early iterations but not later, better, estimates.
Are you sure you need to work round it?
S
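To make the finite-precision point concrete, here is a small base R sketch. The linear predictor values in `eta` are made up for illustration. The threshold mirrors the check that glm.fit applies for binomial fits, which compares fitted probabilities against 10 * .Machine$double.eps; treat that detail as my reading of glm.fit rather than something stated in this thread.

```r
# Hypothetical linear predictor values on the logit scale.
eta <- c(10, 20, 40)
p   <- plogis(eta)   # inverse logit

# For eta = 40, exp(-40) is far below machine epsilon, so the
# probability is stored as exactly 1 in double precision.
print(p == 1)

# Flag probabilities that are numerically indistinguishable from 0 or 1.
eps     <- 10 * .Machine$double.eps
flagged <- p > 1 - eps | p < eps
print(flagged)
```

A single large coefficient acting on a rarely nonzero covariate is enough to push a few observations past this threshold, which fits the observation that the warning can appear during early iterations without ruling out convergence later.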
On 8/24/2012 6:43 AM, S Ellison wrote:
> [snip]
I got it! But I still have a question: why does the start parameter to lmer() take only ST and fixef, and not starting ranef values?

I did need to work around lmer's warnings. On the one hand, lmer did produce a fitted mer object. On the other, it printed only a single iteration under verbose=TRUE; it never performed additional iterations, and the estimates from verbose contained a lot of NaN values. The warnings I got were

    Warning messages:
    1: glm.fit: fitted probabilities numerically 0 or 1 occurred
    2: In mer_finalize(ans) : gr cannot be computed at initial par (65)

What worked was that I initialized a new run of lmer(..., start=my_start), where my_start used ST and fixef values from a successful fit that did not include the covariate matrices. The first time I tried this, I got an additional warning:

    Warning messages:
    1: glm.fit: fitted probabilities numerically 0 or 1 occurred
    2: In sort(names(start)) == sort(names(FL)) : longer object length is not a multiple of shorter object length
    3: In mer_finalize(ans) : gr cannot be computed at initial par (65)

The new warning (2) made sense, because the covariate matrices require the estimation of additional parameters not included in the previous fit. I padded the fixef component of my_start with some rnorm(mean=0, sd=.2) values, and presently, lmer is on iteration 67!

So, why doesn't lmer's start parameter take ranef values?
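For readers on a current lme4 (1.0 or later), a rough sketch of the same warm-start idea follows. It is not the 0.999999-0 interface used in this thread: newer versions take start = list(theta = ..., fixef = ...) instead of ST, and theta can be extracted with getME(). The simulated data, the variable names (`person`, `item`, `x`) and the single covariate are all assumptions for illustration.

```r
library(lme4)

# Hypothetical Rasch-style data: crossed person and item intercepts
# plus one sparse count covariate x.
set.seed(1)
n_person <- 40
n_item   <- 10
d <- expand.grid(person = factor(seq_len(n_person)),
                 item   = factor(seq_len(n_item)))
d$x <- rpois(nrow(d), 0.3)
ability <- rnorm(n_person)
easiness <- rnorm(n_item, sd = 0.5)
eta <- ability[as.integer(d$person)] + easiness[as.integer(d$item)] + 0.4 * d$x
d$y <- rbinom(nrow(d), 1, plogis(eta))

# Step 1: fit the smaller model without the covariate.
fit0 <- glmer(y ~ 1 + (1 | person) + (1 | item),
              data = d, family = binomial)

# Step 2: reuse its estimates as starting values, padding fixef with a
# small random value for the new coefficient (as described above).
start <- list(theta = getME(fit0, "theta"),
              fixef = c(fixef(fit0), x = rnorm(1, mean = 0, sd = 0.2)))

# Step 3: fit the larger model from those starting values.
fit1 <- glmer(y ~ x + (1 | person) + (1 | item),
              data = d, family = binomial, start = start)
print(fixef(fit1))
```

The length of start$fixef has to match the number of fixed-effect coefficients in the larger model, which is the current-version counterpart of the length mismatch behind warning (2) above.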