Skip to content

Logistic modelling with guessing parameter

4 messages · Peter Harrison, Ben Bolker, Ken Knoblauch

#
Hello everyone,

I'm currently modelling response data for a musical error detection experiment. Participants' responses can either be correct (1) or incorrect (0); the difficulty of the question is predicted by a number of factors, including accuracy (accurate performance -> errors are difficult to detect) and musical piece ("audio_name").

The model I'm fitting is similar to a 3-PL IRT model, with a constrained guessing parameter of 0.5 (i.e. the most difficult questions should be answered with a 50% success rate). However, instead of the normal conception of item difficulty, probability of correct response is given by a linear combination of difficulty-related predictors (such as accuracy and audio_name). Therefore:

probability of correct response = 0.5 + 0.5 / (1 + exp(predictors))

where predictors = e.g. a*accuracy + b*audio_name + c (but with audio_name dummy coded, with different b for each level of audio_name).

I've been using a script kindly posted by Ken Knoblauch back in 2010 (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q4/004531.html) to achieve this using lme4, simply changing a few terms to keep it up-to-date with current package versions (see bottom of message for the script). It seems to work fine with "accuracy" as a fixed effect and "p_ID" as a random effect. However, I want to add "audio_name" as a fixed effect, i.e. responses ~ accuracy + audio_name + (1|p_ID). But when I do this, I get the following error message: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate.

I'm not sure what this error message means, but have been assuming it is talking about a convergence problem. I've tried various things to fix it, including control=glmerControl(optimizer="bobyqa"), control=glmerControl(optimizer="Nelder_Mead"), and nAgQ=3, but none of these have worked.

Would anyone be able to help me with this error message, and what to do about it?

Thanks so much!
Peter


Key to data file:

accuracy = accuracy of the music clip (higher accuracy -> question is more difficult) = continuous variable
audio_name = musical piece played = categorical variable (27 levels, coded with text strings)
p_ID = participant ID = categorical variable (229 levels, coded 1:229)
behind_beat = whether clip was behind the beat or not = dichotomous (1 = TRUE, 0 = FALSE)

Script:

library(lme4)
data <- read.csv("https://dl.dropboxusercontent.com/s/szq2e2sxwfsuxo6/data.csv", header = TRUE)

mafc.logit <- function (.m = 2)
{
  .m <- as.integer(.m)
  if (.m < 2)
    stop(".m must be an integer > 1")
  linkfun <- function(mu) {
    mu <- pmax(mu, 1/.m + .Machine$double.eps)
    qlogis((.m * mu - 1)/(.m - 1))
  }
  linkinv <- function(eta) {
    1/.m + (.m - 1)/.m * binomial()$linkinv(eta)
  }
  mu.eta <- function(eta) ((.m - 1)/.m) * binomial()$mu.eta(eta)
  valideta <- function(eta) TRUE
  link <- paste("mafc.logit(", .m, ")", sep = "")
  structure(list(linkfun = linkfun, linkinv = linkinv, mu.eta = mu.eta,
                 valideta = valideta, name = link), class = "link-glm")
}

mod1 <- glmer(responses ~ accuracy + (1|p_ID), family = binomial(mafc.logit(2)), data=data)
summary(mod1)
#
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


  Quick possibility: PIRLS failures are often a symptom of the
algorithm hitting a non-finite (infinite/NaN) value internally (this
should be better instrumented).  I would try instrumenting your
link/inverse-link/dmu.eta functions to have them warn/stop if they
would be returning a non-finite value ...
On 15-05-07 01:13 PM, Peter Harrison wrote:
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJVS7pJAAoJEOCV5YRblxUHovcIAMYJkVuhG3eY6fCp8gI85feb
KLDB5K30K+LEF6duOZq2RTOqcLenQUJOrTPRzKpq/t6Hz6w+D0YAVA1ojuHI7/25
+KJrU31ndu2YGW7KB8YGj3BJ6Yn4gGMPD3MIHE0cmKaWYLWZ/+oTjOFhwYx1hqhU
erRaNdzvb4kT3m3YbAEluTuijTtHaMaFisjZiyBZ85FJ3MTc6rvicEaLTRyzul2l
Oduk8eNZvP0AXTIpvKP2SetT7gygsjq1+TAgi41XGvWWvXqqGA0PHC7colXEuAgh
OZKCn2aEb7YowsYCQuQ7In8o0S0UQU2Fzhfw/dewE9uguBXYYtLiOQYfHXK4qTE=
=7VFS
-----END PGP SIGNATURE-----
#
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

  PS I had a closer look and most of your individual groups seem to
show complete separation ... this is generally going to be a problem ...

making audio_name a *random* effect seems to work OK.  You could also
look into doing some shrinkage/penalization via blmer ...


library(plyr)
sumvals <- ddply(data,c("audio_name","accuracy"),
      summarise,
      responses=mean(responses))

library(ggplot2); theme_set(theme_bw())
ggplot(data,aes(accuracy,responses,colour=audio_name))+
    stat_sum()+geom_smooth(method="glm",
                           family = binomial(mafc.logit(2)),
                           se=FALSE)+
        geom_point(data=sumvals,alpha=0.25)+
        geom_line(data=sumvals,alpha=0.25)
m1 <- lme4::lmList(response~accuracy|audio_name,
                   family = binomial(mafc.logit(2)),
                   data=data)

debug(lme4::lmList)
ff2 <- function(dat, formula=responses~accuracy) {
    data <- as.data.frame(dat)
    glm(formula=formula, family=binomial(mafc.logit(2)), data)
}
ss <- split(data,data$audio_name)
## 4: OK
sapply(ss,function(d) {
           !any(coef(ff2(d))>1e6)
       })
ff2(ss[[6]])
mod1 <- glmer(responses ~ accuracy + (1|p_ID),
              family = binomial(mafc.logit(2)), data=data)
mod2 <- update(mod1,. ~. + audio_name)
mod3 <- update(mod1,. ~. + (1|audio_name))
summary(mod1)
On 15-05-07 01:13 PM, Peter Harrison wrote:
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJVS78SAAoJEOCV5YRblxUHt2QIAK5g3BvoQxKO3RLO9wwOFiU8
eIz77IpkDDMt+uec6dLp30I1/HlwVUuW2zlpK3mkEy4kAWF7YU8m4BtztvvLrDPX
mHfSH6oEgCqQHRbQwvH1FyqKt2Th7JW3pdfAsDfNKa0ctIwnK7qNmJJSKEG/jWtm
Z2ERNit8U2kHY0eEPhBwjPMBjxRqsR2gBncckqq0KntWw8l8QqBFJM3+qhPzLPB0
ftkA2tJfDkIZ0dqPTHqdeGyzW1IK84Pr8HObDiRBUMOgmS0UNhBXQWYutjCz6pwE
zuFzQoXvtvVBVVVPTgO8lOacX4IEgydYFncKh3L4wM1AdmJx73Wg8wi4eppXtuI=
=vRQM
-----END PGP SIGNATURE-----
#
Peter Harrison <pharr011 at ...> writes:
<< snip >>
back in 2010
/2010q4/004531.html) to achieve this using
error message: (maxstephalfit) PIRLS step-halvings
You ought to be able to use the link directly from 
the psyphy package.  A lot has changed with lme4
and a little with psyphy so that the modification from
then isn't necessary.

When I try your problem, I get
library(lme4)
library(psyphy)
data <- read.csv("https://dl.dropboxusercontent.com/
s/szq2e2sxwfsuxo6/data.csv", header = TRUE)
mod1 <- glmer(responses ~ accuracy + (1|p_ID), 
family = binomial(mafc.logit(2)), data=data)
summary(mod1)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( mafc.logit(2) )
Formula: responses ~ accuracy + (1 | p_ID)
   Data: data

     AIC      BIC   logLik deviance df.resid 
  6742.6   6762.8  -3368.3   6736.6     6180 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9571 -1.1363  0.4520  0.6233  0.9299 

Random effects:
 Groups Name        Variance Std.Dev.
 p_ID   (Intercept) 0.4454   0.6674  
Number of obs: 6183, groups:  p_ID, 229

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   4.1656     0.2771   15.03   <2e-16 ***
accuracy     -5.6355     0.3772  -14.94   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
         (Intr)
accuracy -0.971

with no errors