IRT: discrimination parameters from LMER?

8 messages · dave fournier, Paul Johnson, Doran, Harold +1 more

#
On Sat, Oct 30, 2010 at 4:54 PM, Doran, Harold <HDoran at air.org> wrote:
Thanks for answering!

I'm glad to learn I'm not grossly misunderstanding IRT/lmer. What do
you think about my observation about the fact that the difficulty
estimate from ltm seems to be systematically different/better than the
lmer estimate?

Do you care to weigh in on the question of whether lmer could ever be
made to fit a 2PL or 3PL model?  I mean, could I write a family for
lmer that would do that, or is the calculation required too grossly
different?

If 2PL or 3PL is really a whole different structure, I don't mind
using "ltm."  But defeats the ideal that you proposed in that 2007 JSS
article (which I really like!), that we might be better off fitting
specialized models with general purpose software.

Here's a similar example.  Sometimes I have wanted to fit ordinal
dependent variables, and have learned that those models do not fall
into the family of GLM and a tool different from lmer is necessary to
work on them.   I wish it weren't so, but have accepted reality.

pj

  
    
#
I don't think this is necessarily true. I've done a small experiment; all reproducible code is at the bottom of this email. I simulate some Rasch data and then estimate item parameters using ltm, lmer, jml (in MiscPsycho), and mml (a function I wrote, included below, using the traditional MML method). To make a long story short, I compute the bias under each model:
[1] -0.173386111948639
[1] -0.173386111948639
[1] 0.39007242168395
[1] -0.173386111948639

It seems to me that jml performs the worst (as expected), but all the other functions are comparable and recover the parameter estimates rather well. The mml function below is slow if you run it. But notice that the standard deviation from mml is exactly the same as what lmer gives. That is because a multilevel Rasch model with the parameterization I have used (items fixed, respondents random) is exactly the same as MML. But lmer's capacity goes much, much further. Had I not constrained the ltm function to a discrimination of 1, the common discrimination parameter would also equal the sd given by mml and lmer.
Yes, I'm sure it can. The issue is who will write the code. Doug Bates is extremely generous with his help and his openness to others making contributions to lme4. But (as he may do himself on this thread), he will forewarn you that the underlying code in lmer is very complex and doesn't resemble the methods commonly used in other packages. So, because Doug is so amazingly smart and I am just a mortal, I have decided to steer away from contributing to lmer, other than being a "toy tester" for Doug.

I am not a fan of the 3PL for many reasons. First, it is a model that cannot be identified without putting a very tight prior on the guessing parameter. In the end, because the prior is so tight, guess what your parameter estimates end up close to? Yes, the prior. So I have just never been comfortable with that.
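To make that identification problem concrete, here is a minimal base-R sketch of the 3PL item response function (the function name p3pl is mine, not from any package):

```r
## 3PL item characteristic curve: P(theta) = c + (1 - c) * plogis(a * (theta - b)),
## where a = discrimination, b = difficulty, c = lower asymptote ("guessing").
p3pl <- function(theta, a, b, c) c + (1 - c) * plogis(a * (theta - b))

## As theta -> -Inf the curve flattens out at c, so c is informed only by
## the sparse lower tail of the ability distribution:
p3pl(-6, a = 1, b = 0, c = 0.2)  # ~0.202, essentially the guessing floor
p3pl( 0, a = 1, b = 0, c = 0.2)  # 0.6
```

With few very-low-ability respondents, the data barely constrain c, which is why estimation typically leans on a tight prior, and why the estimates end up close to that prior.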
I think ltm is a really great package. It may have some capacity to do what you want, I don't fully know its capacities. Perhaps Dimitris can offer some thought.
This is very true. I have been doing some work on polytomous models, but am working very slowly on functions to estimate them. There are packages in R that can currently do this. I wonder if MCMCglmm can do it; I don't know.


library(ltm)
library(lme4)        # provides lmer()
library(MiscPsycho)  # provides simRasch() and jml()
library(statmod)     # provides gauss.quad.prob(), used by mml() below

Nitems <- 20
Nperson <- 500

set.seed(12345)
dat <- simRasch(Nperson, Nitems) 
itemDat <- dat$data

### Fit data using rasch in LTM
fm1 <- rasch(itemDat, constraint = cbind(ncol(itemDat) + 1, 1))

### Fit using lmer
itemDat$id <- 1:nrow(itemDat)
### reshape into long format
testScores <- reshape(itemDat, idvar='id', varying=list(names(itemDat)[1:Nitems]), v.names=c('answer'), timevar='question', direction='long')

# Treat items as fixed but students as random
fm2 <- lmer(answer ~ factor(question) - 1 + (1|id), testScores, family = binomial(link='logit'))


### Fit using JML
fm3 <- jml(dat$data, bias=TRUE)

### Fit using mml function below
fm4 <- mml(dat$data, Q =10, startVal = coef(fm1)[,1])

### Compute bias
trueVals <- dat$gen

### ltm bias
mean((coef(fm1)[,1] - mean(coef(fm1)[,1])) - trueVals)

### lmer bias
mean((fixef(fm2) - mean(fixef(fm2))) - trueVals)

### jml bias (uses fm3; an earlier version of this line mistakenly used fm1)
mean(coef(fm3) - trueVals)

### mml bias
mean((coef(fm4)[1:20] - mean(coef(fm4)[1:20])) - trueVals)



mml <- function(data, Q, startVal = NULL, ...){
	if(!is.null(startVal) && length(startVal) != ncol(data)){
		stop("Length of argument startVal not equal to the number of item parameters")
	}
	if(is.null(startVal)){
		p <- colMeans(data)
		startVal <- as.vector(log((1 - p)/p))
	}
	qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma = 1)
	data <- as.matrix(data)
	L <- nrow(data)
	C <- ncol(data)
	rr1 <- matrix(0, nrow = Q, ncol = L)
	u <- 0  # mean of the latent ability distribution
	fn <- function(b){
		s <- b[C + 1]   # sd of the latent ability distribution
		b <- b[1:C]     # item difficulties
		for(j in 1:Q){
			for(i in 1:L){
				## likelihood of response pattern i at node j, reweighted from the
				## N(0, 1) quadrature density to N(u, s^2)
				rr1[j, i] <- exp(sum(dbinom(data[i, ], 1, plogis(qq$nodes[j] - b), log = TRUE))) *
					((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j] - u)^2/(2*s^2))))/dnorm(qq$nodes[j]) * qq$weights[j]
			}
		}
		-sum(log(colSums(rr1)))
	}
	opt <- optim(c(startVal, 1), fn, method = "BFGS", hessian = TRUE)
	list("coefficients" = opt$par, "LogLik" = -opt$value,
		"Std.Error" = sqrt(diag(solve(opt$hessian))))
}
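For readers unfamiliar with the integration step inside mml(), here is a hedged base-R sketch of the same idea: the marginal probability of a correct response is the item response function averaged over the ability distribution. (mml() uses statmod::gauss.quad.prob for nodes and weights; the grid below is a crude midpoint rule so this example has no dependencies.)

```r
## Approximate E[P(correct | theta)] for theta ~ N(0, 1) and a Rasch item
## with difficulty b = 1, i.e. the integral of plogis(theta - 1) * dnorm(theta).
nodes   <- seq(-6, 6, length.out = 401)
weights <- dnorm(nodes) * diff(nodes)[1]  # midpoint-rule quadrature weights
marg    <- sum(weights * plogis(nodes - 1))
marg  # about 0.31 -- below 0.5 because b = 1 makes the item hard
```

MML maximizes the product of such marginal probabilities over response patterns, which is why the latent sd s is estimated jointly with the difficulties.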
#
Just noticed the bias for jml was wrong. It should be
[1] -0.173386111948639
#
On 11/2/2010 3:33 PM, Doran, Harold wrote:
well, the basic capabilities of ltm are described in the JSS paper 
(http://www.jstatsoft.org/v17/a5/paper), and more information about it, 
including R script files with many examples, can be found on the R-wiki 
webpage: http://rwiki.sciviews.org/doku.php?id=packages:cran:ltm

Best,
Dimitris

  
    
4 days later
#
Dear Harold,

Thanks for the example.  I'd not heard of MiscPsycho before; that's
interesting.

Your test case focuses on the difficulty parameters, and that's not
where I'm seeing trouble.  I agree with you that the difficulty
estimates from ltm match the fixed effects from lmer.

My concern is the discrimination estimate from ltm compared against
lmer's random-effect standard deviation.  If the discrimination
parameter is not 1.0, then I wonder whether your argument holds up.
In the one simulation I reported last time, I saw a pretty big
difference.

I'm adapting your simulation to do a lot of data sets.

But I can't amaze you with my results right now because of this weird
kernel error I've never seen before; I think I need to shut everything
down. I'll get back to you.
OMP: Error #15: Initializing libguide.so, but found libguide.so
already initialized.
OMP: Hint: This may cause performance degradation and correctness
issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore
this problem and force the program to continue anyway. Please note
that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may
cause undefined behavior. For more information, please contact
Intel(R) Premier Support.


Weird, huh?

pj

  
    
1 day later
#
I can't speak to the error below; perhaps the package author/maintainer can. I'm not sure I follow this last thread exactly. But here is what you should be seeing; if not, I too think there is an issue.

1) When you use lmer to fit the Rasch model, you get b-parameters for the items (difficulties) and a population distribution that is N(0, s^2).
2) When you use the rasch function in ltm and the discrimination is not set to 1, the items have a common slope. This common slope should be equivalent to s (the standard deviation from lmer). The parameter estimates will differ because the models are identified differently. I think if you want to compare the difficulties from ltm to those from lmer, you need b/s, where b is the difficulty of the item.
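The mapping in (2) can be sketched numerically. Assuming ltm's rasch fits logit P = a(z - b) while the lmer parameterization here is logit P = beta_j + u_i with u_i ~ N(0, s^2), matching the two gives s ≈ a and beta_j ≈ -a * b_j. The numbers below are made up for illustration, not fitted output:

```r
a    <- 0.95               # common discrimination from ltm (plays the role of s)
b    <- c(-2.6, 0.5, 2.3)  # hypothetical ltm difficulties
beta <- -a * b             # implied lmer fixed effects: 2.470 -0.475 -2.185
-beta / a                  # recovers b exactly, i.e. the b/s rescaling
```

So dividing the lmer fixed effects by the estimated sd (and flipping the sign) should land you back on the ltm difficulty scale.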

Dimitris, does rasch use D = 1 or 1.7 when the discrimination is not 1?
#
Paul,

Just a quick example of what I meant if you want to compare item parameters between lmer and rasch with discrimination estimated at some value other than 1. I figured I might as well make this thread more replicable.

Here is some code to simulate the data.

Nitems <- 20
Nperson <- 500

set.seed(12345)
dat <- simRasch(Nperson, Nitems) 
itemDat <- dat$data

### Fit data using rasch in LTM
fm1 <- rasch(itemDat)

### Fit using lmer
itemDat$id <- 1:nrow(itemDat)
### reshape into long format
testScores <- reshape(itemDat, idvar='id', varying=list(names(itemDat)[1:Nitems]), v.names=c('answer'), timevar='question', direction='long')

# Treat items as fixed but students as random
fm2 <- lmer(answer ~ factor(question) - 1 + (1|id), testScores, family = binomial(link='logit'))

Now, if you compare the two fits (e.g., coef(fm1) against fixef(fm2)), the estimates are different. I won't paste all the output here since the code is replicable and the seed is set, so we can compare output. The discrimination value from rasch in this case is 0.949 and the standard deviation of the random effects from lmer is 0.946. The log-likelihood under rasch and lmer is -4630.15799618799 and -4631.25819268, respectively. So the sd and the log-likelihood are pretty much the same between the two functions; so far so good. But the item difficulties appear different. Why is this?

If you think about the parameterization of the rasch model, you have an "a" parameter, which is the slope of the item. Under lmer, that value is fixed at 1 for all items. But, for rasch, it is estimated (the value is .949) and it is common across all items. 

So now, if you want to compare the difficulties between rasch and lmer when the discrimination under rasch is not 1, you need to rescale the ltm difficulties by the common discrimination (something like coef(fm1)[, 1] * coef(fm1)[, 2]), which gives:
V1                  V2                  V3                  V4                  V5                  V6                  V7 
-2.4363375076723894  2.1282095647448513 -3.2456020170777404 -0.9922074612193649  1.0940596551348656 -1.4747653022333751 -1.9674236315390756 
                 V8                  V9                 V10                 V11                 V12                 V13                 V14 
 1.1286690341041918  3.0183007620878102  2.7669665964796186 -0.5520550496701296  2.2850817917913258  0.6125781984255096 -0.8937278122026954 
                V15                 V16                 V17                 V18                 V19                 V20 
 1.7286473472431816  0.0286736514649762 -2.7387309033941731  1.2960560094822793 -0.6529545363519687  1.3839487252561851

Now if we look at the lmer difficulties we have
factor(question)1   factor(question)2   factor(question)3   factor(question)4   factor(question)5   factor(question)6   factor(question)7 
 2.4361348266711036 -2.1291740694356793  3.2416948820925300  0.9945299229893042 -1.0961119907031844  1.4771362900874057  1.9688762349930169 
  factor(question)8   factor(question)9  factor(question)10  factor(question)11  factor(question)12  factor(question)13  factor(question)14 
-1.1307455947353959 -3.0161274136566423 -2.7657709971218583  0.5536329523565644 -2.2855834341023469 -0.6139641860890157  0.8959339499969335 
 factor(question)15  factor(question)16  factor(question)17  factor(question)18  factor(question)19  factor(question)20 
-1.7304539905559007 -0.0286143300422378  2.7371985048319152 -1.2981830712160072  0.6547480328455485 -1.3866471195258758

So, we can see that lmer and rasch generate the same estimates of all parameters, up to a sign flip: lmer's fixed effects are -a*b for each item, while the rescaled ltm values above are a*b. Is this helpful?
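As a sanity check on that sign/scale relationship, the first few printed values do agree up to sign (numbers copied from the two output blocks above):

```r
## First three rescaled ltm difficulties vs. lmer fixed effects
ltm_rescaled <- c(-2.4363375, 2.1282096, -3.2456020)
lmer_fixef   <- c( 2.4361348, -2.1291741, 3.2416949)
max(abs(ltm_rescaled + lmer_fixef))  # largest discrepancy is under 0.005
```

The small residual differences are just optimizer and quadrature noise between the two packages.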