
How to use mixed-effects models on multinomial data

6 messages · Linda Mortensen, Emmanuel Charpentier, Jonathan Baron +1 more

#
Dear list members,
 
In the past, I have used the lmer function to model data sets with crossed random effects (i.e., of subjects and items) and with either a continuous response variable (reaction times) or a binary response variable (correct vs. incorrect response). For the reaction time data, I use the formula:
lmer(response ~ predictor1 * predictor2 ....  + (1 + predictor1 * predictor2 .... | subject) + (1 + predictor1 * predictor2 .... | item), data)
And for the binomial data, I use the formula: 
lmer(response ~ predictor1 * predictor2 ....  + (1 + predictor1 * predictor2 .... | subject) + (1 + predictor1 * predictor2 .... | item), data, family="binomial").
 
I'm currently working on a data set for which the response variable is number of correct items with accuracy ranging from 0 to 5. So, here the response variable is not binomial but multinomial. I want to stay within the mixed-effects model framework, but am not sure how to modify the lmer function formula so that it will work on ordered multinomial data. I am not even sure whether this function can handle this kind of data at all.
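One possible reformulation, offered here only as a hedged sketch with toy stand-in data (it is not an approach proposed in this thread): since each response is a count of successes out of 5 known trials, it can be handed to a binomial mixed model as a two-column matrix of (successes, failures) via cbind().

```r
library(lme4)

## toy stand-in data, NOT the thread's data set:
## 10 subjects crossed with 12 items, counts out of 5
set.seed(1)
d <- expand.grid(subject = factor(1:10), item = factor(1:12))
d$predictor1 <- rnorm(nrow(d))
d$correct <- rbinom(nrow(d), size = 5, prob = plogis(d$predictor1))

## number correct out of 5 as binomial counts with crossed random effects
m <- glmer(cbind(correct, 5 - correct) ~ predictor1 +
             (1 | subject) + (1 | item),
           data = d, family = binomial)
summary(m)
```

This keeps the analysis within the familiar logit framework, at the cost of assuming the five words within a list are conditionally independent given the random effects.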
I have tried to model the same data using the DPolmm function in the DPpackage, but this function doesn't seem to accept two random effect terms, at least it produces an error message when I enter "random = ..." twice. 
 
Does anyone know which function to use here? Any advice is very much appreciated. 
 
If this mailing list does not deal with inquiries of this kind, I apologise, but I would appreciate it if someone could redirect me to a more suitable list. Thanks.
 
Linda 
 
 
Linda Mortensen
Post-doctoral research fellow
Department of Psychology
University of Copenhagen 
Øster Farimagsgade 2A
1353 Copenhagen K
Denmark
Tel.: +45 3532 4889
E-mail: linda.mortensen at psy.ku.dk
#
On Wednesday 27 May 2009 at 18:08 +0200, Linda Mortensen wrote:
Huh ?

Treating it as a "pure class" variable loses the (essential) ordering
information. Unless this ordering information (which seems to an
ignorant outsider the most important information about your subjects) is
essentially irrelevant to your problem, I'd rather use your number of
correct items as a "rough" measure of a numeric variable, and accept, as
a first approximation, its non-continuity as part of the experimental
error.

This approximation may be too rough with only 5 items, though.
Furthermore, depending on your beliefs on the cognitive model involved
in giving a "correct" response, the distance between 0 and 1 correct
response(s) may be close to or very different from the distance between
4 and 5 correct responses, which is exactly what the proportional
odds model (polr) tries to capture.

V&R4 (pp. 204 sqq.) explains that an ordered logistic regression is but
a set of logistic regressions on the (nested) dichotomies induced by the
ordered response. It points to a seminal paper, McCullagh (1980),
"Regression models for ordinal data (with discussion)", JRSS B 42:109-42,
and to McCullagh's book (to which I do not have access).

Maybe working with glmer's mixed-effects logistic regression as a
building block would allow one to build (somewhat inefficiently)
something close to what polr does?
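The cumulative-logit idea sketched above can also be fitted directly. One later route, not mentioned in this thread, is the `ordinal` package, whose clmm() fits proportional-odds mixed models with crossed random effects; the data below are toy stand-ins, not the poster's.

```r
library(ordinal)

## toy stand-in data: 10 subjects crossed with 12 items
set.seed(1)
d <- expand.grid(subject = factor(1:10), item = factor(1:12))
d$predictor1 <- rnorm(nrow(d))
d$correct <- ordered(rbinom(nrow(d), size = 5, prob = plogis(d$predictor1)))

## cumulative-link (proportional-odds) mixed model, ordered response 0 < ... < 5
m <- clmm(correct ~ predictor1 + (1 | subject) + (1 | item), data = d)
summary(m)
```

This estimates one set of cutpoints plus a common slope, which is exactly the polr-with-random-effects structure Emmanuel describes.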

What do you think ?
I didn't know this one...
IMHO, you are on the suitable list. But your problem probably isn't
very common...
#
I had already replied to Linda Mortensen, but Emmanuel Charpentier's
reply gives me the courage to say to the whole list roughly what I
said before, plus a little more.

The assumption that 0-1, 1-2, ... 4-5 are equally spaced measures of
the underlying variable of interest may indeed be incorrect, but so
may the assumption that the difference between 200-300 msec reaction
time is equivalent to the difference between 300-400 msec (etc.).
Failure of the assumptions will lead to some additional error, but, as
argued by Dawes and Corrigan (Psych. Bull., 1974), not much.  (And you
can look at the residuals as a function of the predictions to see how
bad the situation is.)  In general, in my experience (for what that is
worth), you lose far less power by assuming equal spacing than you
lose by using a more "conservative" model that treats the dependent
measure as ordinal only.

Occasionally you may have a theoretical reason for NOT treating the
dependent measure as equally spaced (e.g., when doing conjoint
analysis), or for treating it as equally spaced (e.g., when testing
additive factors in reaction time).

In the former sort of case, it might be appropriate to fit a model to
each subject using some other method, then look at the coefficients
across subjects.  (This is what I did routinely before lmer.)

Jon
On 05/28/09 14:35, Emmanuel Charpentier wrote:
I think that the second random effect term should be (0 + ...), since
there is already an intercept in the first one.

#
On Thu, May 28, 2009 at 9:24 AM, Jonathan Baron <baron at psych.upenn.edu> wrote:
I'm glad to see you write that, Jonathan.  I don't have a lot of
experience modeling ordinal response data but my impression is that
there is more to lose by resorting to comparatively exotic models for
an ordinal response than by modeling it with a Gaussian "noise" term.
In cases like this where there are six levels, 0 to 5, I think your
suggestion of beginning with a linear mixed-effects model and checking
the residuals for undesirable behavior is a good start.
I don't think so.  It is quite legitimate to have random effects of
the form (1|subject) + (1|item) and the formula above is a
generalization of this.  An additive random effect for each subject is
not confounded with an additive random effect for each item.

I would be more concerned about the number of random effects per
subject and per item when you have a complex formula like 1 +
predictor1 * predictor2 on the left hand side of the random-effects
term.  If predictor1 and predictor2 are both numeric predictors this
might be justified but I would look at it carefully.
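The linear mixed-model route discussed here can be sketched as follows; the data frame and variable names are toy stand-ins for the real data set, not the thread's own code.

```r
library(lme4)

## toy stand-in data: 10 subjects crossed with 12 items,
## integer response clipped to the 0..5 range
set.seed(1)
d <- expand.grid(subject = factor(1:10), item = factor(1:12))
d$predictor1 <- rnorm(nrow(d))
d$correct <- pmin(5, pmax(0, round(2.5 + d$predictor1 + rnorm(nrow(d)))))

## Gaussian mixed model treating 0..5 as equally spaced,
## with intercept-only crossed random effects
m <- lmer(correct ~ predictor1 + (1 | subject) + (1 | item), data = d)
summary(m)
```

Random slopes (e.g. `(1 + predictor1 | subject)`) can then be added and compared against this simpler fit, per the caution about the number of random effects above.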
#
Thanks to all of you for your detailed comments. I find them very useful, although some of them point in different directions.

First, I should explain the structure of my data set in more detail: 

In the data set, each "item" is a list of 5 words. In an earlier analysis I carried out on these data, the response variable was the accuracy of recalling each word list (list recall). So, either subjects recalled a list correctly (i.e., recalled all 5 words in the list correctly), or they did not. Because the response in this analysis was binary, I used the mixed logit model. (Note that in my original e-mail, I only wanted to show the general structure of the lmer() formula that I'm using. The formula I'm actually using looks like this: lmer(response ~ predictor1 + predictor2 + predictor2 * predictor3 + (1 + predictor1 + predictor2 + predictor2 * predictor3 | subject) + (1 | item), data, family="binomial").) In short, I have random slopes for my subjects, but no random slopes for my items. This is because all three predictors are item-specific properties, and because I want to control for any variation between subjects in their sensitivity to these properties. On the basis of model comparisons, I then gradually simplify this initial model.

Now, the analysis I'm currently struggling with is carried out on the same data set, but the response variable is now the accuracy of recalling each word in a list (item recall), with subjects recalling either 0, 1, 2, 3, 4, or 5 words correctly. So, there are six, rather than two, possible responses. It is true that for each item, the response is still either correct or incorrect, but since it is the response for the entire list that concerns me, I would describe the responses as multinomial. Below, you see a subset of the trials in my data set:  

Subject  Trial  Item  W1  W2  W3  W4  W5  Predictor1  Predictor2  Predictor3  Correct
      1      3     9   1   1   0   1   1           1           1           1        4
      1      4    12   1   0   0   0   1           1           1           0        2
      1      5     4   0   0   0   0   0           1           1           1        0
      1      6     6   1   1   1   1   1           1           2           1        5

 

Professors Baron and Bates suggest that I use a linear mixed-effects model and, as a consequence, disregard the information that is contained in the ordering of my six possible responses. They further suggest that I plot the residuals against each of my predictors. This is to get an idea of how well the model fits the observed pattern for each of my predictors, right? If, say, for predictor1 the residuals are very large, that would mean that the model has fitted the pattern of this predictor very poorly, right? I have produced the lmer model and have tried to make the residual plots, but have not succeeded. I can plot the residuals against the fitted values (but I have to admit that I find it difficult to make sense of the plot), but how do I make separate plots for each of my predictors? Please let me know if I have misunderstood something here.

Linda

 

________________________________

From: r-sig-mixed-models-bounces at r-project.org on behalf of Douglas Bates
Sent: Thursday 28-05-2009 20:13
To: Jonathan Baron
Cc: r-sig-mixed-models at r-project.org; Emmanuel Charpentier
Subject: Re: [R-sig-ME] How to use mixed-effects models on multinomial data
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
On 05/29/09 19:13, Linda Mortensen wrote:
I don't think you "disregard order."  You simply count the number of
correct recalls, 0-5, and use that number as your dependent variable.
I don't see how that disregards order, unless you meant something else
by order, like the order in which the items were recalled.  What you
disregard is "ordinal regression".

 They further suggest that I plot the residuals against each of
I didn't mean "each predictor."  Rather, plot a graph of the residual
as a function of the predicted response, just as you would do with
ordinary regression.  (It is one of the default outputs for lm().)

 I have produced the lmer model and have tried to make the
Yes.  I think you have what you want.  If the residuals are in one
direction (high, or low) at one end or the other (left side or right
side), then your assumption that the response is predicted linearly
from the predictors is wrong.  You can also check for
homoscedasticity.  (My spell checker chokes on this one no matter how
I spell it.)
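The check described above can be sketched in base R; this is the same residual-vs-fitted plot that lm() produces by default, shown here on a toy fit (the variable names are illustrative only).

```r
## toy data and fit, standing in for the actual lmer model
set.seed(2)
x <- rnorm(100)
y <- x + rnorm(100)
fit <- lm(y ~ x)

## residual as a function of the predicted response
plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)

## reading the plot:
## - residuals drifting high or low at either end -> the linearity
##   assumption is suspect
## - a funnel shape -> heteroscedasticity
```

For an lmer fit the same idea applies with `plot(fitted(m), resid(m))`.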

Jon