by-item random intercepts

3 messages · Chunyun Ma, Jake Westfall

#
Hello all,

I am facing a dilemma of whether or not I should include by-item random
intercepts in my model. Here are the details of my problem.

I have a dataset of repeated measures in which participants solved
single-digit arithmetic problems (e.g., 4 x 5, 2 + 7) and their response
latencies were recorded.

The dependent variable is response latency. The independent variables
include characteristics of the stimuli (i.e., level 1) and of the
participants (i.e., level 2).

I set up the structure of random effects following recommendations from
Barr et al. (2013). For simplicity, let's say the model contains one IV.

DV_ti = gamma00 + gamma10*IV_ti + u0i + u1i*IV_ti + I0t + r_ti

where i indexes subjects and t indexes items:
gamma00 and gamma10 are the fixed effects
u0i is the by-subject random intercept
u1i is the by-subject random slope
I0t is the by-item random intercept
r_ti is the residual

I used lme4 to fit the model:
lmer(DV ~ IV + (1 + IV | sub) + (1 | item), data = DT)

As I mentioned, the stimuli in my experiments are single-digit arithmetic
problems. Unlike stimuli such as English words, these form a small, closed
set: there are only 100 single-digit problems for each operation, and all
of them were included in my experiment. So here is my dilemma:

On one hand, a by-item random intercept would allow me to account for the
fact that there are repeated observations on each item and that
observations on the same item are not independent of each other.
On the other hand, a by-item random intercept implies that my items are a
sample from a larger population of items that were not included in my
experiment. However, this is not the case: I have included all single-digit
arithmetic problems in my experiment.

I could adopt a fixed-effects approach and use 100 dummy variables to
account for the item-based clustering, but this would be practically
impossible.

To reiterate my question:
should I include a by-item random intercept given this special feature of
my dataset?
A few follow-up questions:
what are the consequences of including/excluding this random effect? How
are Type I error and power affected?
Should I use a nested structure instead of the crossed one I have mentioned
above? For example, if each participant contributed multiple observations
on each item, should I nest the by-item random intercept under subject?
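In lme4 syntax, the two options I have in mind would look something like
this (same variable names as in the call above):

```r
library(lme4)

# Crossed (what I currently have): each item shared across subjects
m_crossed <- lmer(DV ~ IV + (1 + IV | sub) + (1 | item), data = DT)

# Nested alternative: a separate item intercept within each subject
m_nested <- lmer(DV ~ IV + (1 + IV | sub) + (1 | sub:item), data = DT)
```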

Thank you very much!

Chunyun
#
Hi Chunyun,

> As I mentioned, the stimuli in my experiments are single-digit arithmetic
> problems.
If you've really exhaustively sampled all possible stimuli that could have
appeared in your study, then I would argue that it doesn't make conceptual
sense to analyze the stimuli as random effects.

> I could adopt a fixed-effect approach and use 100 dummy variables to
> account for the item-based clustering but this would be practically
> impossible.
Is it? Have you tried it? Adding fixed effects usually increases the
computational burden *far* less than adding random effects. So while this
analysis might be a bit unwieldy, is it actually infeasible?
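For what it's worth, a sketch of the fixed-effects version, reusing the
variable names from your lmer() call, would be something like:

```r
library(lme4)

# Item as a fixed factor: one dummy per item (minus the reference level),
# keeping the by-subject random effects exactly as before
m_itemfixed <- lmer(DV ~ IV + factor(item) + (1 + IV | sub), data = DT)
```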

If the answer is yes, then a reasonable alternative is to simply ignore the
stimulus effects altogether. Practically speaking, the result is usually
much the same as explicitly adding stimulus fixed effects to the model. The
reason is that ignoring the stimulus effects (vs. adding them as fixed)
mainly just serves to throw the stimulus variance into the residual
variance, but unless your experiment is quite tiny, the residual variance
probably already contributes *very* little to the standard errors of the
fixed effect parameter estimates of interest. (Getting more into the
mathematical weeds, the residual variance enters the standard error
*roughly* as var(resid)/sqrt(n), where n is the number of rows -- this term
is probably already tiny unless your experiment is tiny, and it should
remain tiny even if you increase var(resid) by a lot.)
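Concretely, ignoring the stimulus effects just means dropping the
(1 | item) term from your call:

```r
library(lme4)

# Stimulus effects ignored: the item variance gets absorbed
# into the residual variance
m_noitem <- lmer(DV ~ IV + (1 + IV | sub), data = DT)
```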

Note however that the above is assuming that the stimulus effects are at
best weakly correlated with the other regressors. That assumption is likely
true in an experimental context, but to the extent that it is false,
omitting the stimulus effects could also alter the other fixed effect
parameter estimates.

> Should I use a nested structure instead of the crossed one I have
> mentioned above?
I don't see why you would do that.

Jake
On Wed, Sep 13, 2017 at 9:02 PM, Chunyun Ma <mcypsy at gmail.com> wrote:

#
Er, small correction, I meant that the residual variance enters the
standard error as roughly sqrt(var(resid)/n), not var(resid)/sqrt(n) :p
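To put rough, purely made-up numbers on it: with, say, n = 10,000
observations, even tripling the residual variance barely moves this term:

```r
# Illustrative only: how sqrt(var(resid) / n) scales with var(resid)
n <- 10000
sqrt(1 / n)  # 0.01
sqrt(3 / n)  # ~0.017
```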

Jake

On Wed, Sep 13, 2017 at 9:21 PM, Jake Westfall <jake.a.westfall at gmail.com>
wrote: