
Specify the appropriate model for an Event Related Potentials (ERPs) study: what should I do with trial order (and other terms)

Dear Phillip,

Thanks for your reply; I will try to motivate my choices better and add 
some further observations.
On 10/11/2016 13:26, Phillip Alday wrote:
Thanks for the links. I have already coded my categorical variables with 
sum coding, or more precisely with deviation coding, and I centered the 
continuous predictors at 0. This should also help reduce the 
correlations in the variance-covariance matrices, and indeed the 
correlations are below 0.2.
I would add that the work of van Rij and Wieling is also inspiring.
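For concreteness, this is roughly how that coding looks in R (the data 
here are made up purely for illustration):

```r
# Illustration only: invented predictors, just to show the coding scheme.
d <- data.frame(
  typicality = factor(rep(c("typical", "atypical"), each = 4)),
  frequency  = c(2.1, 3.5, 1.8, 4.0, 2.9, 3.2, 1.5, 3.8)
)

# Deviation coding: -0.5/+0.5 instead of treatment coding's 0/1, so the
# intercept is the grand mean and the slope is the full condition difference.
contrasts(d$typicality) <- contr.sum(2) / 2

# Center the continuous predictor at 0.
d$frequency_c <- d$frequency - mean(d$frequency)
```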
I think we already had an exchange on this point a few months ago. I 
may not have specified that the analyses are carried out on two (small) 
subsets of contiguous electrodes (11 and 13 electrodes out of 60), where 
I would assume that all electrodes are representative of the effect, 
with some variation. For this reason I included channel as an 
intercept-only random-effect term. I keep doing this because the 
model's fit improves and I had no convergence problems. I do not use 
this factor in the fixed-effect structure because I assume that 
differences between the selected electrodes are not relevant.
I do not think this is the way to go for our experimental design. In 
fact we focused on a large number (85) of semantic categories (maybe 
you are thinking of a handful of categories such as living/non-living), 
and we were interested in Typicality across all these categories. I 
also think it is correct to model random slopes for Typicality over 
semantic categories, because for each semantic category we have 2 
different words (one typical and one atypical member), so the 
manipulation is within-item and the single item is the semantic 
category. I believe that including Typicality as a random slope for 
item (semantic category) makes the analysis more conservative and the 
model "more" maximal: each semantic category is associated with a 
larger or smaller typicality difference between its pair of words, and 
adding the slopes should relax the assumption that this difference is 
the same across categories.
I would agree with you if we were looking at differences between a few 
categories. But since we are not, I guess the following would be OK:

mean_voltage ~ 1 + typicality * education * frequency + 
(1 + ... | subject) + (1 + ... | item)   [not necessarily + (1 | ch)]
As explained above, I would keep slopes for item because an item is not 
a single lemma but a pair of lemmas.

(1 + typicality + frequency | subject) + (1 + education + typicality | item)

Now, this is a random structure that I like because it is simpler than 
the one I had.
OK, so random adjustments for trial order would only mitigate the 
population-level effect of trial order; they would not help to better 
estimate the other terms (by "absorbing" some variance) unless I also 
include the term in the fixed-effect structure. If I include it only in 
the fixed part, as a main effect, the model can explain the variance 
associated with fatigue or adaptation to the experimental setting, but 
it should not affect (interact with) the Typicality manipulation if the 
lists were properly "randomized" (so I can motivate the choice of not 
looking at the interaction between trial order and the experimental 
factors)... it makes sense.
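As a quick sanity check on that randomization argument, a small base-R 
simulation (the 170-trial count follows from the 85 categories x 2 
words above; everything else is invented):

```r
# Toy check: with randomized lists, trial position is (in expectation)
# uncorrelated with the typicality contrast, so omitting the
# trial-by-typicality interaction seems defensible.
set.seed(42)
cond <- rep(c(-0.5, 0.5), 85)    # deviation-coded typicality, 170 trials
cors <- replicate(1000, cor(seq_along(cond), sample(cond)))
mean(cors)   # close to 0: no systematic trial-by-condition confound
```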
To sum up: I will start with the following model, driven by the 
experimental design:

mod_between = lmer(eeg ~ trial + typicality * frequency * edu ...)

the random structure will be conservative but not over-specified:

(1 + typicality + frequency | subject) + (1 + education + typicality | item)

To test the influence of the cognitive variables on the ERP pattern 
associated with typicality, I will keep the very same random structure 
and perform likelihood-ratio tests between nested models, such as:

model_PredX = lmer(eeg ~ trial + typicality * frequency * edu + 
PredictorX ...)
model_PredXinteraction = lmer(eeg ~ trial + typicality * frequency * edu + 
PredictorX + typicality:PredictorX ...)
anova(model_PredX, model_PredXinteraction)
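A minimal runnable sketch of that comparison, on simulated toy data 
(the numbers and effect sizes are invented, and I drop the random 
slopes here only to keep the sketch short):

```r
library(lme4)

# Simulated toy data, only to show the mechanics of the nested comparison;
# variable names follow the discussion above, values are made up.
set.seed(1)
dat <- expand.grid(subject = factor(1:20), item = factor(1:30))
dat$typicality <- ifelse(as.integer(dat$item) %% 2 == 0, 0.5, -0.5)
dat$PredictorX <- rnorm(nrow(dat))
dat$eeg <- rnorm(nrow(dat)) + 0.3 * dat$typicality * dat$PredictorX

# Fit with ML (REML = FALSE) so the likelihood-ratio test is valid;
# anova() would refit under ML anyway, but being explicit is clearer.
m0 <- lmer(eeg ~ typicality + PredictorX + (1 | subject) + (1 | item),
           data = dat, REML = FALSE)
m1 <- update(m0, . ~ . + typicality:PredictorX)
anova(m0, m1)   # chi-square test on 1 df for the added interaction
```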

Keeping a less complex random structure also has the benefit of saving 
some degrees of freedom, which should make it easier to detect 
higher-order interactions.

My mind seems now a little bit clearer.
Thank you very much!
Paolo