Dear Paolo,
Using Subject (or Participant, if you want to avoid some ambiguity in a linguistic context) and Item as grouping factors is fairly standard in the ERP literature, as is the use of LMMs with the mean amplitude in a given time window as the dependent variable (at least when mixed models are used; many colleagues seem reluctant to abandon ANOVA despite Clark 1973, Judd, Westfall and Kenny 2012, and many other papers emphasising the advantages of explicit regression over ANOVA). GAMMs over the whole time course of the ERP are still relatively new and not widely used, although people like Harald Baayen and Stefanie Nickels are working on this. Beyond the increased computational complexity, GAMMs also suffer from the whole *additive* bit, which can be addressed, but is difficult for cognitive neuroscience, where the interactions are often the most interesting bits.
I hesitate to use Channel as a grouping variable, although this is the approach taken by e.g. Payne et al. (2015), because the distribution of effects for channels is not multivariate normal (the assumed distribution in lme4) for most references. Indeed, we know that Channel effects vary systematically (this is the whole notion of "topography" in EEG), and I personally feel that we should actually model channel effects parametrically using a suitable coordinate system, such as the one that the 10-20 system is actually based on (angular deviations from the apical electrode). However, this is again much more complex. Including Channel as a categorical fixed effect is also not particularly satisfying, as this will add n_chans-1 coefficients for the main effect of channel as well as many interaction terms. You could potentially have regions of interest (ROIs) / topographical factors (left-right, anterior-posterior) in your model fixed effects and then either ignore channel (as is actually done for the traditional rmANOVA analysis of ERP data) or include an intercept-only random-effect term for channel under the assumption that there is a multivariate normal distribution of effects within a given ROI. However, this assumption will generally only hold for high-density setups with topographically small ROIs. Larger ROIs will of course show systematic variation as you move from one edge to another. And you will also run into problems if the number of channels within each ROI is small, as this will bias your random-effect estimates: remember that random effects are *variance* components and, like all variance estimates, they require several observed levels for accurate estimation. (One rule of thumb I've heard is 10ish.) And as Payne et al. saw in their own data, the channel factor typically doesn't help with model fit anyway and can hurt convergence, so I would just leave it out completely if you don't want to model it parametrically.
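As a rough illustration of the "several observed levels" point, here is a minimal base-R sketch. It is deliberately idealised: the level effects are drawn and observed directly, whereas in a real mixed model they are latent and estimated alongside residual noise, so real estimates will be noisier still. The function name estimate_sd and the values of true_sd and n_sims are made up for the example.

```r
# Idealised sketch: how stable is an SD estimate based on k grouping levels?
set.seed(1)
true_sd <- 1  # assumed "true" between-level SD (hypothetical value)

# Draw n_levels level effects n_sims times and return the SD estimated each time
estimate_sd <- function(n_levels, n_sims = 2000) {
  replicate(n_sims, sd(rnorm(n_levels, mean = 0, sd = true_sd)))
}

# Compare the spread of the estimates for few vs. many levels
for (k in c(3, 5, 10, 30)) {
  est <- estimate_sd(k)
  cat(sprintf("levels = %2d: central 90%% of SD estimates = [%.2f, %.2f]\n",
              k, quantile(est, 0.05), quantile(est, 0.95)))
}
```

With only a handful of levels, the estimated SD ranges widely around the true value; the interval narrows as the number of levels grows.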
I'm not sure why you mentioned "semantic category" in your random-effect structure. In my experience, semantic category is typically something for which we care about the individual levels (of which there are not that many in any one experiment) and so is better modelled by fixed effects. (In other words, we care about the differences in processing between Furniture and People, not just that different categories show differences.) Items are good random effects, semantic categories are not. And it's not a problem if each item only belongs to certain semantic categories. lme4 can handle such nesting structures. If you only have a few semantic categories, then you'll also run into computational / statistical trouble with treating them as random effects (see the previous paragraph).
In short, I would propose the following model structure:
mean_voltage ~ 1 + typicality * education * frequency * semantic_category + (1+... | subject) + (1+ ... | item)
Your particular choice of which slopes to include for each random-effect grouping term is a difficult one, as has been highlighted by Baayen et al. (2008), Barr et al. (2013), Barr (2013) and the recent set of Bates preprints on parsimonious mixed models, as well as a number of threads on this mailing list. Generally, I start off with main effects and if that model converges, great; if not, then I reduce more. In my experience with EEG studies on language, interactions in the random-effects structure just lead to overly complex models that take a long time to compute, fail to converge or show other signs of being degenerate. In other words, I would consider the following RE structure for your data:
(1 + typicality + frequency + semantic_category | subject) + (1 + education | item)
I left a lot out of the RE structure for Item because, assuming that each Item represents a single lemma / word, then it doesn't have different frequencies / categories / typicalities and so it doesn't make sense to consider a variable effect for something that is constant within the grouping unit. Similarly for education and subject.
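For concreteness, here is a minimal sketch of how such a model could be fitted with lmer on simulated data. Everything about the data is an assumption made for the example: the column names, the numbers of subjects and items, the +/-0.5 numeric coding of typicality and education, and all effect sizes. To keep the example small, semantic_category is left out of both the fixed and the random part, so the item term absorbs any category variance.

```r
library(lme4)

set.seed(42)
n_subj <- 24  # hypothetical number of subjects
n_item <- 40  # hypothetical number of items

# Fully crossed design: every subject sees every item
d <- expand.grid(subject = factor(seq_len(n_subj)),
                 item    = factor(seq_len(n_item)))
s <- as.integer(d$subject)
i <- as.integer(d$item)

# Education varies between subjects; typicality and frequency between items
d$education  <- ifelse(s <= n_subj / 2, -0.5, 0.5)
d$typicality <- ifelse(i %% 2 == 0, 0.5, -0.5)
d$frequency  <- as.numeric(scale(rnorm(n_item)))[i]

# Made-up subject- and item-level deviations
subj_int <- rnorm(n_subj, sd = 1.0)
subj_typ <- rnorm(n_subj, sd = 0.5)
item_int <- rnorm(n_item, sd = 0.8)

d$mean_voltage <- 1 +
  (-1.5 + subj_typ[s]) * d$typicality +
  0.3 * d$frequency +
  0.5 * d$typicality * d$education +
  subj_int[s] + item_int[i] +
  rnorm(nrow(d), sd = 2)

# Slopes only for predictors that vary within the grouping unit
fit <- lmer(mean_voltage ~ 1 + typicality * education * frequency +
              (1 + typicality + frequency | subject) +
              (1 + education | item),
            data = d, REML = FALSE)

print(summary(fit)$coefficients)
print(VarCorr(fit))
```

Because the simulated data contain no true by-subject frequency slope or by-item education slope, lmer may report a singular fit here; that is the kind of signal that would prompt reducing the random-effect structure further.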
If you don't model semantic category explicitly, then your item random effect should absorb the variance due to it. You just won't have an explicit term in the model to point to that only describes the effect of semantic category (as item-level variance will cover a whole host of other effects related to the differences between words).
(For posterity -- I think we discussed some of these issues previously on r-help: https://stat.ethz.ch/pipermail/r-help/2015-September/432561.html )
To address some of your explicit questions more directly:
- Am I allowed to use the same complex random structure to compare the
likelihood of models that have "simpler" fixed effects? In principle I
guess it is correct to have the same random structure across comparisons.
Not quite. You should not have random slopes for effects not in the fixed-effect model structure because the mixed-model formulation used by lme4 assumes a zero mean for the random effects. In other words, lme4 random effects are estimates of how much the different grouping factors lead to variance around the population-level estimates delivered by the fixed effects.
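One way to respect this constraint while still comparing nested models is sketched below: both models keep the main effects of typicality and education in the fixed part, so every random slope has a population-level counterpart, and only the interaction is tested. The data are simulated, and all names, sample sizes, codings and effect sizes are assumptions made for the example.

```r
library(lme4)

set.seed(11)
n_subj <- 24  # hypothetical
n_item <- 40  # hypothetical

d <- expand.grid(subject = factor(seq_len(n_subj)),
                 item    = factor(seq_len(n_item)))
s <- as.integer(d$subject)
i <- as.integer(d$item)

d$education  <- ifelse(s <= n_subj / 2, -0.5, 0.5)
d$typicality <- ifelse(i %% 2 == 0, 0.5, -0.5)
d$frequency  <- as.numeric(scale(rnorm(n_item)))[i]

# Made-up subject- and item-level deviations, including the modelled slopes
subj_int <- rnorm(n_subj, sd = 1.0)
subj_typ <- rnorm(n_subj, sd = 0.5)
item_int <- rnorm(n_item, sd = 0.8)
item_edu <- rnorm(n_item, sd = 0.5)

d$mean_voltage <- 1 +
  (-1.5 + subj_typ[s]) * d$typicality +
  0.3 * d$frequency +
  (0.2 + item_edu[i]) * d$education +
  0.8 * d$typicality * d$education +
  subj_int[s] + item_int[i] +
  rnorm(nrow(d), sd = 2)

# Additive model: each random slope also appears as a fixed effect
m_add <- lmer(mean_voltage ~ 1 + typicality + frequency + education +
                (1 + typicality | subject) + (1 + education | item),
              data = d, REML = FALSE)

# Same random structure, plus the interaction of interest
m_int <- lmer(mean_voltage ~ 1 + typicality + frequency + education +
                typicality:education +
                (1 + typicality | subject) + (1 + education | item),
              data = d, REML = FALSE)

# Likelihood-ratio test; both models are fitted with ML, as needed when
# the models differ in their fixed effects
cmp <- anova(m_add, m_int)
print(cmp)
```

If the simpler model dropped education from the fixed effects entirely, the by-item education slope would have to be dropped as well, and the comparison would then no longer isolate a single term.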
- I am not interested in the effect of serial presentation (trial
order), as it increases the order of the highest interaction. Is it
appropriate to use it in the random structure only, or should I always
discuss it in interaction with my factors of interest?
No, for the reason above. But you could include the order of serial presentation as a non-interacting, main-effect-only fixed effect. Also, if you did the usual thing and counterbalanced presentation order (e.g. via several different pseudo-random presentation orders/lists) across participants, then the usual assumption is that any effects of presentation order cancel out across participants. The item grouping factor will also absorb some of this variance.
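A minimal sketch of the main-effect-only option, again on simulated data: trial order enters the fixed part additively, with no interaction and no random slope. The column names (trial, trial_c), the per-subject random orders and all effect sizes are hypothetical.

```r
library(lme4)

set.seed(7)
n_subj <- 20  # hypothetical
n_item <- 30  # hypothetical

d <- expand.grid(subject = factor(seq_len(n_subj)),
                 item    = factor(seq_len(n_item)))
s <- as.integer(d$subject)
i <- as.integer(d$item)

d$typicality <- ifelse(i %% 2 == 0, 0.5, -0.5)

# Each subject sees the items in their own random order
d$trial <- ave(seq_len(nrow(d)), d$subject,
               FUN = function(idx) sample(length(idx)))
# Centre and scale trial order so the intercept stays interpretable
d$trial_c <- as.numeric(scale(d$trial))

subj_int <- rnorm(n_subj, sd = 1.0)
subj_typ <- rnorm(n_subj, sd = 0.5)
item_int <- rnorm(n_item, sd = 0.8)

d$mean_voltage <- 1 +
  (-1.5 + subj_typ[s]) * d$typicality -
  0.4 * d$trial_c +
  subj_int[s] + item_int[i] +
  rnorm(nrow(d), sd = 2)

# Trial order as an additive covariate only
fit <- lmer(mean_voltage ~ 1 + typicality + trial_c +
              (1 + typicality | subject) + (1 | item),
            data = d)

print(fixef(fit))
```

The trial-order coefficient then soaks up drift over the session without raising the order of the highest interaction among the factors of interest.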
Best,
Phillip
On 8 Nov 2016, at 21:48, Paolo Canal <paolo.canal at iusspavia.it> wrote:
Dear Mixed-Group,
I have acquired my data from one Experiment using a rather common
paradigm in psycholinguistics. The experiment aimed at investigating the
electro-physiological correlates of reading Typical (e.g., /chair/) vs
Atypical (e.g., /foot rest/) members of a number (N=85) of semantic
categories (e.g., /a kind of //Furniture/). In particular, we were
interested in looking at differences associated with Education level
(University N=24 vs non-University students N=23), and three
individual predictors. My issue is how to deal with some factors that
are absolutely important in allowing for a better fit of the model, but
make interpretations too "complicated".
The two main factors of interest are thus Typicality (categorical, Typical
vs Atypical) and Education (categorical, Hi vs Low Education). I already
know that the choice of taking these factors as dichotomic is
questionable, but I believe, defensible: in fact, although the measure
of Typicality is actually continuous (a proportion varying from 0 to 1)
it is paired within each semantic category, because when we selected the
materials we took the pair of exemplars that showed the largest
difference in Typicality, so within each category is the difference in
typicality that actually matters. Treating Education as categorical is
less defensible, but in some way we wanted to compare the predictive
power of this variable with more continuous variables representing a set
of abilities (3 cognitive measures, one of which is moderated by years of
education and age), in some way to possibly show that some brain
mechanisms are better described when accounting for individual variation
rather than group differences.
I used lmer in lme4 to analyze the effect of my independent variables on
the average EEG voltage (continuous) from a set of EEG channels in two
different time-windows of interest (I know GAMM would be even more
appropriate than LMM, as what I am dealing with here are time-series,
but I am not yet ready to try).
I first determined the random effect structure, selecting three grouping
factors (subject, semantic category and channel) which are clusters of
repeated measures: for each item I have several subjects, for each
subject I have several items and for each channel I have several items
and subjects (perhaps channel might be nested in subject and item rather
than stand alone, any hints?). For each grouping factor, I allowed
intercepts to vary (e.g., 1|subject). Moreover, because I wanted to be
conservative and data are rather malleable (no convergence failure, no
variance = 0 or 1, not too high correlations between terms) I included a
set of terms to adjust by-subject and by-item slopes. I allowed
by-subject and by-item slope adjustments for Typicality (as it varies
within subjects and within semantic category) and by-item slope
adjustments for Education level.
Things get more complicated when thinking of the influence of two
variables that actually account for a lot of variation in the data:
frequency of use of words and trial order. The first variable is also
theoretically important and I want to include it as fixed effect; the
second variable increases the models' fit but, because it makes the results
less straightforward to interpret, I would prefer not to include it in the
fixed part of the model.
This brings me to the fixed effect structure and the actual questions to
the list:
The initial design was very simple (2X2 plus covariates). The strategy
was to fit the simple model Typicality + Frequency and evaluate if
adding the interaction between Education (or the three covariates) and
Typicality leads to a relevant increase in likelihood, always using
the same random structure (the complex one).
Now I am not so sure this is appropriate and I have a list of doubts:
- Am I allowed to use the same complex random structure to compare the
likelihood of models that have "simpler" fixed effects? In principle I
guess it is correct to have the same random structure across comparisons.
- I am not interested in the effect of serial presentation (trial
order), as it increases the order of the highest interaction. Is it
appropriate to use it in the random structure only, or should I always
discuss it in interaction with my factors of interest?
Thanks for any help
Paolo