Related fixed and random factors and planned comparisons in a 2x2 design

6 messages · Houslay, Tom, Phillip Alday, Paul Bivand

#
Hi Paul,

I don't think anyone's responded to this yet, but my main point would be that you should check out Schielzeth & Nakagawa's 2012 paper 'Nested by design' ( http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210x.2012.00251.x/abstract ) for a nice rundown on structuring your model for this type of data. 

It may also be worth thinking about how random intercepts work in a visual sense; there are a variety of tools that help you do this from a model (packages sjPlot, visreg, broom), or you can just plot different levels yourself (e.g., consider plotting the means for AP, AQ, BP, BQ; the same with mean values from each individual overplotted around these group means; and even the group means with all points shown, perhaps coloured by individual - ggplot is really useful for getting this type of figure together quickly).

As to some of your other questions:

1) You need to keep participant ID in. I'm not 100% sure of your data structure from the question, but you certainly seem to have repeated measures for individuals (I'm assuming that groups A and B each contain multiple individuals, none of whom were in both groups, and each of whom was shown both objects P and Q, in a random order). It's not surprising that the effects of group are weakened if you remove participant ID, because you're then effectively introducing pseudoreplication into your model (i.e., telling your model that all the data points within a group are independent, when that isn't the case).

2) I think channel should be nested within individual, with a model something like model <- lmer(voltage ~ group * item + (1|participant/channel), data = ...)

3) This really depends on what your interest is. If you simply want to show that there is an overall interaction effect, then the p-value from a likelihood ratio test of the model with/without the interaction term gives the significance of this interaction, and then a plot of predicted values for the fixed effects (w/ data overplotted if possible) should show the trends. You could also use binary dummy variables to make more explicit contrasts, but it's worth reading up on these a bit more. I don't really use this type of comparison very much, so I can't comment further I'm afraid.

4) Your item is like treatment in this case - you appear to be more interested in the effect of different items (rather than how much variation 'item' explains), so keep this as a fixed effect and not as random.
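
The likelihood ratio test mentioned in point 3 could be sketched roughly as below. This is only an illustration on simulated stand-in data: the data frame d, the sample sizes and the effect sizes are all made up, and the random-effects structure is the crossed one, (1|participant) + (1|channel), rather than the nested one in point 2.

```r
library(lme4)

# Simulated stand-in data (hypothetical): 20 participants (10 per group),
# 9 channels, 2 items, one voltage per participant x channel x item.
set.seed(1)
d <- expand.grid(participant = factor(1:20), channel = factor(1:9),
                 item = factor(c("P", "Q")))
d$group <- factor(ifelse(as.integer(d$participant) <= 10, "A", "B"))
u_p <- rnorm(20, sd = 1)    # participant intercept deviations
u_c <- rnorm(9, sd = 0.5)   # channel intercept deviations
d$voltage <- u_p[as.integer(d$participant)] + u_c[as.integer(d$channel)] +
  1 * (d$group == "B" & d$item == "Q") + rnorm(nrow(d), sd = 1)

# Fit with ML (REML = FALSE) so that models differing in fixed effects
# can be compared by a likelihood ratio test.
full    <- lmer(voltage ~ group * item + (1 | participant) + (1 | channel),
                data = d, REML = FALSE)
reduced <- lmer(voltage ~ group + item + (1 | participant) + (1 | channel),
                data = d, REML = FALSE)

# Likelihood ratio test for the group:item interaction.
lrt <- anova(reduced, full)
print(lrt)
```

The row for the full model carries the chi-square statistic and p-value for dropping the interaction.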

Hope some of this is useful,

Tom


________________________________________


Message: 1
Date: Fri, 3 Jun 2016 14:28:59 +0200
From: paul <graftedlife at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Related fixed and random factors and planned
        comparisons     in a 2x2 design
Message-ID:
        <CALS4JYfoTbhwhy8S0kHePuw9pPv-NTkrsLrB2Z2YO5ks5gnnOA at mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

Dear All,

I am trying to use mixed-effect modeling to analyze brain wave data from
two groups of participants when they were presented with two distinct
stimuli. The data points (scalp voltage) were gathered from the same set
of 9 nearby channels from each participant. And so I have the following
factors:

   - voltage: the dependent variable
   - group: the between-participant/within-item variable for groups A and B
   - item: the within-participant variable (note there are exactly only 2
   items, P and Q)
   - participant: identifying each participant across the two groups
   - channel: identifying each channel (note that data from these channels
   in a nearby region tend to display similar, thus correlated, patterns in
   the same participant)

The hypothesis is that only group B will show a difference between P and Q
(i.e., there should be an interaction effect). So I established a
mixed-effect model using the lme4 package in R:

model <- lmer(voltage~1+group+item+(group:item)+(1|participant)+(1|channel),
              data=data, REML=FALSE)

Questions:

   1.

   I'm not sure if it is reasonable to add in participant as a random
   effect, because it is related to group and seems to weaken the effects of
   group. Would it be all right if I don't add it in?
   2.

   Because the data from nearby channels of the same participant tend to be
   correlated, I'm not sure if modeling participant and channel as crossed
   random effects is all right. But meanwhile it seems also strange if I treat
   channel as nested within participant, because they are the same set of
   channels across participants.
   3.

   The interaction term is significant. But how should planned comparisons
   be done (e.g., differences between groups A and B for P) or is it even
   necessary to run planned comparisons? I saw suggestions for t-tests,
   lsmeans, glht, or for more complicated methods such as breaking down the
   model and subsetting the data:

   # data is a data.table; flag the rows for item P, then fit on that subset
   data[, P_True := (item == "P")]
   posthoc <- lmer(voltage ~ 1 + group
       + (1|participant) + (1|channel)
       , data = data
       , subset = P_True
       , REML = FALSE)

   But especially here comparing only between two groups while modeling
   participant as a random effect seems detrimental to the group effects. And
   I'm not sure if it is really OK to do so. On the other hand, because the
   data still contain non-independent data points (from nearby channels), I'm
   not sure if simply using t-tests is all right. Will non-parametric tests
   (e.g., Wilcoxon tests) do in such cases?
   4.

   I suppose I don't need to model item as a random effect because there
   are only two of them, one for each level, right?

I would really appreciate your help!!

Best regards,

Paul
#
Dear Tom,

Many thanks for these very helpful comments and suggestions! Would you just
allow me to ask some further questions:

1. I've been considering whether to cross or to nest the random effects for
quite a while. Data from the same channel across participants do show
corresponding trends (thus a bit different from the case when, e.g.,
sampling nine neurons from the same individual). Would nesting channel
within participant deal with that relationship?

2. I actually also tried nesting channel within participant. However, when
I proceeded to run planned comparisons (I guess I'd better have them done
because of their theoretical significance) based on this mixed-effect
modeling approach (as illustrated in the earlier mail but with the random
factor as (1|participant/channel), to maintain consistency of analytical
methods), R gave me an error message:

Error: number of levels of each grouping factor must be < number of observations


I think this is because in my data, each participant only contributes one
data point per channel and thus the data points are not enough. I guess
that probably means I can't go on in this direction to run the planned
comparisons... (?) I'm not quite sure how contrasts based on binary dummy
variables may be done and will try to further explore that. But before I
establish the mixed model I already set up orthogonal contrasts for group
and item in the dataset using the function contrasts(). Does this have
anything to do with what you meant?

3. I worried about pseudoreplication when participant ID is not included.
Concerning this point, it later occurred to me that pseudoreplication usually
occurs in cases when multiple responses from the same individual are
grouped in the same cell, rendering the data within the same cell
non-independent (similar to the case of repeated-measures ANOVA? sorry if I
have misunderstood...). But as mentioned earlier, in my data each
participant only contributes one data point per channel, when channel alone
is already modeled as a random factor, would that mean all data points
within a cell all come from different participants and thus in this case
may deal with the independence assumption? (Again I'm sorry if my concept
is wrong and would appreciate instructions on this point...)
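
The error message quoted in point 2 can be reproduced in miniature by counting levels. The sketch below uses a made-up balanced design (20 participants, 9 channels, 2 items; all hypothetical) and base R only. It shows that, once the data are restricted to one item, the nested term participant:channel has as many levels as there are observations, while the crossed terms do not.

```r
# Hypothetical balanced design: one observation per participant x channel x item.
d <- expand.grid(participant = factor(1:20), channel = factor(1:9),
                 item = factor(c("P", "Q")))

# Keep only item P, as in the subsetted planned-comparison model.
p_only <- droplevels(subset(d, item == "P"))

n_obs         <- nrow(p_only)
n_nested      <- nlevels(interaction(p_only$participant, p_only$channel,
                                     drop = TRUE))
n_participant <- nlevels(p_only$participant)
n_channel     <- nlevels(p_only$channel)

# (1|participant/channel) expands to (1|participant) + (1|participant:channel);
# the second term needs fewer levels than observations, which fails here.
print(c(n_obs = n_obs, n_nested = n_nested,
        n_participant = n_participant, n_channel = n_channel))
```

With this layout the nested grouping factor has 180 levels for 180 observations, whereas participant (20) and channel (9) taken separately stay well below the number of observations.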

Many, many thanks!

Paul

2016-06-06 19:10 GMT+02:00 Houslay, Tom <T.Houslay at exeter.ac.uk>:

  
  
#
Hi Paul,


I think you're right here in that actually you don't want to nest channel inside participant (which led to that error message - sorry, should have seen that coming!).


It's hard to know without seeing data plotted, but my guess from your email is that you probably see some clustering both at individual level and at channel level? Perhaps separate random effects, i.e. (1|Participant) + (1|Channel), is the way to go (and then you shouldn't have the problem as regards number of observations - instead you'll have an intercept deviation for each of your N individuals, and also intercept deviations for each of your 9 channels). You certainly want to keep the participant intercept in though, as each individual gets both items (right?), so you need to model that association. You can use your variance components output from lmer to determine what proportion of the phenotypic variance (conditional on your fixed effects) is explained by each of these components, e.g. V(individual)/(V(individual) + V(channel) + V(residual)) would give you the proportion explained by differences among individuals in their voltage. It would be cool to know whether differences among individuals, or among channels, are driving the variation that you find. I think using the sjPlot function for lmer would be useful to look at the levels of your random effects:


http://strengejacke.de/sjPlot/sjp.lmer/
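
The variance proportions described above could be pulled out of a fitted model along these lines. This is a sketch on simulated data (the data frame d and all effect sizes are invented), assuming the crossed random-intercept model with no random slopes.

```r
library(lme4)

# Simulated stand-in data (hypothetical): 20 participants, 9 channels, 2 items.
set.seed(2)
d <- expand.grid(participant = factor(1:20), channel = factor(1:9),
                 item = factor(c("P", "Q")))
d$group <- factor(ifelse(as.integer(d$participant) <= 10, "A", "B"))
u_p <- rnorm(20, sd = 1)    # participant intercept deviations
u_c <- rnorm(9, sd = 0.5)   # channel intercept deviations
d$voltage <- u_p[as.integer(d$participant)] + u_c[as.integer(d$channel)] +
  1 * (d$group == "B" & d$item == "Q") + rnorm(nrow(d), sd = 1)

m <- lmer(voltage ~ group * item + (1 | participant) + (1 | channel), data = d)

# One row per variance component: participant, channel, Residual.
vc <- as.data.frame(VarCorr(m))
v  <- setNames(vc$vcov, vc$grp)

# Proportion of the variance (conditional on the fixed effects)
# attributed to each component.
prop <- v / sum(v)
print(round(prop, 3))
```

Here prop["participant"] corresponds to V(individual)/(V(individual) + V(channel) + V(residual)).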


As for 'contrasts', again I haven't used that particular package, but from a brief glance it looks like you're on the right track - binary coding is the 'simple coding' as set out here:


http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
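
For a two-level factor the common coding schemes can be inspected directly with the base R function contrasts(). The sketch below uses a toy factor; the column name "Q_vs_P" is just an illustrative label.

```r
item <- factor(c("P", "Q", "P", "Q"))

# Default treatment (dummy) coding: P = 0, Q = 1.
dummy <- contrasts(item)

# Sum-to-zero coding: P = +1, Q = -1.
item_sum <- item
contrasts(item_sum) <- contr.sum(2)
sum_coding <- contrasts(item_sum)

# Centred +/-0.5 coding set by hand, so the coefficient is the Q - P difference.
item_half <- item
contrasts(item_half) <- matrix(c(-0.5, 0.5), ncol = 1,
                               dimnames = list(levels(item), "Q_vs_P"))
half_coding <- contrasts(item_half)

print(dummy)
print(sum_coding)
print(half_coding)
```

With treatment coding, the main effect of the other factor in an interaction model is evaluated at the reference level; with the centred codings it is evaluated at the average of the two levels.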


Good luck!


Tom
#
Dear Tom,

Thank you so much for these detailed replies and I appreciate your help!

Sincerely,

Paul

2016-06-06 21:51 GMT+02:00 Houslay, Tom <T.Houslay at exeter.ac.uk>:

  
  
#
In terms of contrast coding, two more helpful resources are:

http://talklab.psy.gla.ac.uk/tvw/catpred/

http://palday.bitbucket.org/stats/coding.html

Channel makes sense as a random effect / grouping term for your particular design, *not* nested within participant. The implicit crossing given by (1|Participant) + (1|Channel) models [omitting any slope terms to focus on the grouping variables] (1) interindividual differences in the EEG and (2) differences between electrodes because closely located electrodes can be thought of as samples from a population consisting of a given Region of Interest (ROI), especially if the electrode placement is somewhat symmetric. The differences resulting from variance in electrode placement between participants will be covered by the implicit crossing of these two random effects. 

Note that using channel as a random effect is somewhat more difficult if you're doing a whole scalp analysis, as sampling across the whole scalp can be viewed as sampling from multiple ROIs, i.e. multiple populations. Two possible solutions are (1) to include ROI in the fixed effects and keep channel in the random effects and (2) to model channel as two or three continuous spatial variables (e.g. displacement from midline or displacement from center based on 10-20 coordinates, or spatial coordinates of the sort used in source localisation) in the fixed effects. In the case of (1), the channel random effect would then be modelling the typical variance within ROIs (because that's hopefully the major source of variance structured by channel left over after modelling ROI and your experimental manipulation). If this within-ROI variance differs greatly between ROIs, then this may be a sub-optimal modelling choice. In the case of (2), it might still make sense to additionally model channel as a random effect (i.e. the RE with the factor consisting of channel names, the FE with the continuous coordinates), see Thierry Onkelinx's posts on the subject and http://rpubs.com/INBOstats/both_fixed_random , but I haven't thought about this enough nor examined the resulting model fits.
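
The two options might look roughly as follows in lme4. Everything in this sketch is invented for illustration: the 27 channels, the three ROI labels, the coordinate columns x and y, and the simulated voltages. It is not a fitted analysis of real EEG data.

```r
library(lme4)

# Hypothetical channel layout: 27 channels in 3 ROIs, each with 2-D coordinates.
set.seed(3)
chan <- data.frame(channel = factor(1:27),
                   roi = factor(rep(c("frontal", "central", "parietal"), each = 9)),
                   x = runif(27, -1, 1),
                   y = runif(27, -1, 1))

d <- expand.grid(participant = factor(1:20), channel = chan$channel,
                 item = factor(c("P", "Q")))
d <- merge(d, chan, by = "channel")
d$group <- factor(ifelse(as.integer(d$participant) <= 10, "A", "B"))
u_p <- rnorm(20, sd = 1)    # participant intercept deviations
u_c <- rnorm(27, sd = 0.5)  # channel intercept deviations
d$voltage <- u_p[as.integer(d$participant)] + u_c[as.integer(d$channel)] +
  rnorm(nrow(d), sd = 1)

# Option (1): ROI as a fixed effect, channel kept as a random intercept.
m_roi <- lmer(voltage ~ group * item + roi + (1 | participant) + (1 | channel),
              data = d)

# Option (2): continuous coordinates as fixed effects, channel still random.
m_xy <- lmer(voltage ~ group * item + x + y + (1 | participant) + (1 | channel),
             data = d)

print(fixef(m_roi))
print(fixef(m_xy))
```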

Best,
Phillip

#
Dear Phillip,

Many thanks for these resources and replies. They are indeed very helpful.
I suppose after I've done the contrast coding, I still have to subset the
data (e.g., singling out the data for P) to do planned comparisons between
groups, using a reduced mixed model as illustrated earlier, then? Or are
there any alternative ways to do so?
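
One possible alternative to subsetting, sketched here on simulated data with made-up effect sizes, is to keep all the data and reparametrise the fixed effects so that the group difference is estimated separately within each item. The formula item + item:group is an assumption about what the comparison of interest is; swapping the roles (group + group:item) would instead give the P-versus-Q difference within each group.

```r
library(lme4)

# Simulated stand-in data (hypothetical): 20 participants, 9 channels, 2 items.
set.seed(4)
d <- expand.grid(participant = factor(1:20), channel = factor(1:9),
                 item = factor(c("P", "Q")))
d$group <- factor(ifelse(as.integer(d$participant) <= 10, "A", "B"))
u_p <- rnorm(20, sd = 1)    # participant intercept deviations
u_c <- rnorm(9, sd = 0.5)   # channel intercept deviations
d$voltage <- u_p[as.integer(d$participant)] + u_c[as.integer(d$channel)] +
  1 * (d$group == "B" & d$item == "Q") + rnorm(nrow(d), sd = 1)

# Usual interaction parametrisation.
m_int <- lmer(voltage ~ group * item + (1 | participant) + (1 | channel),
              data = d, REML = FALSE)

# Same model, reparametrised: one B-minus-A coefficient per item.
m_within <- lmer(voltage ~ item + item:group + (1 | participant) + (1 | channel),
                 data = d, REML = FALSE)

# Estimates, standard errors and Wald t values (lme4 itself reports no p-values).
print(coef(summary(m_within)))
```

Both fits describe the same model, so their log-likelihoods should agree; only the meaning of the fixed-effect coefficients changes.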

Best regards,

Paul

2016-06-07 2:57 GMT+02:00 Phillip Alday <Phillip.Alday at unisa.edu.au>: