Greetings,
Is it possible/appropriate to use lme4::lmer() to compare the effect of an
independent variable across two designs: within-subjects and
between-subjects?
Data below from Erlebacher (1977), used to illustrate his methodology:
###
dataset <- NULL
dataset$A <- c(rep.int(1, 20), rep.int(2, 20), rep.int(1, 20), rep.int(2,
20))
#A is the independent variable.
#1, 2 represent the two levels of the IV
dataset$D <- c(rep.int(1, 40), rep.int(2, 40))
#D is the design factor.
#1 represents a within-ss measurement; 2 a between-ss measurement.
dataset$S <- c(60, 73, 93, 10, 90, 80, 83, 37, 83, 70, 77, 7, 100, 70, 100,
43, 43, 83, 40, 73, 36, 53, 66, 0, 73, 43, 20, 10, 26, 40, 60, 3, 53, 26,
63, 6, 3, 30, 7, 10, 53, 77, 2, 38, 68, 92, 3, 15, 67, 53, 58, 20, 17, 40,
85, 60, 25, 3, 82, 67, 62, 0, 57, 42, 3, 55, 22, 28, 45, 47, 52, 75, 38,
45, 65, 50, 2, 0, 10, 60)
dataset <- as.data.frame(dataset)
###
Erlebacher's analysis on these data can be computed using code developed by
Merritt, Cook, and Wang (2014):
https://www.researchgate.net/publication/264158186_Erlebacher's_Method_for_Contrasting_the_Within_and_Between-Subjects_Manipulation_of_the_Independent_Variable_using_R_and_SPSS
The output of Erlebacher's ANOVA for these data is:
Effect of A: F(1, 51) = 21.25
Effect of D: F(1, 42) = 0.89
Effect of A x D: F(1, 51) = 7.88
(df obtained via Satterthwaite's (1946) Method)
Some have suggested, instead of Erlebacher's ANOVA, a multilevel model with
the IV and the design as fixed effects and subject as a random effect. For
example, this Stack Exchange discussion:
https://stats.stackexchange.com/questions/414995/statistically-testing-the-impact-of-a-within-subject-vs-between-subject-design
While the following gives similar results, I am unable to determine if this
is the correct approach:
###
dataset$A <- as.factor(dataset$A)
dataset$D <- as.factor(dataset$D)
dataset$subject <- c(rep(1:20, times = 2), 21:60)
library(lme4)
library(lmerTest)
anova(lmer(S ~ A * D + (1 | subject),
           data = dataset,
           contrasts = list(A = "contr.sum", D = "contr.sum")))
###
Which outputs:
-------
Type III Analysis of Variance Table with Satterthwaite's method
     Sum Sq Mean Sq NumDF  DenDF F value     Pr(>F)
A   3097.70 3097.70     1 75.004 21.7904 0.00001304 ***
D    123.74  123.74     1 53.030  0.8704   0.355067
A:D 1148.50 1148.50     1 75.004  8.0790   0.005765 **
-------
So far, I have been unable to show theoretically that Erlebacher's model
can be rewritten as a multilevel model like the one above, which adds to my
uncertainty about whether an MLM approach is appropriate for this type of
data.
Thank you in advance for any advice.
Michelle Ashburner, MMATH, MA (Psych), BEd
For the fixed effects, lme4 doesn't care whether factors are between-,
within- or mixed. Older software cared a lot about this because it made
it possible to make various simplifying assumptions and thus speed up
computations, but that isn't necessary with modern approaches.
I'm not sure I understand what D is representing. If you have multiple
measurements from a single subject, then that should present itself
simply as multiple rows in the dataframe. Likewise, if you don't have
multiple measurements, then that will also be obvious from the data. If
it's simply a matter of which "original" experiment the data came from,
then you can include that as a factor in the analysis, but I would
expect that effect to be null (unless of course there is some
domain-specific reason why a within-subjects manipulation would yield
different results than a between-subjects manipulation). Within-subjects
designs generally provide better estimates, so I wouldn't be surprised
if the interaction effect is present but small (look at the model
coefficients, not the ANOVA table, for this).
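
To make the point about rows concrete, here is a minimal sketch that rebuilds only the design columns from the original post (A, D and subject, exactly as coded there) and counts how many rows each subject contributes. The names `design`, `rows_per_subject` and `subject_design` are illustrative and not part of the original code.

```r
# Rebuild only the design columns from the original post (no outcome needed).
design <- data.frame(
  A       = factor(rep(c(1, 2, 1, 2), each = 20)),
  D       = factor(rep(c(1, 2), each = 40)),
  subject = c(rep(1:20, times = 2), 21:60)
)

# Number of rows (measurements) contributed by each subject.
rows_per_subject <- table(design$subject)

# One row per subject, with its design label and its row count.
subject_design <- unique(design[, c("subject", "D")])
subject_design$n_rows <-
  as.integer(rows_per_subject[as.character(subject_design$subject)])

# Cross-tabulate the design label against the number of rows per subject.
print(table(D = subject_design$D, n_rows = subject_design$n_rows))
```

The 20 subjects labelled D = 1 each have two rows (one per level of A) and the 40 subjects labelled D = 2 each have one row, so in these data D carries the same information as "how many measurements does this subject have".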
Regarding the random effects: you could actually fit a by-subjects slope
for A (i.e., (1 + A | subject)). This may seem strange at first because "A"
would not seem to be directly estimable for subjects who only saw one
level of A. But that's where the magic of mixed models kicks in: in such
cases, the model can use the "estimates" (technically "predictions")
from the other subjects as well as the population level estimate to fill
in the gaps. The reason why this works is that uncertain estimates are
*shrunk* towards the population level estimate. John Kruschke has an
example of this shrinkage with figures here:
https://doingbayesiandataanalysis.blogspot.com/2019/07/shrinkage-in-hierarchical-models-random.html
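
A minimal sketch of the random-slope model, using the data and subject coding from the original post. One caveat, stated as an assumption about these particular data: no subject has more than two observations, so the number of random effects (60 subjects x 2) exceeds the number of observations (80), and lme4's default `check.nobs.vs.nRE` check stops with an error. The sketch relaxes that check via `lmerControl()` purely to show the mechanics. The variance components are then probably not separately identifiable, so treat the output as illustrative rather than as a recommended analysis. The names `fit_slope` and `re` are illustrative.

```r
library(lme4)

# Outcome values from the original post (Erlebacher, 1977).
S <- c(60, 73, 93, 10, 90, 80, 83, 37, 83, 70, 77, 7, 100, 70, 100,
       43, 43, 83, 40, 73, 36, 53, 66, 0, 73, 43, 20, 10, 26, 40, 60, 3, 53, 26,
       63, 6, 3, 30, 7, 10, 53, 77, 2, 38, 68, 92, 3, 15, 67, 53, 58, 20, 17, 40,
       85, 60, 25, 3, 82, 67, 62, 0, 57, 42, 3, 55, 22, 28, 45, 47, 52, 75, 38,
       45, 65, 50, 2, 0, 10, 60)

dataset <- data.frame(
  A       = factor(rep(c(1, 2, 1, 2), each = 20)),
  D       = factor(rep(c(1, 2), each = 40)),
  S       = S,
  subject = factor(c(rep(1:20, times = 2), 21:60))
)

# Assumption: the default nobs-vs-random-effects check is relaxed here,
# because 120 random effects > 80 observations would otherwise stop the fit.
fit_slope <- lmer(
  S ~ A * D + (1 + A | subject),
  data    = dataset,
  control = lmerControl(check.nobs.vs.nRE = "ignore")
)

# Population-level (fixed-effect) estimates.
print(fixef(fit_slope))

# Per-subject predictions: one intercept and one slope for every subject,
# including subjects who only saw one level of A.
re <- ranef(fit_slope)$subject
print(head(re))
```

Every subject gets a predicted slope, including the between-subjects participants, which is the "filling in the gaps" behaviour described above. Expect singular-fit or convergence messages with a model this close to unidentifiable.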
Best,
Phillip
On 20/09/2020 19:22, Michelle Ashburner wrote:
[...]