Specifying the correct LMM for 'unusual' data
Hi Maarten,

I would not collapse the task and the kind of response (hit/miss) into one
condition predictor. They are conceptually independent: task is a manipulated
factor and response is a measured value (a covariate in this model). Also, one
of them can vary within pictures, the other cannot (see the model
specification below). So my suggestion would be to have these two predictors:

task: 2-level factor: PM, VS
response: 2-level predictor: hit, miss

Beware of how you specify the contrasts for (all of) the categorical
predictors. The default treatment contrasts are most likely not the most
straightforward way to interpret the model estimates.

Regarding your questions:

> 1. Am I correct with the maximal linear mixed model specifications?

With the changed predictors I think that this would be the maximal model.
Response can also vary within pictures, as each picture can be a hit or a miss.

lmer(dwell_time ~ age_group * task * response + (1 + task * response | participant) + (1 + response | picture), data)

> 2. I think that the data points in the PM-miss-condition (or
> PM-hit-condition) are not missing at random because they are missing if
> (and only if) there are 6 data points for the same participant in the
> PM-hit-condition (and vice versa). Do you think one has to worry about this
> and are there any suggestions how to deal with it?

Imbalanced data sets and even missing design cells are not a problem for mixed
models, as they take the number of observations into account (shrinkage).

Best,
Tom

---
Tom Fritzsche
University of Potsdam
Department of Linguistics
Karl-Liebknecht-Str. 24-25
14476 Potsdam
Germany
office: 14.140
phone: +49 331 977 2296
fax: +49 331 977 2095
e-mail: tom.fritzsche at uni-potsdam.de
web: www.ling.uni-potsdam.de/~fritzsche

On 25 January 2018 at 15:35, Maarten Jung
<Maarten.Jung at mailbox.tu-dresden.de> wrote:
Dear list,
a colleague of mine asked me to help her plan a linear mixed model
analysis and, as handling her data and the corresponding research questions
with lmer seems kind of tricky to me, I hope one of you can help me along.
+++++++++++++++++++++++++++++++++++++
The experiment is as follows:
Participants (46 younger and 45 older children) looked at a series of
pictures (one picture per trial) and had to solve two tasks consecutively:
- Task block 1: Prospective memory (PM) task: while doing other tasks,
participants had to remember to press a specified button when they saw a
certain object
- Task block 2: Visual search: participants had only this one task, namely
pressing a button as soon as possible when seeing a certain object
Each child saw the same pictures in the same task block: pictures 1-6 in
task block 1 and pictures 7-18 in task block 2. Each picture was presented
only once, so there were different pictures in the task blocks.
Trials with a target object in task 1 are classified according to the
participant's reaction as PM hits (the participant pressed the button) or
PM misses (the participant did not press the button). (Therefore, a certain
picture can be a PM hit trial for one child and a PM miss trial for
another.) As there were six trials (= pictures) that contained the target
object, each participant can have a minimum of zero and a maximum of six PM
hits, with the corresponding number of PM misses.
Here is the number of PM hits per age group:
Younger children:
- 2 children: 0 hits
- 9 children: 1 hit
- 8 children: 2 hits
- 12 children: 3 hits
- 4 children: 4 hits
- 4 children: 5 hits
- 7 children: 6 hits
Older children:
- 2 children: 0 hits
- 3 children: 1 hit
- 4 children: 2 hits
- 6 children: 3 hits
- 7 children: 4 hits
- 11 children: 5 hits
- 12 children: 6 hits
(In the visual search task almost all children pressed the button
correctly in all 12 visual search target trials.)
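
The counts above can be tallied to see how many PM trials fall into each cell and how many children contribute no data at all to one of the PM cells. The following is a minimal sketch in base R that uses only the numbers listed above; the object and function names are made up for illustration.

```r
# Number of children with 0..6 PM hits, copied from the lists above
hits    <- 0:6
younger <- c(2, 9, 8, 12, 4, 4, 7)
older   <- c(2, 3, 4, 6, 7, 11, 12)

# Hypothetical helper: totals per age group, given six PM target trials per child
summarise_group <- function(n_children, hits, n_trials = 6) {
  c(children     = sum(n_children),
    pm_hits      = sum(n_children * hits),
    pm_misses    = sum(n_children * (n_trials - hits)),
    no_hit_data  = n_children[hits == 0],         # children with misses only
    no_miss_data = n_children[hits == n_trials])  # children with hits only
}

tab <- rbind(younger = summarise_group(younger, hits),
             older   = summarise_group(older, hits))
print(tab)
```

With these numbers, 7 younger and 12 older children have no PM-miss trials, and 2 children per age group have no PM-hit trials.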
She is interested in how long participants looked at the PM and visual
search target, respectively, depending on whether it was a PM hit, a PM miss
or a visual search hit, and how this is influenced by the age group. There is
therefore only one data point per trial, and if a participant has no PM
misses there is no data point at all in this condition for this participant.
The variables are defined as follows:
- age_group: categorical predictor with 2 levels (younger and older
children)
- condition: categorical predictor with 3 levels (PM hit, PM miss, visual
search hit)
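
Following the suggestion in the reply at the top of this thread, the three-level condition variable could be split into separate task and response factors, with sum-to-zero contrasts in place of R's default treatment contrasts. This is only a sketch on toy data: the level labels ("PM hit", "PM miss", "VS hit") and the data frame are assumptions, not the real data set.

```r
# Toy data: one row per trial; level labels are assumed for illustration
dat <- data.frame(
  condition = factor(c("PM hit", "PM miss", "VS hit", "PM hit", "VS hit", "PM miss"))
)

# Split the combined predictor into the manipulated factor and the measured response
dat$task <- factor(ifelse(dat$condition == "VS hit", "VS", "PM"),
                   levels = c("PM", "VS"))
dat$response <- factor(ifelse(dat$condition == "PM miss", "miss", "hit"),
                       levels = c("hit", "miss"))

# Sum-to-zero coding (+1 / -1) instead of the default treatment coding (0 / 1)
contrasts(dat$task)     <- contr.sum(2)
contrasts(dat$response) <- contr.sum(2)

print(table(dat$task, dat$response))
print(contrasts(dat$task))
```

With this coding, each main effect is estimated at the average of the other factors' levels rather than at their reference levels, and a coefficient corresponds to half the difference between the two levels.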
+++++++++++++++++++++++++++++++++++++
My suggestion for the maximal linear mixed model would be:
lmer(dwell_time ~ age_group*condition + (1 + condition|participant) +
(1|picture), data)
I intentionally didn't use (1 + condition|picture) here because there are
different pictures in the task blocks (see above) - hope this makes sense.
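
To make the structure of this specification concrete, here is a hedged sketch that fits the model to simulated data with the same layout (91 children, pictures 1-6 in the PM block, pictures 7-18 in the visual search block). Everything about the simulated values is an assumption: the hit probability, the dwell times and the level labels are arbitrary, and the fit is skipped if the lme4 package is not installed.

```r
set.seed(1)

# Simulated stand-in for the real data: every child sees every picture once
dat <- expand.grid(participant = factor(1:91), picture = factor(1:18))
dat$age_group <- factor(ifelse(as.integer(dat$participant) <= 46, "younger", "older"))

is_pm  <- as.integer(dat$picture) <= 6        # pictures 1-6 belong to the PM block
pm_hit <- rbinom(nrow(dat), 1, 0.6) == 1      # arbitrary hit probability
dat$condition <- factor(ifelse(!is_pm, "VS hit",
                               ifelse(pm_hit, "PM hit", "PM miss")))
dat$dwell_time <- rnorm(nrow(dat), mean = 500, sd = 100)  # pure noise, arbitrary units

# Pictures 1-6 only occur as PM hit / PM miss, pictures 7-18 only as VS hit
print(table(dat$picture, dat$condition))

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(
    dwell_time ~ age_group * condition + (1 + condition | participant) + (1 | picture),
    data = dat
  )
  print(summary(fit))
}
```

Because the simulated dwell times are pure noise, a singular-fit message is to be expected; the sketch only illustrates how the fixed and random terms map onto the design, not what the real data would show.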
I have two questions:
1. Am I correct with the maximal linear mixed model specifications?
2. I think that the data points in the PM-miss-condition (or
PM-hit-condition) are not missing at random because they are missing if
(and only if) there are 6 data points for the same participant in the
PM-hit-condition (and vice versa). Do you think one has to worry about this
and are there any suggestions how to deal with it?
Best,
Maarten
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models