Modeling truncated counts with glmer

4 messages · João C P Santiago, Thierry Onkelinx

#
Hi,

In my experiment 20 participants did a word-pairs learning task in two  
conditions (repeated measures):
40 pairs of nouns are presented on a monitor, each for 4s and with an  
interval of 1s. The words of each pair were moderately semantically  
related (e.g., brain, consciousness and solution, problem). Two  
different word lists were used for the subject's two experimental  
conditions, with the order of word lists balanced across subjects and  
conditions. The subject had unlimited time to recall the appropriate  
response word, and did three trials in succession for each list:

Condition 1, List A > T1, T2, T3
Condition 2, List B > T1, T2, T3

No feedback was given as to whether the remembered word was correct or not.

I've seen some people go at this with anova, others subtract the total  
number of correct pairs in one condition from the other per subject  
and run a t-test. Since this is count data, a generalized linear model  
should be more appropriate, right?

head(data)
  subjectNumber expDay      bmi treatment tones       hour abruf correctPair incorrectPair
          <dbl>  <chr>    <dbl>    <fctr> <dbl>     <time> <dbl>       <dbl>         <dbl>
1             1     N2 22.53086   Control     0 27900 secs     1          26            14
2             1     N2 22.53086   Control     0 27900 secs     2          40             0
3             1     N2 22.53086   Control     0 27900 secs     3          40             0
4             2     N1 22.53086   Control     0 27900 secs     1          22            18
5             2     N1 22.53086   Control     0 27900 secs     2          33             7
6             2     N1 22.53086   Control     0 27900 secs     3          36             4



I fitted a model with glmer.nb(correctPair ~ I((abruf - 1)^2) *  
treatment + (1|subjectNumber), data=data). The residuals do not look  
good to me (http://imgur.com/a/AJXGq), and the model produces fitted  
values above 40, which can never occur because each list has only 40  
pairs (I am not sure whether this matters).

I'm interested in knowing whether there is any difference between  
conditions: are the values at timepoint (abruf) 1 different, and do  
people remember less in one condition than in the other (a different  
number of pairs at timepoint 3)?


If the direction I'm taking is completely wrong please let me know.

Best,
Santiago
#
Dear João,

A binomial distribution seems more relevant to me.

glmer(cbind(correctPair, incorrectPair) ~ I((abruf - 1)^2) * treatment +
(1|subjectNumber), data=data, family = binomial)
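
For readers who want to try the call above without the original data, here is a minimal sketch on simulated data. The simulated values, the second treatment level name ("Stimulation"), and the helper names (sim, n_pairs, subject_effect, expected_correct) are assumptions made for illustration only; they are not taken from the poster's experiment. It assumes the lme4 package is installed.

```r
library(lme4)

set.seed(1)

# Simulated stand-in for the poster's data: 20 subjects, 2 treatments,
# 3 recall trials (abruf) per list, 40 word pairs per trial.
sim <- expand.grid(
  subjectNumber = factor(1:20),
  treatment     = factor(c("Control", "Stimulation")),  # second level name is hypothetical
  abruf         = 1:3
)
n_pairs <- 40

# Assumed data-generating values, chosen only for illustration.
subject_effect <- rnorm(20, sd = 0.5)
eta <- 0.3 + 0.6 * (sim$abruf - 1)^2 +
  subject_effect[as.integer(sim$subjectNumber)]
sim$correctPair   <- rbinom(nrow(sim), size = n_pairs, prob = plogis(eta))
sim$incorrectPair <- n_pairs - sim$correctPair

# Same model structure as in the reply: successes and failures as a
# two-column response, random intercept per subject.
fit <- glmer(
  cbind(correctPair, incorrectPair) ~ I((abruf - 1)^2) * treatment +
    (1 | subjectNumber),
  data = sim, family = binomial
)

# Fitted values are probabilities of a correct pair; multiplying by 40
# gives expected counts, which cannot exceed the number of pairs.
expected_correct <- fitted(fit) * n_pairs
range(expected_correct)
```

Because the response is modeled as a probability, the expected counts stay between 0 and 40 by construction, which addresses the fitted values above 40 seen with the negative binomial model.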

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2017-01-23 8:46 GMT+01:00 João C P Santiago <joao.santiago at uni-tuebingen.de>:

  
  
#
Thank you! Could you be a bit more specific as to why? I will most  
likely encounter similar data in the future and I want to know how to  
think about it.

Fitting the model with abruf as a factor resulted in a better fit, but  
that answers a different question, right? Namely, how different is the  
intercept at a timepoint in comparison with the reference level (abruf 0  
in my code)?

Best

Quoting Thierry Onkelinx <thierry.onkelinx at inbo.be>:

  
    
#
It looks like your participants performed a known number of trials which
resulted in either success or failure. The binomial distribution models
exactly that. The model fit would be the probability of success.
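
A small numeric sketch of this point, using base R only. The recall probability of 0.9 is an assumed value for illustration, and the Poisson is used here as a simple stand-in for an unbounded count distribution (the original model was negative binomial, which is likewise unbounded above).

```r
n_pairs    <- 40    # word pairs per recall trial, from the experiment description
p_correct  <- 0.9   # assumed recall probability, for illustration only
mean_count <- n_pairs * p_correct  # expected number of correct pairs

# Probability of observing more than 40 correct pairs under each distribution.
p_over_binom <- pbinom(n_pairs, size = n_pairs, prob = p_correct,
                       lower.tail = FALSE)
p_over_pois  <- ppois(n_pairs, lambda = mean_count, lower.tail = FALSE)

# Variance implied by each distribution at the same mean.
var_binom <- n_pairs * p_correct * (1 - p_correct)  # shrinks near the ceiling
var_pois  <- mean_count                             # always equals the mean

c(p_over_binom = p_over_binom, p_over_pois = p_over_pois)
c(var_binom = var_binom, var_pois = var_pois)
```

The binomial puts zero probability on more than 40 correct pairs and lets the variance shrink as recall approaches the ceiling; an unbounded count distribution does neither.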

Once you have the relevant distribution, you can set the relevant
covariates. Which and in which form (linear, polynomial, factor) depends on
the hypotheses which are relevant for your experiment.
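
As a rough illustration of how the form of a covariate changes what is estimated, the sketch below compares the design matrices for the two codings of abruf discussed in this thread. It uses base R only; the object names (trials, X_quadratic, X_factor) are made up for the example.

```r
# The three recall trials, as in the thread.
trials <- data.frame(abruf = 1:3)

# Quadratic-in-(abruf - 1) coding from the thread: a single slope parameter,
# so the relative spacing between trials is fixed in advance at 0, 1, 4.
X_quadratic <- model.matrix(~ I((abruf - 1)^2), data = trials)

# Factor coding: one free parameter per trial beyond the reference
# trial (abruf = 1), so each trial can differ from trial 1 by any amount.
X_factor <- model.matrix(~ factor(abruf), data = trials)

X_quadratic
X_factor
```

With R's default treatment contrasts, the factor coefficients are differences from the first trial on the link scale, whereas the quadratic coding forces the change from trial 1 to trial 3 to be four times the change from trial 1 to trial 2.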

Best regards,

ir. Thierry Onkelinx

2017-01-23 10:01 GMT+01:00 João C P Santiago <joao.santiago at uni-tuebingen.de