How to deal with outcomes assessed by raters?
6 messages · Chris Howden, David Duffy, Joseph Bulbulia +1 more
I'm no expert, but I believe that with only 4 judges you don't have enough
levels to get an accurate estimate of the variance associated with judges,
so you may be better off including them as fixed effects with 4 levels.
That said, if you just want a random intercept for each judge and don't
need an accurate measure of their variance, it may still be OK to include
them as a random effect. But I'm a little unclear on this point myself.
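To make the fixed-vs-random contrast concrete, here is a minimal sketch on simulated data (not the firewalk data; the data frame, the effect sizes, and the choice of nlme::lme are all my assumptions):

```r
## Simulated ratings: 42 participants each scored by 4 judges.
## All names and numbers here are invented for illustration.
set.seed(1)
d <- data.frame(
  id     = factor(rep(1:42, each = 4)),
  rater  = factor(rep(paste0("rat", 1:4), times = 42)),
  rating = rnorm(42 * 4, mean = 4, sd = 1)
)

## Rater as a fixed effect: one parameter per judge (intercept + 3 contrasts).
m.fixed <- lm(rating ~ rater, data = d)

## Rater as a random intercept (nlme ships with R): a single variance
## component, but with only 4 levels it is estimated very imprecisely.
library(nlme)
m.random <- lme(rating ~ 1, random = ~ 1 | rater, data = d)

length(coef(m.fixed))   # 4 fixed-effect parameters
VarCorr(m.random)       # the (imprecise) rater variance component
```

Either model runs; the question is only whether the 4-level variance estimate is worth anything.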
Chris Howden B.Sc. (Hons) GStat.
Founding Partner
Evidence Based Strategic Development, IP Commercialisation and Innovation,
Data Analysis, Modelling and Training
(mobile) 0410 689 945
(fax) +612 4782 9023
chris at trickysolutions.com.au
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Joseph
Bulbulia
Sent: Tuesday, 16 April 2013 3:07 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] How to deal with outcomes assessed by raters?
Hi all,
I'd like to model emotional dynamics in a highly arousing firewalk ritual.
Four judges rated images from 42 participants for arousal and valence. The
predictor variables are ritual phase and role.
Question 1
Any thoughts about how best to handle the rater assessments?
Specifically, is it nuts to explicitly include a component for raters in
the random component of the model?
E.g.
library(MCMCglmm)

prior.fw.0 <- list(
  B = list(mu = rep(0, 4), V = diag(4) * 1e+10),
  R = list(V = diag(2), fix = 1),
  G = list(
    G1 = list(V = diag(2), n = 2, alpha.mu = c(0, 0), alpha.V = diag(2) * 1000),
    G2 = list(V = diag(2), n = 2, alpha.mu = c(0, 0), alpha.V = diag(2) * 1000),
    G3 = list(V = diag(2), fix = 1)
  )
)

firemodel.test <- MCMCglmm(
  cbind(arousal, valence) ~ trait:role:phase - 1,
  random = ~ us(trait):phase:id
           + idh(trait):event:id
           + idh(trait):rater,
  rcov   = ~ idh(trait):units,
  family = rep("ordinal", 2),
  data   = Firewalkdata,
  burnin = 5000, thin = 10, nitt = 20000,
  prior  = prior.fw.0
)
Thanks everyone. Very grateful for any advice.
Joseph
Disclaimer
I'm new to GLMMs, so apologies if this doesn't make sense.
Data sample below
(Only 20 data points, just to get a sense of the structure)
Firewalkdata <- structure(list(obs = c("1", "2", "3", "4", "5", "6", "7",
"8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19",
"20"), id = structure(c(24L, 4L, 26L, 37L, 32L, 3L, 20L, 9L, 3L, 2L, 5L,
19L, 23L, 28L, 29L, 8L, 3L, 18L, 40L, 26L), .Label = c("a", "b", "c", "d",
"e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "pa", "pb",
"pc", "pd", "pe", "pf", "pg", "ph", "pi", "pj", "pk", "pl", "pm", "pn",
"po", "pp", "q", "r", "s", "t", "u", "v", "w", "x", "y"), class =
"factor"), phase = c(1, 2, 5, 5, 2, 3, 1, 4, 5, 2, 5, 1, 5, 4, 3, 4, 5, 3,
2, 4), event = structure(c(11L, 4L, 13L, 21L, 22L, 3L, 4L, 9L, 3L, 2L, 5L,
3L, 9L, 16L, 17L, 8L, 3L, 2L, 24L, 13L), .Label = c("a", "b", "c", "d",
"e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s",
"t", "u", "v", "w", "x", "y"), class = "factor"), role = structure(c(2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L,
2L), .Label = c("FW", "PS"), class = "factor"), dyad = structure(c(2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L),
.Label = c("n", "y"), class = "factor"), gender = structure(c(1L, 2L, 2L,
2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L),
.Label = c("f", "m"), class = "factor"), rater = structure(c(4L, 1L, 3L,
1L, 3L, 4L, 1L, 2L, 4L, 3L, 4L, 3L, 4L, 1L, 1L, 4L, 4L, 1L, 4L, 4L),
.Label = c("rat1", "rat2", "rat3", "rat4"), class = "factor"),
arousal = c(4, 5, NA, 6, 6, 6, 4, 4, 7, 4, 3, 7, 5, 5, 5,
6, 5, NA, 4, 6), valence = c(4, 3, NA, 2, 6, 1, 2, 6, 3,
5, 5, 7, 5, 2, 2, 1, 4, NA, 4, 6)), .Names = c("obs", "id", "phase",
"event", "role", "dyad", "gender", "rater", "arousal", "valence"),
row.names = c(3977L, 83L, 2996L, 525L, 3134L, 3213L, 726L, 1267L, 3221L,
2134L, 3273L, 2801L, 3975L, 944L, 964L, 3344L, 3223L, 688L, 3758L, 4041L),
class = "data.frame")
IMAGE SAMPLE (for the curious)
https://www.dropbox.com/s/xmbci5814h73i0l/4_MF_e5.png
GRAPH bootstrapped means
https://www.dropbox.com/s/509sgh5zqxq18tn/Figure_FireWalk.pdf
Crude overview of the design
https://www.dropbox.com/s/0o0a9kkrh5ttsd2/plot.plan.emotions_firewalk.pdf
Joseph Bulbulia
Senior Lecturer, Religious Studies
Faculty of Humanities and Social Sciences Victoria University, New Zealand
+64 21 95 94 23
http://www.metaphysicalclub.com
On Tue, 16 Apr 2013, Joseph Bulbulia wrote:
Hi all, I'd like to model emotional dynamics in a highly arousing firewalk ritual. Four judges rated images from 42 participants for arousal and valence. The predictor variables are ritual "phase" and "role." Question 1: Any thoughts about how best to handle the rater assessments? Specifically, is it nuts to explicitly include a component for raters in the random component of the model?
So what do the inter-rater agreements look like? I presume rater is actually a nuisance variable? The path model I would usually use would have phase and role acting on the averaged-over-raters a and v scores (measurement model bit). Just 2c, David Duffy.
2 days later
Hi Joseph, thanks for this detailed summary. Based on my understanding, I think it is defensible to include the raters as random effects in the model, and I think that doing so provides a more faithful representation of your experimental design than excluding them would. Definitely not nuts.

On the niggles: I'm not sure what exactly you mean by "averaging over the ratings". It sounds risky to me.

Cheers
Andrew

On Tue, Apr 16, 2013 at 7:50 PM, Joseph Bulbulia
<joseph.bulbulia at icloud.com> wrote:
Hi all, Two of you asked for more information. Sorry for the long-winded account, written in haste.

THE QUASI-EXPERIMENT
* The fire-walking ritual consisted of a series of 26 ordeals by fire.
* Each fire-walker traversed a burning bed of coals (677 Celsius; I actually measured it with a pyrometer. Such instruments exist!).
* In sixteen of these events, fire-walkers were carrying a passenger.
* Total duration of each fire-walk was under 5 seconds, which we carved up into five phases.

I constructed a makeshift plot plan of the ritual here (following ideas in Walter Stroup's recent book on GLMMs). Not sure I'm happy with it, but it will give you the gist. https://www.dropbox.com/s/0o0a9kkrh5ttsd2/plot.plan.emotions_firewalk.pdf

Hypothesis 1: Anthropologists have long maintained that rituals cause a melding of emotions, what they call "collective effervescence". You've felt this surely, being connected with others at a big event. This "merge" model predicts that arousal and valence will tend to be coupled among participants, irrespective of ritual roles. (In another paper, we demonstrated heart-rhythm coupling among fire-walkers and observers; the study was published in 2011.)

Hypothesis 2: Another anthropological tradition predicts differentiation in emotions depending on ritual role. This "verge" model predicts that ritual participants who undertake a rite of passage will express different emotions. Think of a PhD thesis defence: the candidate's ordeal is the inquisitor's delight! (In our heart-rhythm paper we also found differences in synchrony that were predicted by ritual role and social distance. This study is just a follow-up using another biomarker.)

To assess whether emotional dynamics merge or verge, we sampled images for each ritual participant (n=42: 26 fire-walkers and 16 passengers) at five different phases of the fire-walk.
There's evidence of cultural variability in emotions, so the images were independently rated by four judges from the part of Spain where the ritual happened. (Note: if I could do this over again, I'd get more raters, but this sort of number is typical in psychological research; it is probably OK for the task at hand, which does not require exact estimates, only rough assessments of trends within each ritual group.)

See how you do here. Merge or verge? https://www.dropbox.com/s/xmbci5814h73i0l/4_MF_e5.png

Images were rated for "valence" and "arousal" on Likert scales from 1-7. I didn't run an ICC because I wasn't sure whether it is appropriate for ordinal data (if anyone knows, I'd be grateful, but I didn't want to bog down the list with too many questions). Kendall's coefficient of concordance was 0.513. As is typical in emotions research, then, judgements were not all that concordant. But again, noisy signals are OK in the context of this study. There's a larger philosophical discussion about whether emotions are intrinsically vague and context-dependent creatures; we can set that to the side, though. Crude signals, in this case, are fine.

The Model
Fixed effects for Phase x Role strongly improve on the intercept-only model, and show merge for arousal and verge for valence. This finding is supported in all other models. Random slopes for participants by Phase do better than random intercepts alone. Random effects for Events improve the model, but there's no improvement from including effects for Dyadic pairs. I used an ordinal family because the data are ordinal. I fixed the R variance to 1 because this is what Jarrod Hadfield's Course Notes recommend, and he is a man who knows what he's talking about.

Key point: Nothing hangs on putting raters into the model! The outcome remains the same with respect to the hypotheses. I could leave them out (and probably will).
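For what it's worth, Kendall's W can be computed by hand in base R, which sidesteps the ICC question for a quick concordance check. The ratings matrix below is invented, not the real data, and no tie correction is applied:

```r
## Toy ratings matrix: 4 items (rows) x 4 raters (columns), invented data.
ratings <- cbind(r1 = c(4, 5, 3, 6),
                 r2 = c(5, 6, 4, 7),
                 r3 = c(3, 5, 2, 6),
                 r4 = c(4, 6, 2, 7))

n <- nrow(ratings)                      # number of items rated
m <- ncol(ratings)                      # number of raters
R <- rowSums(apply(ratings, 2, rank))   # rank within each rater, sum per item
S <- sum((R - mean(R))^2)               # spread of the rank sums
W <- 12 * S / (m^2 * (n^3 - n))         # Kendall's W, no tie correction
W  # 1 here, since all four toy raters agree on the ordering
```

W runs from 0 (no agreement) to 1 (perfect agreement), so the reported 0.513 sits squarely in the middling range.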
However, it seems to me that the raters are somehow part of the effect, in a way that is very roughly analogous to meta-analysis studies. (However, I did not attempt MEV; that seemed a bit extreme, but who knows?!)

Other niggles: My psychologist collaborator (experienced with LMMs using HLM and MPLUS) suggested averaging over the ratings. This is standard practice in psychology; in fact, psychologists do this all the time wherever they have highly correlated measures for the same trait (e.g. personality). This strikes me as OK for most purposes, but it is also odd, because you lose a signal for the variance of your measures.

Again, sorry for stealing time. Thanks for any help.

On 16/04/2013, at 8:05 PM, David Duffy <David.Duffy at qimr.edu.au> wrote:
So what do the inter-rater agreements look like? I presume rater is actually a nuisance variable? The path model I would usually use would have phase and role acting on the averaged-over-raters a and v scores (measurement model bit). Just 2c, David Duffy.
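For comparison, "averaging over the ratings" in the collaborator's sense would look like the base-R sketch below. The toy data frame is invented for illustration; note that the per-rater spread is simply discarded:

```r
## Toy data: two participants, one phase, four raters each. Invented values.
d <- data.frame(
  id      = rep(c("a", "b"), each = 4),
  phase   = 1,
  rater   = rep(paste0("rat", 1:4), times = 2),
  arousal = c(4, 5, 4, 5, 6, 7, 6, 7),
  valence = c(3, 3, 4, 4, 2, 2, 1, 1)
)

## One mean arousal/valence score per image (id x phase): the rater-to-rater
## variability no longer appears anywhere in the outcome.
avg <- aggregate(cbind(arousal, valence) ~ id + phase, data = d, FUN = mean)
avg
```

This is what a rater random effect would retain and the averaging throws away: with the averages in hand there is no term left in the model where judge disagreement can show up.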
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Andrew Robinson
Deputy Director, ACERA
Senior Lecturer in Applied Statistics
Department of Mathematics and Statistics, University of Melbourne, VIC 3010 Australia
Tel: +61-3-8344-6410 Fax: +61-3-8344-4599
Email: a.robinson at ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/