Using variance components of lmer for ICC computation in reliability study

13 messages · Rolf Turner, Doran, Harold, Pierre de Villemereuil +2 more

#
Well no, your specification is not right, because your variable is not
continuous, as you note. Continuous means it is a real number between
-Inf and Inf, and yours is bounded between 1 and 10. So you should not be
using a linear model that assumes the outcome is continuous.
On 6/14/18, 11:16 AM, "Bernard Liew" <B.Liew at bham.ac.uk> wrote:
#
On 15/06/18 05:35, Doran, Harold wrote:
I think that the foregoing is a bit misleading.  For a variable to be 
continuous it is not necessary for it to have a range from -infinity to 
infinity.

The OP says that dv  "is a continuous variable (scale 1-10)".  It is not 
clear to me what this means.  The "obvious"/usual meaning or 
interpretation would be that dv can take (only) the (positive integer) 
values 1, 2, ..., 10.  If this is so, then a continuous model is not 
appropriate.  (It should be noted however that people in the social 
sciences do this sort of thing --- i.e. treat discrete variables as 
continuous --- all the time.)

It is *possible* that dv can take values in the real interval [1,10], in
which case it *is* continuous, and a "continuous model" is indeed 
appropriate.

The OP should clarify what the situation actually is.

cheers,

Rolf Turner
#
That's a helpful clarification, Rolf. However, with Gaussian (normal) errors
in the linear model, we can't *really* assume they would asymptote at 1 or
10. My suspicion is that these are Likert-style ordered counts of some
form, although the OP should clarify. In that case, 1 and 10 are
censoring limits, as true values for the measured trait could exist
outside those boundaries (and I suspect the model is producing predicted
values outside of 1 or 10).
On 6/14/18, 6:33 PM, "Rolf Turner" <r.turner at auckland.ac.nz> wrote:
#
More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform.  However, I'm not sure how one would go about
computing an ICC from ordinal data ... (the 'ordinal' package is the
place to look for the model-fitting procedures). Googling it finds
some stuff, but it seems that it doesn't necessarily apply to complex
designs ...

https://stats.stackexchange.com/questions/3539/inter-rater-reliability-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/
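For concreteness, a minimal sketch of such a fit with the 'ordinal' package; the data are simulated and all variable names are placeholders, not the OP's actual design:

```r
## Sketch of a cumulative link mixed model (CLMM): ordered response,
## fixed rater effect, random subject intercept. Simulated data only.
library(ordinal)
set.seed(1)
df <- data.frame(
  subj  = factor(rep(1:30, each = 4)),
  rater = factor(rep(c("A", "B"), times = 60))
)
df$dv <- factor(pmin(10, pmax(1, round(rnorm(120, 5, 2)))), ordered = TRUE)

## Probit link puts the latent (liability) scale in standard-normal units
fit <- clmm(dv ~ rater + (1 | subj), data = df, link = "probit")
summary(fit)  # subject variance appears under 'Random effects'
```

The threshold (cutpoint) parameters replace the single intercept of a linear model, which is how the model avoids assuming equal spacing between levels.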
On Thu, Jun 14, 2018 at 6:58 PM, Doran, Harold <HDoran at air.org> wrote:
#
Thanks all,



My original question now appears to be two: (1) the distribution of my DV (hence, what model to use); (2) the specification of my lmer model to parse out variance components.



Topic (1): DV distribution

Yes, my measure is a sliding scale between 1 and 10 of subjective pain, so any number to a single decimal place is plausible. Is a linear model automatically excluded, or can I (a) do a fitted/residual plot for checking, and (b) log-transform the DV if (a) shows evidence of non-normality?



Going back to Rolf's point about social science, you are right. But realistically, many measures in biomechanics (my field) are analyzed using linear models even though they are bounded. For example, a simple scalar such as height is bounded below by zero and above by whatever limit the instrument imposes. Joint angles are bounded physiologically, too. So when are measures really -Inf/Inf?



Topic (2): lmer



Assuming my DV is appropriate for lmer, based on the experimental design used, I still hope to receive some feedback on my fixed- and random-effects specification.



Thanks again all, for the kind response



Bernard



#
It seems to me you actually have censored data of three types, left, right and within the intervals. You might find it helpful to review this paper and see how models like yours can be estimated.

https://cran.r-project.org/web/packages/censReg/vignettes/censReg.pdf
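A minimal sketch of the kind of censored ('tobit') regression the vignette covers, assuming a response censored at 1 (left) and 10 (right); the data are simulated and the names are placeholders:

```r
## Censored regression with censReg: estimates refer to the latent,
## uncensored scale, not the clipped observed scores.
library(censReg)
set.seed(2)
d <- data.frame(x = rnorm(200))
latent <- 5 + 2 * d$x + rnorm(200)        # true, unbounded score
d$y <- pmin(10, pmax(1, latent))          # observed score, censored at 1 and 10

fit <- censReg(y ~ x, left = 1, right = 10, data = d)
summary(fit)
```

An ordinary `lm(y ~ x)` on the same data would bias the slope toward zero, since the clipped observations at the boundaries pull the fitted line flat.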

#
Hi,
I've never used the package "ordinal", but if it's anything like the "ordinal" family of MCMCglmm, then computing an ICC on the liability scale would be fairly easy, as one would just proceed as always and simply add the so-called "link variance" corresponding to the chosen link function (1 for probit, (pi^2)/3 for logit). E.g. for a given variance component Vcomp and a probit link:
ICC = Vcomp / (sum(variance components of the model) + 1)

However, computing an ICC on the data scale would be much more difficult as it is actually multivariate...

I think that in the case where such scores are used, having the estimate on the liability scale makes sense, though, as the binning is due more to our inability to measure this scale finely than to an actual property of the system.
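A numeric sketch of the liability-scale ICC described above; the variance values are made up for illustration:

```r
## Liability-scale ICC: add the link variance to the denominator
## (1 for probit, pi^2 / 3 for logit). Hypothetical components.
v_subj   <- 0.8          # focal variance component
v_other  <- 0.3          # sum of the remaining variance components
link_var <- 1            # probit link variance

icc <- v_subj / (v_subj + v_other + link_var)
icc  # 0.8 / 2.1, i.e. about 0.381
```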

Cheers,
Pierre.

On Friday, 15 June 2018, at 03:27:54 CEST, Ben Bolker wrote:
#
This seems to me like the tail wagging the dog. I don't think the question should be framed as "what can I do to get an ICC?" The question should be: what distributional family is appropriate given the observed data?

If an ICC cannot be made available after the appropriate link function is chosen, then I'm not sure that's really of any consequence. The point of a random effects model is *not* to yield an ICC, but to yield appropriate estimates of the fixed effects and marginal variances.

If these data are treated as ordinal, that's a LOT of threshold parameters to estimate, and the model might be quite expensive to compute.


1 day later
#
Thanks Pierre for the suggestion to use MCMCglmm. Very useful.

1) Can I ask why, when taking the ratio of the variance components, the denominator is ICC = Vcomp / (sum(variance components of the model) + 1)? Why is the 1 added?

2) I have tried mixed ordinal modelling using "ordinal" and MCMCglmm, and noticed that the residual variance of the model is not produced. I have also read that the residual variance is assumed to be, as you mentioned, (pi^2)/3. Is there a reason (maybe a not-so-technical one)?

Kind regards,
Bernard

#
These extra terms are (I think) the 'residual' variances corresponding
to different GLM families, which are fixed rather than estimated.  You
might find Nakagawa and Schielzeth "Repeatability for Gaussian and
non-Gaussian data" (Biological Reviews 2010) useful ...
On 2018-06-16 01:02 PM, Bernard Liew wrote:
#
Dear Bernard,
Glad this was helpful.
You can find an explanation for this in the Supplementary File on our article on quantitative genetics interpretation of GLMMs:
http://www.genetics.org/content/204/3/1281.supplemental

It is fairly technical. Trying to put it simply: there is an equivalence between a binomial/probit model and the threshold model. This is because a probit link is actually the CDF of a normal distribution (the same goes for a logit link and a logistic distribution): using a probit link with a binomial model is equivalent to using a threshold model after adding normally distributed random noise (this is Fig. S1 in our Suppl. File above).

And the variance of this random noise is thus the variance of a standard normal distribution which is 1 (or the variance of a standard logistic distribution, which is (pi^2)/3, for a logit link).

Computing the ICC on the liability scale "as if" a threshold model were used thus only requires adding that extra variance to the denominator.
It all depends on what you call "residual". In MCMCglmm, for example, there is a "residual" variance called "units", whose value should be fixed to, e.g., 1 when running families such as ordinal. This variance must be added to the denominator of the ICC. If you have only one random effect, you would have:
Vrand / (Vrand + Vunits + 1)

Note that there is also the "threshold" family in MCMCglmm, which is special in the sense that (to put it briefly) the "units" variance is the variance of the probit link and should be fixed to 1. In that case, you would have:
Vrand / (Vrand + Vunits)

Sometimes this "extra variance" (1 or (pi^2)/3) is indeed called the "residual variance". This is because, as per my explanation above, probit and logit links can be seen as "adding random noise then applying a threshold", and this random noise is sometimes seen as a "residual error" in the model.
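A small sketch contrasting the two MCMCglmm parameterisations described above, with made-up posterior-mean variance components:

```r
## "ordinal" family: units variance is fixed (e.g. to 1) and the probit
## link variance (1) must still be added; "threshold" family: the units
## variance IS the link variance, so nothing extra is added.
v_rand  <- 0.6                     # hypothetical random-effect variance
v_units <- 1                       # fixed residual ("units") variance

icc_ordinal   <- v_rand / (v_rand + v_units + 1)  # 0.6 / 2.6, about 0.231
icc_threshold <- v_rand / (v_rand + v_units)      # 0.6 / 1.6, i.e. 0.375
```

The two ICCs differ because the "ordinal" parameterisation carries the link variance twice unless the units variance is interpreted accordingly; hence Pierre's warning to drop the + 1 under "threshold".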

I hope this clarifies things for you. I'm afraid it's difficult to explain the whys clearly without going too much into the technicalities of the models.

Cheers,
Pierre.
#
Dear Pierre,

Again, thanks for referring me to the paper (and thanks to Ben for the review article). It's becoming clearer now.

When I refer to residuals, as I understand it from lmer modelling, it is the level-1 variance. Back to my design: I have many subjects, two fixed raters, sessions nested within subjects, and trials nested within sessions. Hence, the level-1 variance would typically reflect inter-trial variance.

If I were to use lmer, I would not have to specify the level-1 random effect (i.e. ~1|SUBJ:SESSION:TRIAL), as it is absorbed into the residuals.

However, using MCMCglmm, there appears to be an even lower level of random effects which is fixed (as you mentioned). So my question is: should I also specify the level-1 random effect?

A sample formula using your suggestion of a probit link (I'd just like to know if the priors and model are appropriate):
mcglmm_mod <- MCMCglmm(vas ~ RATER,
                       random = ~ SUBJ +
                         SUBJ:SESSION +
                         SUBJ:SESSION:TRIAL,
                       data = as.data.frame(df %>% filter(SIDE == "R")),
                       family = "ordinal",
                       prior = list(R = list(V = 1, fix = 1),
                                    G = list(G1 = list(V = 1, nu = 0),
                                             G2 = list(V = 1, nu = 0),
                                             G3 = list(V = 1, nu = 0))))
Many thanks again for your (and everyone's) help thus far.

Kind regards,
Bernard
#
Hi,
The thing is: there is nothing like this for GLMMs. The lowest level of "residual variance" is basically the distribution variance.

You might think there would be such a thing with a threshold model, but it turns out that the total variance on the liability scale is non-identifiable, so there is nothing there either.
Indeed.
No, you shouldn't. There is already a "residual" variance in MCMCglmm (the R part of the variance structure) which should account for that, but which you need to fix to 1 (because of the non-identifiability issue I mention above). If you added such a random effect, you would be re-modelling that non-identifiable variance.
As I say above, I suggest you remove the SUBJ:SESSION:TRIAL effect, if it corresponds, as you state, to a "level-1 variance".

Another thing: I don't think it is wise to use nu = 0 in the prior for such a model. I'd recommend using MCMCglmm's parameter expansion, and especially a chi-square-like prior, which has been shown to perform quite well for such models:
G1 = list(V = 1, nu = 1000, alpha.mu = 0, alpha.V = 1)

And a last thing: the "threshold" family is supposed to mix better and run faster than the "ordinal" family, according to Jarrod Hadfield. You might want to give it a try (and beware that the + 1 should be removed from the ICC denominator when using "threshold", as per my previous email).
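Putting that advice together, a hedged sketch of the full prior list it implies (fix the residual "units" variance to 1; parameter-expanded priors for the two remaining random effects, SUBJ and SUBJ:SESSION, after dropping SUBJ:SESSION:TRIAL):

```r
## Prior structure only; the model call below it is illustrative,
## with placeholder variable names from earlier in the thread.
prior_px <- list(
  R = list(V = 1, fix = 1),                                  # units variance fixed to 1
  G = list(
    G1 = list(V = 1, nu = 1000, alpha.mu = 0, alpha.V = 1),  # SUBJ
    G2 = list(V = 1, nu = 1000, alpha.mu = 0, alpha.V = 1)   # SUBJ:SESSION
  )
)
## library(MCMCglmm)
## fit <- MCMCglmm(vas ~ RATER, random = ~ SUBJ + SUBJ:SESSION,
##                 family = "threshold", prior = prior_px, data = dat)
```

With `family = "threshold"`, the fixed units variance is the probit link variance, so the ICC denominator is just the sum of the estimated components plus Vunits, with no extra + 1.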

Best,
Pierre.