[R-meta] Differences in calculation of CVR in escalc() - R-SIG-meta-analysis

Mon, Oct 9, 2017 5:12 AM #

Dear all,

I am conducting a meta-analysis on the stability of crop yields. I now 
follow the approach suggeted by Nakagawa et al. (2015) approach and its 
implementation in the metafor package, which helps me a lot!

As I first step I compared the estimates of the escalc function for ROM, 
VR and CVR to the actual formulas (actually I used the functions in the 
supplement of Nakagawa). Fortunately, they all yielded the same 
estimates, except for the variance estimate of CVR. I did the 
calculations on the gibson example data. The respective code (only for 
CVR) is:

data <- get(data(dat.gibson2002))

metadat <- escalc(measure="CVR", m1i=m1i, m2i=m2i, sd1i=sd1i, sd2i=sd2i, 
n1i=n1i, n2i=n2i, data=data)

# functions from Nakagawa et al. (2015)
Calc.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN){
 ? ES<-log(ESD) - log(EMean) + 1 / (2*(EN - 1)) - (log(CSD) - log(CMean) 
+ 1 / (2*(CN - 1)))
 ? return(ES)
}
Calc.var.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN, Equal.E.C.Corr=T){
 ? if(Equal.E.C.Corr==T){
 ??? mvcorr<-cor.test(log(c(CMean, EMean)), log(c(CSD, ESD)))$estimate
 ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 * mvcorr * 
sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
 ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 * mvcorr * 
sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
 ? }? else{
 ??? Cmvcorr<-cor.test(log(CMean), log(CSD))$estimate? # corrected 
(missing log()), was "cor.test(log(EMean), (ESD))$estimate"
 ??? Emvcorr<-cor.test(log(EMean), log(ESD))$estimate
 ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 *Cmvcorr * 
sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
 ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 *Emvcorr * 
sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
 ? }
 ? return(S2)
}

# compare

with(data,Calc.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i))
metadat$yi # is the same
# with pooled correlation
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = T))
metadat$vi # NOT THE SAME!!!
# with separate correlations for E and C
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = F))
metadat$vi # ALSO NOT THE SAME!!!


I checked all the equations in the Nakagawa functions and couldn't find 
any error. Also, I tried the pooled and separate correlation. 
Unfortunately, I didn't manage to access the code behind the escalc 
function in order to check the underlying calculations.

Does anybody have a suggestion, what this difference could be due to?

(Versions: R 3.4.2, metafor 2.0)

Many thanks,

Sam

Reference: Nakagawa et al. , 2015. Meta-analysis of variation: 
ecological and evolutionary applications and beyond. Methods Ecol Evol 
6, 143?152. doi:10.1111/2041-210X.12309

Samuel Knapp

Lehrstuhl f?r Pflanzenern?hrung
Technische Universit?t M?nchen
(Chair of Plant Nutrition
Technical University of Munich)

Emil-Ramann-Strasse 2
D-85354 Freising

Tel. +49 8161 71-3578	
samuel.knapp at tum.de
www.researchgate.net/profile/Samuel_Knapp


	[[alternative HTML version deleted]]

Michael Dewey

Mon, Oct 9, 2017 7:17 AM #

Dear Samuel

Not sure what the issue is but the code from escalc is

           if (measure == "CVR") {
                 yi <- log(sd1i/m1i) + 1/(2 * (n1i - 1)) - log(sd2i/m2i) -
                   1/(2 * (n2i - 1))
                 vi <- 1/(2 * (n1i - 1)) + sd1i^2/(n1i * m1i^2) +
                   1/(2 * (n2i - 1)) + sd2i^2/(n2i * m2i^2)
             }

Note you can obtain this by going
library(metafor)
sink("escalc.txt")
escalc.default
sink()

and examining escalc.txt with your favourite text editor

Michael

On 09/10/2017 13:12, Samuel Knapp wrote:

Dear all,

I am conducting a meta-analysis on the stability of crop yields. I now
follow the approach suggeted by Nakagawa et al. (2015) approach and its
implementation in the metafor package, which helps me a lot!

As I first step I compared the estimates of the escalc function for ROM,
VR and CVR to the actual formulas (actually I used the functions in the
supplement of Nakagawa). Fortunately, they all yielded the same
estimates, except for the variance estimate of CVR. I did the
calculations on the gibson example data. The respective code (only for
CVR) is:

data <- get(data(dat.gibson2002))

metadat <- escalc(measure="CVR", m1i=m1i, m2i=m2i, sd1i=sd1i, sd2i=sd2i,
n1i=n1i, n2i=n2i, data=data)

# functions from Nakagawa et al. (2015)
Calc.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN){
  ? ES<-log(ESD) - log(EMean) + 1 / (2*(EN - 1)) - (log(CSD) - log(CMean)
+ 1 / (2*(CN - 1)))
  ? return(ES)
}
Calc.var.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN, Equal.E.C.Corr=T){
  ? if(Equal.E.C.Corr==T){
  ??? mvcorr<-cor.test(log(c(CMean, EMean)), log(c(CSD, ESD)))$estimate
  ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 * mvcorr *
sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
  ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 * mvcorr *
sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
  ? }? else{
  ??? Cmvcorr<-cor.test(log(CMean), log(CSD))$estimate? # corrected
(missing log()), was "cor.test(log(EMean), (ESD))$estimate"
  ??? Emvcorr<-cor.test(log(EMean), log(ESD))$estimate
  ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 *Cmvcorr *
sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
  ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 *Emvcorr *
sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
  ? }
  ? return(S2)
}

# compare

with(data,Calc.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i))
metadat$yi # is the same
# with pooled correlation
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = T))
metadat$vi # NOT THE SAME!!!
# with separate correlations for E and C
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = F))
metadat$vi # ALSO NOT THE SAME!!!


I checked all the equations in the Nakagawa functions and couldn't find
any error. Also, I tried the pooled and separate correlation.
Unfortunately, I didn't manage to access the code behind the escalc
function in order to check the underlying calculations.

Does anybody have a suggestion, what this difference could be due to?

(Versions: R 3.4.2, metafor 2.0)

Many thanks,

Sam

Reference: Nakagawa et al. , 2015. Meta-analysis of variation:
ecological and evolutionary applications and beyond. Methods Ecol Evol
6, 143?152. doi:10.1111/2041-210X.12309

Michael
http://www.dewey.myzen.co.uk/home.html

Samuel Knapp

Mon, Oct 9, 2017 9:34 AM #

Dear Michael and others,

thanks for your reply.

I discovered the difference: in escalc(), in the calculation of the 
variance of lnCVR (vi), the subtraction of term including the 
correlation (Eqn 12 in Nakagawa et al 2015) is ommited. Does anybody 
know, if this Is this due due to any mathematical reasons, or just to 
keep it simple? Or should/could this be adjusted in the function?

The values for vi estimated by escalc() differ from the values based on 
the original equation, on average by a factor of 2.6 assessed on the 
gibson example dataset.

Best regards,

Samuel

On 09/10/17 16:17, Michael Dewey wrote:

Dear Samuel

Not sure what the issue is but the code from escalc is

????????? if (measure == "CVR") {
??????????????? yi <- log(sd1i/m1i) + 1/(2 * (n1i - 1)) - log(sd2i/m2i) -
????????????????? 1/(2 * (n2i - 1))
??????????????? vi <- 1/(2 * (n1i - 1)) + sd1i^2/(n1i * m1i^2) +
????????????????? 1/(2 * (n2i - 1)) + sd2i^2/(n2i * m2i^2)
??????????? }

Note you can obtain this by going
library(metafor)
sink("escalc.txt")
escalc.default
sink()

and examining escalc.txt with your favourite text editor

Michael

On 09/10/2017 13:12, Samuel Knapp wrote:

Dear all,

I am conducting a meta-analysis on the stability of crop yields. I now
follow the approach suggeted by Nakagawa et al. (2015) approach and its
implementation in the metafor package, which helps me a lot!

As I first step I compared the estimates of the escalc function for ROM,
VR and CVR to the actual formulas (actually I used the functions in the
supplement of Nakagawa). Fortunately, they all yielded the same
estimates, except for the variance estimate of CVR. I did the
calculations on the gibson example data. The respective code (only for
CVR) is:

data <- get(data(dat.gibson2002))

metadat <- escalc(measure="CVR", m1i=m1i, m2i=m2i, sd1i=sd1i, sd2i=sd2i,
n1i=n1i, n2i=n2i, data=data)

# functions from Nakagawa et al. (2015)
Calc.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN){
? ? ES<-log(ESD) - log(EMean) + 1 / (2*(EN - 1)) - (log(CSD) - 
log(CMean)
+ 1 / (2*(CN - 1)))
? ? return(ES)
}
Calc.var.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN, 
Equal.E.C.Corr=T){
? ? if(Equal.E.C.Corr==T){
? ??? mvcorr<-cor.test(log(c(CMean, EMean)), log(c(CSD, ESD)))$estimate
? ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 * mvcorr *
sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
? ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 * mvcorr *
sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
? ? }? else{
? ??? Cmvcorr<-cor.test(log(CMean), log(CSD))$estimate? # corrected
(missing log()), was "cor.test(log(EMean), (ESD))$estimate"
? ??? Emvcorr<-cor.test(log(EMean), log(ESD))$estimate
? ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 *Cmvcorr *
sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
? ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 *Emvcorr *
sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
? ? }
? ? return(S2)
}

# compare

with(data,Calc.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i))
metadat$yi # is the same
# with pooled correlation
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = T))
metadat$vi # NOT THE SAME!!!
# with separate correlations for E and C
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = F))
metadat$vi # ALSO NOT THE SAME!!!


I checked all the equations in the Nakagawa functions and couldn't find
any error. Also, I tried the pooled and separate correlation.
Unfortunately, I didn't manage to access the code behind the escalc
function in order to check the underlying calculations.

Does anybody have a suggestion, what this difference could be due to?

(Versions: R 3.4.2, metafor 2.0)

Many thanks,

Sam

Reference: Nakagawa et al. , 2015. Meta-analysis of variation:
ecological and evolutionary applications and beyond. Methods Ecol Evol
6, 143?152. doi:10.1111/2041-210X.12309

Viechtbauer Wolfgang (STAT)

Tue, Oct 10, 2017 2:38 AM #

Dear Samuel,

Eq. 12 in Nakagawa et al (2015) is not correct. For normally distributed data, the mean and variance are independent and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.

For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.

One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.

Best,
Wolfgang

Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and 
Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD 
Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com 

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Samuel Knapp
Sent: Monday, 09 October, 2017 18:35
To: Michael Dewey; r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Dear Michael and others,

thanks for your reply.

I discovered the difference: in escalc(), in the calculation of the 
variance of lnCVR (vi), the subtraction of term including the 
correlation (Eqn 12 in Nakagawa et al 2015) is ommited. Does anybody 
know, if this Is this due due to any mathematical reasons, or just to 
keep it simple? Or should/could this be adjusted in the function?

The values for vi estimated by escalc() differ from the values based on 
the original equation, on average by a factor of 2.6 assessed on the 
gibson example dataset.

Best regards,

Samuel

On 09/10/17 16:17, Michael Dewey wrote:
> Dear Samuel
>
> Not sure what the issue is but the code from escalc is
>
> ????????? if (measure == "CVR") {
> ??????????????? yi <- log(sd1i/m1i) + 1/(2 * (n1i - 1)) - log(sd2i/m2i) -
> ????????????????? 1/(2 * (n2i - 1))
> ??????????????? vi <- 1/(2 * (n1i - 1)) + sd1i^2/(n1i * m1i^2) +
> ????????????????? 1/(2 * (n2i - 1)) + sd2i^2/(n2i * m2i^2)
> ??????????? }
>
> Note you can obtain this by going
> library(metafor)
> sink("escalc.txt")
> escalc.default
> sink()
>
> and examining escalc.txt with your favourite text editor
>
> Michael
>
> On 09/10/2017 13:12, Samuel Knapp wrote:
>> Dear all,
>>
>> I am conducting a meta-analysis on the stability of crop yields. I now
>> follow the approach suggeted by Nakagawa et al. (2015) approach and its
>> implementation in the metafor package, which helps me a lot!
>>
>> As I first step I compared the estimates of the escalc function for ROM,
>> VR and CVR to the actual formulas (actually I used the functions in the
>> supplement of Nakagawa). Fortunately, they all yielded the same
>> estimates, except for the variance estimate of CVR. I did the
>> calculations on the gibson example data. The respective code (only for
>> CVR) is:
>>
>> data <- get(data(dat.gibson2002))
>>
>> metadat <- escalc(measure="CVR", m1i=m1i, m2i=m2i, sd1i=sd1i, sd2i=sd2i,
>> n1i=n1i, n2i=n2i, data=data)
>>
>> # functions from Nakagawa et al. (2015)
>> Calc.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN){
>> ? ? ES<-log(ESD) - log(EMean) + 1 / (2*(EN - 1)) - (log(CSD) - 
>> log(CMean)
>> + 1 / (2*(CN - 1)))
>> ? ? return(ES)
>> }
>> Calc.var.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN, 
>> Equal.E.C.Corr=T){
>> ? ? if(Equal.E.C.Corr==T){
>> ? ??? mvcorr<-cor.test(log(c(CMean, EMean)), log(c(CSD, ESD)))$estimate
>> ? ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 * mvcorr *
>> sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
>> ? ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 * mvcorr *
>> sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
>> ? ? }? else{
>> ? ??? Cmvcorr<-cor.test(log(CMean), log(CSD))$estimate? # corrected
>> (missing log()), was "cor.test(log(EMean), (ESD))$estimate"
>> ? ??? Emvcorr<-cor.test(log(EMean), log(ESD))$estimate
>> ? ??? S2<- CSD^2 / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 *Cmvcorr *
>> sqrt((CSD^2 / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
>> ? ???????? ESD^2 / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 *Emvcorr *
>> sqrt((ESD^2 / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
>> ? ? }
>> ? ? return(S2)
>> }
>>
>> # compare
>>
>> with(data,Calc.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i))
>> metadat$yi # is the same
>> # with pooled correlation
>> with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = T))
>> metadat$vi # NOT THE SAME!!!
>> # with separate correlations for E and C
>> with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = F))
>> metadat$vi # ALSO NOT THE SAME!!!
>>
>>
>> I checked all the equations in the Nakagawa functions and couldn't find
>> any error. Also, I tried the pooled and separate correlation.
>> Unfortunately, I didn't manage to access the code behind the escalc
>> function in order to check the underlying calculations.
>>
>> Does anybody have a suggestion, what this difference could be due to?
>>
>> (Versions: R 3.4.2, metafor 2.0)
>>
>> Many thanks,
>>
>> Sam
>>
>> Reference: Nakagawa et al. , 2015. Meta-analysis of variation:
>> ecological and evolutionary applications and beyond. Methods Ecol Evol
>> 6, 143?152. doi:10.1111/2041-210X.12309

Samuel Knapp

Tue, Oct 10, 2017 3:00 PM #

Dear Wolfgang,

that sounds like a tricky problem. I agree with you, that the best (or 
the worst) assumption about the distribution we can make is normal 
distribution. However, in observations the mean and sd covary very often 
(e.g. D?ring et al. 2015), which is also the motivation to use CV=sd/mean.

Nagakawa et al. (2015) in their appendix assume normal distribution of 
the means and sds, but also assume covariation of mean and sd (without 
giving references, but I guess because of above mentioned observation).

I understand your point about the different kinds of correlation between 
studies and the bivariate sampling distribution. However, would it not 
be better to still include the correlation in order to account for the 
often observed covariation of mean and sd (and still being a good 
approximation independent of the real distribution), and also with the 
argument if there is no correlation (because of a assumed normal 
distribution), it will be estimated to zero and thus have no effect?

Looking forward to your reply!

Best regards,

Samuel

References: D?ring, T.F., Knapp, S., Cohen, J.E., 2015. Taylor?s power 
law and the stability of crop yields. Field Crops Research 183, 294?302. 
doi:10.1016/j.fcr.2015.08.005

On 10/10/17 11:38, Viechtbauer Wolfgang (SP) wrote:

Dear Samuel,

Eq. 12 in Nakagawa et al (2015) is not correct. For normally distributed data, the mean and variance are independent and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.

For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.

One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.

Best,
Wolfgang

-- Wolfgang Viechtbauer, Ph.D., Statistician | Department of 
Psychiatry and Neuropsychology | Maastricht University | P.O. Box 616 
(VIJV1) | 6200 MD Maastricht, The Netherlands | +31 (43) 388-4170 | 
http://www.wvbauer.com -----Original Message----- From: 
R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] 
On Behalf Of Samuel Knapp Sent: Monday, 09 October, 2017 18:35 To: 
Michael Dewey; r-sig-meta-analysis at r-project.org Subject: Re: [R-meta] 
Differences in calculation of CVR in escalc() Dear Michael and others, 
thanks for your reply. I discovered the difference: in escalc(), in 
the calculation of the variance of lnCVR (vi), the subtraction of term 
including the correlation (Eqn 12 in Nakagawa et al 2015) is ommited. 
Does anybody know, if this Is this due due to any mathematical 
reasons, or just to keep it simple? Or should/could this be adjusted 
in the function? The values for vi estimated by escalc() differ from 
the values based on the original equation, on average by a factor of 
2.6 assessed on the gibson example dataset. Best regards, Samuel On 
09/10/17 16:17, Michael Dewey wrote:

Dear Samuel

Not sure what the issue is but the code from escalc is

 ????????? if (measure == "CVR") {
 ??????????????? yi <- log(sd1i/m1i) + 1/(2 * (n1i - 1)) - log(sd2i/m2i) -
 ????????????????? 1/(2 * (n2i - 1))
 ??????????????? vi <- 1/(2 * (n1i - 1)) + sd1i^2/(n1i * m1i^2) +
 ????????????????? 1/(2 * (n2i - 1)) + sd2i^2/(n2i * m2i^2)
 ??????????? }

Note you can obtain this by going
library(metafor)
sink("escalc.txt")
escalc.default
sink()

and examining escalc.txt with your favourite text editor

Michael

On 09/10/2017 13:12, Samuel Knapp wrote:

Dear all,

I am conducting a meta-analysis on the stability of crop yields. I now
follow the approach suggeted by Nakagawa et al. (2015) approach and its
implementation in the metafor package, which helps me a lot!

As I first step I compared the estimates of the escalc function for ROM,
VR and CVR to the actual formulas (actually I used the functions in the
supplement of Nakagawa). Fortunately, they all yielded the same
estimates, except for the variance estimate of CVR. I did the
calculations on the gibson example data. The respective code (only for
CVR) is:

data <- get(data(dat.gibson2002))

metadat <- escalc(measure="CVR", m1i=m1i, m2i=m2i, sd1i=sd1i, sd2i=sd2i,
n1i=n1i, n2i=n2i, data=data)

# functions from Nakagawa et al. (2015)
Calc.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN){
 ? ? ES<-log(ESD) - log(EMean) + 1 / (2*(EN - 1)) - (log(CSD) -
log(CMean)
+ 1 / (2*(CN - 1)))
 ? ? return(ES)
}
Calc.var.lnCVR<-function(CMean, CSD, CN, EMean, ESD, EN,
Equal.E.C.Corr=T){
 ? ? if(Equal.E.C.Corr==T){
 ? ??? mvcorr<-cor.test(log(c(CMean, EMean)), log(c(CSD, ESD)))$estimate
 ? ??? S2<- CSD^2  / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 * mvcorr *
sqrt((CSD^2  / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
 ? ???????? ESD^2  / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 * mvcorr *
sqrt((ESD^2  / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
 ? ? }? else{
 ? ??? Cmvcorr<-cor.test(log(CMean), log(CSD))$estimate? # corrected
(missing log()), was "cor.test(log(EMean), (ESD))$estimate"
 ? ??? Emvcorr<-cor.test(log(EMean), log(ESD))$estimate
 ? ??? S2<- CSD^2  / (CN * (CMean^2)) + 1 / (2 * (CN - 1)) - 2 *Cmvcorr *
sqrt((CSD^2  / (CN * (CMean^2))) * (1 / (2 * (CN - 1)))) +
 ? ???????? ESD^2  / (EN * (EMean^2)) + 1 / (2 * (EN - 1)) - 2 *Emvcorr *
sqrt((ESD^2  / (EN * (EMean^2))) * (1 / (2 * (EN - 1))))
 ? ? }
 ? ? return(S2)
}

# compare

with(data,Calc.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i))
metadat$yi # is the same
# with pooled correlation
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = T))
metadat$vi # NOT THE SAME!!!
# with separate correlations for E and C
with(data,Calc.var.lnCVR(m2i,sd2i,n2i,m1i,sd1i,n1i,Equal.E.C.Corr = F))
metadat$vi # ALSO NOT THE SAME!!!


I checked all the equations in the Nakagawa functions and couldn't find
any error. Also, I tried the pooled and separate correlation.
Unfortunately, I didn't manage to access the code behind the escalc
function in order to check the underlying calculations.

Does anybody have a suggestion, what this difference could be due to?

(Versions: R 3.4.2, metafor 2.0)

Many thanks,

Sam

Reference: Nakagawa et al. , 2015. Meta-analysis of variation:
ecological and evolutionary applications and beyond. Methods Ecol Evol
6, 143?152. doi:10.1111/2041-210X.12309

Viechtbauer Wolfgang (STAT)

Wed, Oct 11, 2017 12:23 PM #

Hi Samuel,

Taylor's law describes an *empirical* phenomenon that means and variances (or transformations thereof) across studies tend to be associated in certain ways. That is, if the true mean is larger, than the true variance also tends to be larger.

That is again something entirely different than the correlation between the sample mean and sample variance (or transformations thereof) in the bivariate sampling distribution for data from some distribution. We can derive this correlation (so this is not an empirical phenomenon but a purely statistical fact) and for normally distributed data, that correlation is zero.

Across many studies, the sample mean and variance could indeed be correlated even for normally distributed data because of Taylor's law. But then this has nothing to do with the sampling variance. We would capture this correlation by modeling the relationship between the underlying true means and variances in some way.

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de] 
Sent: Wednesday, 11 October, 2017 0:00
To: Viechtbauer Wolfgang (SP); Michael Dewey; r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Dear Wolfgang,

that sounds like a tricky problem. I agree with you, that the best (or the worst) assumption about the distribution we can make is normal distribution. However, in observations the mean and sd covary very often (e.g. D?ring et al. 2015), which is also the motivation to use CV=sd/mean.

Nagakawa et al. (2015) in their appendix assume normal distribution of the means and sds, but also assume covariation of mean and sd (without giving references, but I guess because of above mentioned observation).

I understand your point about the different kinds of correlation between studies and the bivariate sampling distribution. However, would it not be better to still include the correlation in order to account for the often observed covariation of mean and sd (and still being a good approximation independent of the real distribution), and also with the argument if there is no correlation (because of a assumed normal distribution), it will be estimated to zero and thus have no effect?

Looking forward to your reply!

Best regards,
Samuel

References: D?ring, T.F., Knapp, S., Cohen, J.E., 2015. Taylor?s power law and the stability of crop yields. Field Crops Research 183, 294?302. doi:10.1016/j.fcr.2015.08.005

On 10/10/17 11:38, Viechtbauer Wolfgang (SP) wrote:

Dear Samuel,

Eq. 12 in Nakagawa et al (2015) is not correct. For normally distributed data, the mean and variance are independent and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.

For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.

One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.

Best,
Wolfgang

Samuel Knapp

Thu, Oct 12, 2017 12:28 AM #

Hi Wolfgang,

many thanks for your explanation. I think I understand your point now.

However, it would be nice, if this difference in calculation would be 
made clear in the help of escalc. So far, only Nakagawa et al. (2015) is 
cited (even 3 times), which lead me to assume that calculations are done 
as shown in the given reference.

Also, is there any reference with your argument?

Best,

Samuel

On 11/10/17 21:23, Viechtbauer Wolfgang (SP) wrote:

Hi Samuel,

Taylor's law describes an *empirical* phenomenon that means and variances (or transformations thereof) across studies tend to be associated in certain ways. That is, if the true mean is larger, than the true variance also tends to be larger.

That is again something entirely different than the correlation between the sample mean and sample variance (or transformations thereof) in the bivariate sampling distribution for data from some distribution. We can derive this correlation (so this is not an empirical phenomenon but a purely statistical fact) and for normally distributed data, that correlation is zero.

Across many studies, the sample mean and variance could indeed be correlated even for normally distributed data because of Taylor's law. But then this has nothing to do with the sampling variance. We would capture this correlation by modeling the relationship between the underlying true means and variances in some way.

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de]
Sent: Wednesday, 11 October, 2017 0:00
To: Viechtbauer Wolfgang (SP); Michael Dewey; r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Dear Wolfgang,

that sounds like a tricky problem. I agree with you, that the best (or the worst) assumption about the distribution we can make is normal distribution. However, in observations the mean and sd covary very often (e.g. D?ring et al. 2015), which is also the motivation to use CV=sd/mean.

Nagakawa et al. (2015) in their appendix assume normal distribution of the means and sds, but also assume covariation of mean and sd (without giving references, but I guess because of above mentioned observation).

I understand your point about the different kinds of correlation between studies and the bivariate sampling distribution. However, would it not be better to still include the correlation in order to account for the often observed covariation of mean and sd (and still being a good approximation independent of the real distribution), and also with the argument if there is no correlation (because of a assumed normal distribution), it will be estimated to zero and thus have no effect?

Looking forward to your reply!

Best regards,
Samuel

References: D?ring, T.F., Knapp, S., Cohen, J.E., 2015. Taylor?s power law and the stability of crop yields. Field Crops Research 183, 294?302. doi:10.1016/j.fcr.2015.08.005

On 10/10/17 11:38, Viechtbauer Wolfgang (SP) wrote:
Dear Samuel,

Eq. 12 in Nakagawa et al (2015) is not correct. For normally distributed data, the mean and variance are independent and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.

For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.

One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.

Best,
Wolfgang

Viechtbauer Wolfgang (STAT)

Thu, Oct 12, 2017 12:55 AM #

I've added a note to help(escalc) about this.

A reference showing that the sample mean and variance are independent for normally distributed data? You can find some references here:

https://en.wikipedia.org/wiki/Normal_distribution#Properties

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de] 
Sent: Thursday, 12 October, 2017 9:29
To: Viechtbauer Wolfgang (SP); r-sig-meta-analysis at r-project.org
Cc: Marcel van der Heijden
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Hi Wolfgang,

many thanks for your explanation. I think I understand your point now.

However, it would be nice, if this difference in calculation would be 
made clear in the help of escalc. So far, only Nakagawa et al. (2015) is 
cited (even 3 times), which lead me to assume that calculations are done 
as shown in the given reference.

Also, is there any reference with your argument?

Best,

Samuel

On 11/10/17 21:23, Viechtbauer Wolfgang (SP) wrote:

Hi Samuel,

Taylor's law describes an *empirical* phenomenon that means and variances (or transformations thereof) across studies tend to be associated in certain ways. That is, if the true mean is larger, than the true variance also tends to be larger.

That is again something entirely different than the correlation between the sample mean and sample variance (or transformations thereof) in the bivariate sampling distribution for data from some distribution. We can derive this correlation (so this is not an empirical phenomenon but a purely statistical fact) and for normally distributed data, that correlation is zero.

Across many studies, the sample mean and variance could indeed be correlated even for normally distributed data because of Taylor's law. But then this has nothing to do with the sampling variance. We would capture this correlation by modeling the relationship between the underlying true means and variances in some way.

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de]
Sent: Wednesday, 11 October, 2017 0:00
To: Viechtbauer Wolfgang (SP); Michael Dewey; r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Dear Wolfgang,

that sounds like a tricky problem. I agree with you, that the best (or the worst) assumption about the distribution we can make is normal distribution. However, in observations the mean and sd covary very often (e.g. D?ring et al. 2015), which is also the motivation to use CV=sd/mean.

Nagakawa et al. (2015) in their appendix assume normal distribution of the means and sds, but also assume covariation of mean and sd (without giving references, but I guess because of above mentioned observation).

I understand your point about the different kinds of correlation between studies and the bivariate sampling distribution. However, would it not be better to still include the correlation in order to account for the often observed covariation of mean and sd (and still being a good approximation independent of the real distribution), and also with the argument if there is no correlation (because of a assumed normal distribution), it will be estimated to zero and thus have no effect?

Looking forward to your reply!

Best regards,
Samuel

References: D?ring, T.F., Knapp, S., Cohen, J.E., 2015. Taylor?s power law and the stability of crop yields. Field Crops Research 183, 294?302. doi:10.1016/j.fcr.2015.08.005

On 10/10/17 11:38, Viechtbauer Wolfgang (SP) wrote:
Dear Samuel,

Eq. 12 in Nakagawa et al (2015) is not correct. For normally distributed data, the mean and variance are independent and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.

For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.

One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.

Best,
Wolfgang

Samuel Knapp

Thu, Oct 12, 2017 2:10 AM #

Thanks.

I actually meant a reference with a version of Eq. 12 from Nakagawa et 
al. (2015) without the correlation term, and including the argument 
against the use of the correlation term, as done in this paper.

Best,

Samuel

On 12/10/17 09:55, Viechtbauer Wolfgang (SP) wrote:

I've added a note to help(escalc) about this.

A reference showing that the sample mean and variance are independent for normally distributed data? You can find some references here:

https://en.wikipedia.org/wiki/Normal_distribution#Properties

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de]
Sent: Thursday, 12 October, 2017 9:29
To: Viechtbauer Wolfgang (SP); r-sig-meta-analysis at r-project.org
Cc: Marcel van der Heijden
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Hi Wolfgang,

many thanks for your explanation. I think I understand your point now.

However, it would be nice, if this difference in calculation would be
made clear in the help of escalc. So far, only Nakagawa et al. (2015) is
cited (even 3 times), which lead me to assume that calculations are done
as shown in the given reference.

Also, is there any reference with your argument?

Best,

Samuel

On 11/10/17 21:23, Viechtbauer Wolfgang (SP) wrote:

Hi Samuel,

Taylor's law describes an *empirical* phenomenon that means and variances (or transformations thereof) across studies tend to be associated in certain ways. That is, if the true mean is larger, than the true variance also tends to be larger.

That is again something entirely different than the correlation between the sample mean and sample variance (or transformations thereof) in the bivariate sampling distribution for data from some distribution. We can derive this correlation (so this is not an empirical phenomenon but a purely statistical fact) and for normally distributed data, that correlation is zero.

Across many studies, the sample mean and variance could indeed be correlated even for normally distributed data because of Taylor's law. But then this has nothing to do with the sampling variance. We would capture this correlation by modeling the relationship between the underlying true means and variances in some way.

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de]
Sent: Wednesday, 11 October, 2017 0:00
To: Viechtbauer Wolfgang (SP); Michael Dewey; r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Dear Wolfgang,

that sounds like a tricky problem. I agree with you, that the best (or the worst) assumption about the distribution we can make is normal distribution. However, in observations the mean and sd covary very often (e.g. D?ring et al. 2015), which is also the motivation to use CV=sd/mean.

Nagakawa et al. (2015) in their appendix assume normal distribution of the means and sds, but also assume covariation of mean and sd (without giving references, but I guess because of above mentioned observation).

I understand your point about the different kinds of correlation between studies and the bivariate sampling distribution. However, would it not be better to still include the correlation in order to account for the often observed covariation of mean and sd (and still being a good approximation independent of the real distribution), and also with the argument if there is no correlation (because of a assumed normal distribution), it will be estimated to zero and thus have no effect?

Looking forward to your reply!

Best regards,
Samuel

References: D?ring, T.F., Knapp, S., Cohen, J.E., 2015. Taylor?s power law and the stability of crop yields. Field Crops Research 183, 294?302. doi:10.1016/j.fcr.2015.08.005

On 10/10/17 11:38, Viechtbauer Wolfgang (SP) wrote:
Dear Samuel,

Eq. 12 in Nakagawa et al (2015) is not correct. For normally distributed data, the mean and variance are independent and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.

For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.

One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.

Best,
Wolfgang

Viechtbauer Wolfgang (STAT)

Thu, Oct 12, 2017 3:19 AM #

I see. No, there is no reference explicitly for that. But this all follows directly from the proof that the sample mean and sample variance are independent for the normal distribution.

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de] 
Sent: Thursday, 12 October, 2017 11:10
To: Viechtbauer Wolfgang (SP); r-sig-meta-analysis at r-project.org
Cc: Marcel van der Heijden
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Thanks.

I actually meant a reference with a version of Eq. 12 from Nakagawa et 
al. (2015) without the correlation term, and including the argument 
against the use of the correlation term, as done in this paper.

Best,

Samuel

On 12/10/17 09:55, Viechtbauer Wolfgang (SP) wrote: