[R-meta] Meta-analysis of intra class correlation coefficients

Fri, Oct 18, 2024 7:33 AM

Dear Wolfgang,

Many thanks! This is really helpful, indeed! For clarification, related traits refer to traits that are likely to reflect a single shared underlying domain (e.g. assuming a musicality phenotype: singing, pitch, tapping a beat etc). These scores are available in MZ and DZ twins in the same cohort, and multiple analyses using these related phenotypes have been carried out with the same study, and multiple studies have analysed the same cohort, and multiple cohorts are available for analysis.

I adjusted the code as you suggested. I excluded the DZ_factor as it is redundant and just an opposite coding of the MZ_factor.

+   zi,
+   vzi,
+   random = list(~ MZDZ_factor | Study_ID, ~ MZDZ_factor | ESID), struct="UN",
+   data = data,
+   mods = ~ 0 + MZDZ_factor,
+   method = "REML",
+   tdist = TRUE)

This provides now estimates for MZDZ_factor0 and MZDZ_factor1. 
As heritability can only be estimated with raw ICC (twin correlations), I reverted back: 

#i.e. approximating heritability (which is usually carried out with established twin analysis methodology using ML)
#Reverting back to r from Z scores
rMZ <- FisherZInv(model$b[2])
rDZ <- FisherZInv(model$b[1])
h2 <- 2*(rMZ - rDZ)

#or equivalently
h2_v2 <- 2*((exp(2*model$b[2]) - 1) / (exp(2*model$b[2]) + 1) - (exp(2*model$b[1]) - 1) / (exp(2*model$b[1]) + 1))

#with SE:
h2_se <- deltamethod(~ 2*((exp(2*x2) - 1) / (exp(2*x2) + 1) - (exp(2*x1) - 1) / (exp(2*x1) + 1)), coef(model), vcov(model))

#This is now the SE for the h2 reverting back from Z scores, and I slightly adjusted for the order of the coefficients.

There is indeed covariance between zrMZ and zrDZ, which may arise due to the meta-analytic character of the estimates (study effects?). The covariance between rMZ and rDZ is assumed to be absent within a single study, but here, indeed, it is not. We will now look into possibilities to derive more sophisticated twin estimates using specialised software and ML estimations. Ideally, these analyses are carried out on the raw scale.

To simplify the calculation (avoiding the Fisher transformation), do you think we could transform rma estimates and vcov back from Z to the raw scale (still using the delta method)? There is a remarkable stability in the vcov, for z and raw r estimates:

MZDZ_factor0 MZDZ_factor1
MZDZ_factor0   0.01522759   0.03257534
MZDZ_factor1   0.03257534   0.07674611

MZDZ_factor0 MZDZ_factor1
MZDZ_factor0   0.01522877   0.03258687
MZDZ_factor1   0.03258687   0.07689732

Thanks again for your insight and help,
Beate










Beate St Pourcain, PhD
Senior Investigator & Group Leader
Room A207
Max Planck Institute for Psycholinguistics | Wundtlaan 1 | 6525 XD Nijmegen | The Netherlands


@bstpourcain
Tel:?+31 24 3521964
Fax:?+31 24 3521213
ORCID: https://orcid.org/0000-0002-4680-3517
Web: https://www.mpi.nl/departments/language-and-genetics/projects/population-variation-and-human-communication/
Further affiliations with:
MRC Integrative Epidemiology Unit | University of Bristol | UK
Donders Institute for Brain, Cognition and Behaviour | Radboud University | The Netherlands

My working hours may not be your working hours. Please do not feel obligated to reply outside of your normal working schedule.

-----Original Message-----
From: Viechtbauer, Wolfgang (NP) <wolfgang.viechtbauer at maastrichtuniversity.nl> 
Sent: Wednesday, October 16, 2024 10:21 PM
To: St Pourcain, Beate <Beate.StPourcain at mpi.nl>; R Special Interest Group for Meta-Analysis <r-sig-meta-analysis at r-project.org>
Subject: RE: Meta-analysis of intra class correlation coefficients

I don't understand what you mean by 'related traits'. Also, if multiple estimates are extracted from the same cohort, you may need to account for this as well. But leaving this aside, you should fit a single bivariate model to the z(rMZ) and z(rDZ) values. Maybe something like this:

model <- rma.mv(
  zi,
  vzi,
  random = list(~ MZDZ_factor | Study_ID, ~ MZDZ_factor | ESID), struct="UN",
  data = data,
  mods = ~ 0 + MZDZ_factor + DZMZ_factor,
  method = "REML",
  tdist = TRUE)

although not sure whether this will converge.

coef(model) then gives you the two coefficients and vcov(model) the corresponding var-cov matrix thereof. The multivariate delta method is described here:

https://en.wikipedia.org/wiki/Delta_method#Multivariate_delta_method

This is implemented, for example in msm::deltamethod(). Double-check this, but:

deltamethod(~ 2*((exp(2*x1) - 1) / (exp(2*x1) + 1) - (exp(2*x2) - 1) / (exp(2*x2) + 1)), coef(model), vcov(model))

should then give you the SE of 2*(rMZ - rDZ).

Best,
Wolfgang

-----Original Message-----
From: St Pourcain, Beate <Beate.StPourcain at mpi.nl>
Sent: Wednesday, October 16, 2024 10:59
To: Viechtbauer, Wolfgang (NP) 
<wolfgang.viechtbauer at maastrichtuniversity.nl>; R Special Interest 
Group for Meta-Analysis <r-sig-meta-analysis at r-project.org>
Subject: RE: Meta-analysis of intra class correlation coefficients

ATTACHMENT(S) REMOVED: Funnel_DZ_ICC_simple_withN_rZ.png | 
Funnel_MZ_ICC_simple_withN_rZ.png

Dear Wolfgang,
Many thanks for the suggestions!
The r-Z transformation was easily applied, as suggested.

data$zi <- FisherZ(data$r)
data$vzi <- 1/(data$n-3/2)
egger_seMZ <- regtest(MZdata$zi, MZdata$vzi) egger_seDZ <- 
regtest(DZdata$zi, DZdata$vzi)

...and improved the Egger regression
zrMZ_se_z       zrDZ_se_z
-3.968          -5.519

Also, the funnel plots look better (see attached). We subsequently 
carried out the rma.mv as the ICC estimates partially represent 
related traits estimated within the same study (i.e. the estimates are 
related within rMZ and within rDZ), and of course rMZ and rDZ are 
nested within the same study and also sometimes the same cohort (for code see below).

For the subsequent analysis, we back-transformed the Z-score intercept 
from rma.mv with

rMZ <- FisherZInv(model2$b[1])
rDZ <- FisherZInv(model1$b[1])

However, we are unsure as to how to best backtransform model2$se[1] 
and model1$se[1], i.e. the SEs of the rZ intercepts from rma.mv using 
the delta method as you recommended. Could you please advise? This 
will create the input for the subsequent heritability analyses.

Best wishes,
Beate

#################################################################
Study_ID - Cohort ID
ESID - actual Study ID
MZDZ_factor - (MZ=1, DZ=0)
DZMZ_factor - (DZ=1,MZ=0)

For completeness, I add the code below that was adapted from the 
Austerberry study for z values:
model1 <- rma.mv(
  zi,
  vzi,
  random = list(~MZDZ_factor|Study_ID, ~ MZDZ_factor | ESID),
  data = data,
  mods = ~ MZDZ_factor,
  method = "REML",
  tdist = TRUE
)

model2 <- rma.mv(
  zi,
  vzi,
  random = list(~DZMZ_factor|Study_ID, ~ DZMZ_factor | ESID),
  data = data,
  mods = ~ DZMZ_factor,
  method = "REML",
  tdist = TRUE
)

Beate St Pourcain, PhD
Senior Investigator & Group Leader
Room A207
Max Planck Institute for Psycholinguistics | Wundtlaan 1 | 6525 XD 
Nijmegen | The Netherlands

@bstpourcain
Tel: +31 24 3521964
Fax: +31 24 3521213
ORCID: https://orcid.org/0000-0002-4680-3517
Web: 
https://www.mpi.nl/departments/language-and-genetics/projects/populati
on-
variation-and-human-communication/
Further affiliations with:
MRC Integrative Epidemiology Unit | University of Bristol | UK Donders 
Institute for Brain, Cognition and Behaviour | Radboud University | 
The Netherlands

-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<mailto:wolfgang.viechtbauer at maastrichtuniversity.nl>
Sent: Monday, October 14, 2024 3:32 PM
To: St Pourcain, Beate <mailto:Beate.StPourcain at mpi.nl>; R Special 
Interest Group for Meta-Analysis 
<mailto:r-sig-meta-analysis at r-project.org>
Subject: RE: Meta-analysis of intra class correlation coefficients

It is not ideal to meta-analyze raw correlations (or raw ICC values) 
if they are so large. In this case, I think the r-to-z transformation is highly advisable.

What you observe is the fact that the variance of a raw correlation 
coefficient depends on the true correlation. In particular, the 
large-sample estimate of the variance of a raw correlation coefficient 
(or ICC based on pairs) is (1-r^2)^2 / (n-1), where r is the observed 
correlation and n the sample size. Therefore, as r gets close to 1, the variance will get small, as you noted.

It is therefore no surprise that the Egger regression test is highly 
significant. Of course, this then says nothing about potential publication bias.
There is an inherent link between the correlation and its varianace 
(and hence standard error). The same issue also arises with other 
outcome measures / effect sizes (e.g., standardized mean differences).

In the present case, the r-to-z transformation 'solves' this issue, 
since Var[z_r] =~ 1/(n-3) for Pearson correlations and Var[z_ICC] =~ 
1/(n-3/2) for
ICC(1) values.

I would then consider doing a bivariate meta-analysis of the z_ICC_mz 
and z_ICC_dz values. Since they are based on independent samples, 
their sampling errors are uncorrelated, but the random effects of the 
bivariate model then account for potential correlation in the 
underlying true (transformed) ICC values. This is analogous to what 
people do when pooling sensitivity and specificity values in a 
diagnostic test meta-analysis and also directly relates to the bivariate model discussed by van Houwelingen et al. (2002):

https://www.metafor-project.org/doku.php/analyses:vanhouwelingen2002

You can then estimate the heritability from this model by 
back-transforming the pooled estimate for the MZ twins and the pooled 
estimate from the DZ twins and taking twice the difference. The SE and 
hence CI for this can then be obtained via the delta method.

This raises interesting questions about the difference between between
'pooled(x) - pooled(y)' versus 'pooled(x - y)' -- there are papers in 
the literature that discuss this issue (not in the present context) -- 
but the latter option doesn't appear sensible to me here anyway.

Best,
Wolfgang

[R-meta] Meta-analysis of intra class correlation coefficients

Thread (10 messages)