Sorry about "blindly copy-pasting" the code
"control=list(tau2.min=-min(vi))". I thought the vi was a variable in the
specified dataset. I didn't know it was an external data vector. I mean,
other parameters in rma.uni come from the data. I'm new to R and metafor,
remember, so I think in SAS terms.
I also didn't know how you had generated the SE for heterogeneity. I
implemented "scoring=100" in Proc Mixed, as per your suggestion, and yes, it
generates the same SE as yours. It does make the coverage of the 90%
confidence interval for the heterogeneity variance and for the fixed effects
a little suboptimal (~88%) under some small-sample circumstances, but a big
plus is that only a handful of 5000 simulations failed to converge, whereas
up to 500 of the 5000 simulations failed without scoring=100. So thank you,
the suboptimal coverage is worth it. Those other methods of estimation you
mentioned don't appear to be available in Proc Mixed, but I am happy with
the default REML. I am not skilled enough yet with R to run my simulations
therein (I have to learn "list" coding?), so the only way I can compare Proc
Mixed's confidence limits with those for the same dataset in metafor is to
specify type="PL" in the confint function. There was close agreement with
the CLs for the fixed effects, but big discrpancy for the upper CL for the
heterogeneity.
Using Proc Mixed, with 90% CLs:
CovParm Estimate StdErr Lower Upper
StudyID 0.5427 0.5316 -0.3317 1.4172
Using rma.unit:
tau^2 (estimated amount of residual heterogeneity): 0.5425 (SE = 0.5315)
Then confint(..., level=90,...):
estimate ci.lb ci.ub
tau^2 0.5425 NA 2.3542
I don't know what to make of that. Proc Mixed's upper CL comes from assuming
a normal distribution for the hetero variance. Remember, this way of doing
the CLs gives pretty darn good coverage. If your upper CL is consistency
greater than Proc Mixed's, the coverage will be way too high, won't it? I
guess I'll see what happens when I get 5000 simulations running in metafor.
Re: selmodel() does not work with negative variance for heterogeneity... So
the idea is not to allow negative point estimates of variance in rma.uni. A
substantial proportion of point estimates will want to be (and arguably
should be allowed to be) negative, when real heterogeneity is zero or
slightly positive, when sample sizes and numbers of studies are small, and
especially when there is substantial publication bias. But that's OK, you
have to set them to zero. Again, I'll have to see how well it works, when I
figure out how to do it.
Re: the discrepancy with the SE for the males... I couldn't reproduce that,
and I really did check it at the time. Let's say a cosmic ray flipped a bit!
I ran half a dozen more datasets in SAS and metafor, and the SEs of the
fixed effects all agreed.
Will
-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer at maastrichtuniversity.nl>
Sent: Monday, March 4, 2024 11:07 PM
To: R Special Interest Group for Meta-Analysis
<r-sig-meta-analysis at r-project.org>
Cc: Will Hopkins <willthekiwi at gmail.com>
Subject: RE: [R-meta] Standard error of heterogeneity with rma.uni
Dear Will,
Please see my responses below.
Best,
Wolfgang
<mailto:r-sig-meta-analysis-bounces at r-project.org>
On Behalf Of Will Hopkins via R-sig-meta-analysis
Sent: Sunday, March 3, 2024 22:15
To: 'R Special Interest Group for Meta-Analysis'
<mailto:r-sig-meta-analysis at r- project.org>
Cc: Will Hopkins <mailto:willthekiwi at gmail.com>
Subject: [R-meta] Standard error of heterogeneity with rma.uni
After a year of denial, a colleague and I are starting to use metafor
instead of SAS's Proc Mixed, since finding recently that selection
models to adjust for publication bias are not implementable in SAS
(not by us, anyway, and no-one else seems to be using SAS for
meta-analyses). So I'm starting by importing data into R that I have
simulated and analyzed in SAS, to make sure I am using metafor
correctly. I have found that the standard error (SE) for the
heterogeneity variance provided by rma.uni is considerably smaller
than that provided by SAS, and the SE for a fixed effect can be
smaller in metafor, too. Here are two examples. Along the way, I have
several questions I hope you or someone can answer, Wolf.
In the first example, there were 20 studies (10 female, 10 male), with
sample size ranging from 10 to 30, with small and moderate mean
changes for males and females respectively, small heterogeneity within
each group, and standard errors of measurement such that the mean
changes in most of the male studies were non-significant and most of the
females were significant.
("Small" is not trivial, "small" and "moderate" are not defined by
standardization. In the following examples, small and moderate mean
changes are 1 and 3, and small heterogeneity is and SD of 0.5.)
The code for metafor was:
rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simraw).
(Female01 is a dummy variable.) The solution for fixed effects was the
same in metafor...
estimate se
intrcpt 0.9546 0.3819
mods 2.1568 0.5162
...as in SAS...
Effect Estimate StdErr
Mean 0.9546 0.3819
Female01 2.1568 0.5162
But for the heterogeneity, metafor produced...
tau^2 (estimated amount of residual heterogeneity): 0.2755 (SE =
0.3810)
...while SAS produced...
CovParm Estimate StdErr
StudyID 0.2755 0.5256
As you can see, the point estimates are the same, but the SE in
metafor 0.72 that in SAS. I presume SAS is correct, because I can use
the simulation in SAS to generate thousands of meta-analyses for given
population values of everything, and the coverage of the confidence
intervals for the estimates of heterogeneity is exactly the level of
confidence of the intervals (I use 90%, for various good reasons). Is
this a small-sample issue in the way you produce the SE in rma.uni,
Wolf, one that is not a problem in SAS? Is there a solution in metafor?
SAS uses by default the inverse of the Hessian (i.e., the observed Fisher
information matrix) to calculate the SE of tau^2, while rma() uses the
expected Fisher information matrix. If you use 'scoring=100' (should be
sufficiently large), then SAS will also use the latter and the SEs are the
same. Or one can use:
simraw$id <- 1:nrow(simraw)
res2 <- rma.mv(Ydelta, YdeltaSEsq, random = ~ 1 | id, cvvc=TRUE)
round(sqrt(res2$vvc), 4)
to get the SE based on the observed Fisher information matrix.
Which is to be preferred (the observed or expected information) is an open
issue. Neither is particularly useful though in the present context, because
better CIs for tau^2 can be constructed (e.g., using the Q-profile method,
the generalized Q-statistic method, or the profile likelihood method) that
do not make use of any SE of tau^2.
Will this matter, when I come to use selmodel?
No.
And will selmodel work with negative values of heterogeneity variance?
No.
In the second example, I used simulated data that produced negative
heterogeneity variance in SAS. I am "brave enough to step into risky
territory", to quote you from the documentation, Wolf, because
negative variance for point estimates and confidence limits are
necessary to get realistic unbiased estimates and correct coverage of
random effects, when the uncertainty in heterogeneity is large enough
relative to its true point estimate, so I wanted to make sure rma.uni
produced correct negative variance. When I asked this mailing list
about getting negative variance a year ago, James Pustejovsky provided
the code (which is also in the metafor
documentation):
rma(yi = yi, vi = vi, data=dat, control=list(tau2.min=-min(vi))).
Unfortunately the above code doesn't quite work for me. Here's the
line of code for my data:
rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simrawnv,
control=list(tau2.min=-min(vi)))
I got this error:
Error in min(vi) : invalid 'type' (closure) of argument.
Blindly copy-pasting code without the necessary adjustments to variable
names isn't going to work.
And when I tried this...
rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simrawnv,
control=list(tau2.min=-min(YdeltaSEsq)))
...I got this...
Error: object 'YdeltaSEsq' not found.
You can use control=list(tau2.min=-min(simrawnv$YdeltaSEsq)). The tau2.min
argument does not make use of non-standard evaluation.
But I got it to work with this...
rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simrawnv,
control=list(tau2.min=-99))
...which gave this message...
Warning message:
Value of 'tau2.min' constrained to -min(vi) = -0.1640.
So there's a bug, but it's easy to bypass it meantime.
This is not a bug. If we allow tau^2 to go below -min(vi), then the marginal
variance becomes negative. Neither the all holy SAS(r) nor metafor can
magically make this work.
The point estimates of the fixed effects in metafor and SAS were
identical, but the SE for the intercept (males) in metafor...
estimate se
intrcpt 1.3013 0.0013
mods 1.9245 0.2063
...was 0.67 times than in SAS...
Effect Estimate StdErr
Mean 1.3013 0.001930
Female01 1.9245 0.2063
This seems quite peculiar. Please provide a fully reproducible example
replicating this issue.
The SE for heterogeneity in metafor...
tau^2 (estimated amount of residual heterogeneity): -0.1640 (SE =
0.0735)
... was less than half that in SAS...
CovParm Estimate StdErr
StudyID -0.1640 0.1636
Should I be disappointed that both SAS and metafor set the point
estimate of negative variance to minus the smallest variance of the study
estimates?
No, because it is the only sensible thing that can be done.
It seems a pretty arbitrary and clunky thing to do, "to ensure that
the marginal variances are always non-negative", to quote Wolf in the
documentation.
That 'Wolf' guy seems to know what he is talking about here.
Perhaps someone can explain that. It obviously works as far as
coverage is concerned, in SAS anyway.
These problems presumably go away with large-enough numbers of studies
and/or sample sizes and/or study-estimate SEs relative to effect
magnitudes and/or heterogeneity, but mainly I am working with small
numbers of studies, small sample sizes, often large(-ish)
study-estimate SEs relative to effect magnitudes, and small heterogeneity.
The main issue for me is whether I will be able to use selmodel to
adjust for publication bias, when the study estimates are such that
negative heterogeneity could arise purely from sampling uncertainty or
from publication bias (which results in underestimation of
heterogeneity, in our simulations), and will surely arise in
simulations aimed at estimating bias and coverage.
As mentioned above, selmodel() does not allow for negative tau^2 values.