[R-meta] Predictive interval in MA with less than 10 studies

5 messages · Tobias Saueressig, Wolfgang Viechtbauer, Philippe Tadger

#
Dear R-sig-MA community

According to the Cochrane Handbook, the prediction interval (PI) should 
not be trusted when there are fewer than 10 studies, because its 
calculation relies on the assumption of normality. Is there a formal way 
to check, case by case, whether fewer than 10 studies is unsafe for 
calculating a PI? I can understand this limitation when the PI is 
calculated through the method of moments (or exact calculations like 
Riley 2001), but when the PI comes from a model fitted by ML/REML (or 
other iterative methods with an identifiable likelihood) or by Bayesian 
methods, it seems to me that this concern should not apply. I would like 
to find confirmation or refutation of this idea.


Thank you in advance for your time and shared wisdom.
2 days later
#
And:

Wang, C. C., & Lee, W. C. (2019). A simple method to estimate prediction intervals and predictive distributions: Summarizing meta-analyses beyond means and confidence intervals. Research Synthesis Methods, 10(2), 255-266. https://doi.org/10.1002/jrsm.1345

But to add to this:

The issues of k and normality are a bit conflated here. If the distribution of true effects is non-normal, then k could be a million and a PI calculated under the assumption of normality is still garbage.

But if the distribution is normal (or approximately so), then k is relevant for getting an accurate estimate of tau^2 (which is what mostly determines the width of the PI, besides the SE of mu-hat).
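
To illustrate the point about k, here is a minimal base-R simulation sketch. It assumes normally distributed true effects, equal sampling variances across studies, and the DerSimonian-Laird (method-of-moments) estimator of tau^2. The function names (dl_tau2, sim_tau2) and all parameter values are hypothetical and chosen only for illustration.

```r
# DerSimonian-Laird (method-of-moments) estimate of tau^2
dl_tau2 <- function(yi, vi) {
  wi    <- 1 / vi
  mu_fe <- sum(wi * yi) / sum(wi)            # equal-effects pooled estimate
  Q     <- sum(wi * (yi - mu_fe)^2)          # Cochran's Q
  C     <- sum(wi) - sum(wi^2) / sum(wi)
  max(0, (Q - (length(yi) - 1)) / C)         # truncated at zero
}

# Simulate tau^2 estimates for meta-analyses with k studies.
# Assumed (made-up) values: true mean 0.3, tau^2 = 0.04, sampling variance 0.05.
sim_tau2 <- function(k, tau2 = 0.04, vi = 0.05, nsim = 2000) {
  replicate(nsim, {
    theta <- rnorm(k, mean = 0.3, sd = sqrt(tau2))   # normal true effects
    yi    <- rnorm(k, mean = theta, sd = sqrt(vi))   # observed effects
    dl_tau2(yi, rep(vi, k))
  })
}

set.seed(1)
est_small <- sim_tau2(k = 5)
est_large <- sim_tau2(k = 50)

# Spread of the tau^2 estimates around the true value of 0.04
c(sd_k5 = sd(est_small), sd_k50 = sd(est_large))
```

Even though normality holds by construction here, the estimates of tau^2 scatter much more widely with k = 5 than with k = 50, and that imprecision carries over to the width of the PI.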

As for the method of estimation: The same concerns apply whether one uses the method of moments, ML/REML, or Bayesian methods. Not sure why you think those concerns do not apply for the latter two types.

In general: I would consider all commonly-used methods for calculating a PI (including Bayesian methods) as rough approximations, regardless of k (well, I might have a lower bound on k, but that more generally applies to the use of RE models). They don't have nominal coverage properties, but are still useful to translate the estimate of tau^2 (which is difficult to interpret) into a range of 'plausible' effects one might see across many studies (including future ones).

Best,
Wolfgang
#
Thank you Wolfgang, Tobias,

for the useful links and the nice food for further thought.

Wolfgang, thank you for reminding me of a really important point: the 
number of studies is no guarantee that the random effects (RE) follow 
a normal distribution.

With respect to MLE and Bayesian methods: I agree that by themselves 
such methods are no better prepared to deal with violations of the 
normality assumption, BUT it's quite common to find extensions in 
packages like bamdit and metaplus where the RE can follow a 
t-distribution or a mixture of normals, which are more robust to 
non-Gaussian distributions.

I have an additional question about the PI in packages like 
meta and metafor. Do they differ in the way it is calculated? I read 
the following about the PI calculation in meta: "implements equation (12) 
of Higgins et al., (2009) which proposed a t distribution with K-2 
degrees of freedom where K corresponds to the number of studies in the 
meta-analysis". Is it the same for metafor? Wouldn't this PI approach 
(t-dist) help to cope with slight departures of the RE from normality?

Thanks in advance for the help and guidance
On 30/09/2021 08:49, Viechtbauer, Wolfgang (SP) wrote:
#
Using a t-distribution with df=k-2 for constructing the PI was suggested as a heuristic way to account for the fact that two parameters are estimated in the RE model (mu and tau^2), not as a way of allowing for a non-normal distribution of the random effect (in the latter case, you have to fit the model itself differently).
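
As a rough base-R sketch of what the df=k-2 heuristic changes: the interval below has the form mu-hat +/- crit * sqrt(tau^2 + SE(mu-hat)^2), and only the critical value differs between the normal-quantile and the t(df=k-2) version. The summary values are hypothetical, made up for illustration, and not taken from any real meta-analysis.

```r
# Hypothetical summary values from a fitted RE model (illustration only)
k      <- 8      # number of studies
mu_hat <- 0.30   # estimated average effect
se_mu  <- 0.08   # standard error of mu_hat
tau2   <- 0.04   # estimated between-study variance

# Half-width of the PI for a given critical value
pi_halfwidth <- function(crit) crit * sqrt(tau2 + se_mu^2)

crit_z <- qnorm(0.975)            # standard normal quantile
crit_t <- qt(0.975, df = k - 2)   # t quantile with df = k - 2

pi_z <- mu_hat + c(-1, 1) * pi_halfwidth(crit_z)
pi_t <- mu_hat + c(-1, 1) * pi_halfwidth(crit_t)

rbind(normal = pi_z, t_df_k_minus_2 = pi_t)
```

The t-based interval is wider, and the difference shrinks as k grows. Both intervals are symmetric around mu-hat and rest on the same normal model for the random effects; the larger critical value only reflects the uncertainty from estimating mu and tau^2.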

I don't know exactly what meta does, but metafor does this:

https://www.metafor-project.org/doku.php/faq#for_random-effects_models_fitt

There is also a (currently undocumented) argument for predict() called 'pi.type'. If you set pi.type="riley", then you get the PI with t(df=k-2), or really df=k-p-1, where p is the number of fixed effects estimated (since one can also compute PIs for meta-regression models and I just implemented the logical extension of df=k-2) and really really df=k-p-q, where q denotes the number of variance/correlation components estimated, since one can also compute PIs for more complex models with more than just a single tau^2.

But in all of these cases, the model itself assumes normally distributed random effects.
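
A minimal usage sketch of the two PI variants in metafor, assuming the package is installed and that pi.type="riley" behaves as described above (the argument is undocumented, so this may differ across versions). The dat.bcg example dataset is used only for illustration.

```r
# Runs only if metafor is available; dat.bcg is an example dataset
# that becomes available once metafor is loaded.
if (requireNamespace("metafor", quietly = TRUE)) {
  library(metafor)

  # Log risk ratios and sampling variances for the BCG vaccine trials
  dat <- escalc(measure = "RR", ai = tpos, bi = tneg,
                ci = cpos, di = cneg, data = dat.bcg)

  # Random-effects model (REML is the default estimator)
  res <- rma(yi, vi, data = dat)

  pred_default <- predict(res)                     # default PI
  pred_riley   <- predict(res, pi.type = "riley")  # PI based on t(df = k-2)

  print(pred_default)
  print(pred_riley)
}
```

With k = 13 studies the t-based interval should come out somewhat wider than the default one, while the point estimate and its confidence interval stay the same.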

Best,
Wolfgang