[R-meta] Correction for sample overlap in a meta-analysis of prevalence

Dear Thao,

I do not know these papers, so I cannot comment on what methods they describe and whether those could be implemented using metafor.

Obviously, the degree of dependence between overlapping estimates depends on the degree of overlap. Say there are two diseases (as in your example). Then if we had the raw data, we could count the number of individuals that:

x1:  have only disease 1
x2:  have only disease 2
x12: have both disease 1 and 2
x0:  have neither disease

Let n = x1 + x2 + x12 + x0. Then you have p1 = (x1+x12) / n and p2 = (x2+x12) / n as the two prevalences. One could easily work out the covariance (I am too lazy to do that right now), but in the end this won't help, because computing it requires knowing all the x's, not just p1, p2, and n. And I assume no information is reported on the degree of overlap. One could maybe make some reasonable 'guesstimates' of the overlap, compute the covariances based on those, and then follow this up with a sensitivity analysis.
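
To make the 'guesstimate plus sensitivity analysis' idea concrete, here is a minimal sketch in Python. It assumes the four counts follow a multinomial distribution, in which case the covariance of the two raw proportions works out to (p12 - p1*p2) / n, where p12 = x12 / n is the proportion with both diseases. The function names and all numbers below are hypothetical, chosen only for illustration, and the result applies to untransformed proportions only (not to logit or other transformed prevalences).

```python
import math


def overlap_covariance(p1, p2, p12, n):
    """Covariance of two raw prevalence estimates from the same n individuals.

    Assumes multinomial sampling of the four cells (only 1, only 2, both, neither).
    p12 is the (guessed) proportion of individuals with both diseases.
    """
    lower = max(0.0, p1 + p2 - 1.0)  # smallest overlap compatible with p1, p2
    upper = min(p1, p2)              # largest overlap compatible with p1, p2
    if not (lower <= p12 <= upper):
        raise ValueError("p12 is not compatible with p1 and p2")
    return (p12 - p1 * p2) / n


def overlap_correlation(p1, p2, p12, n):
    """Correlation implied by the covariance above and binomial variances."""
    v1 = p1 * (1.0 - p1) / n
    v2 = p2 * (1.0 - p2) / n
    return overlap_covariance(p1, p2, p12, n) / math.sqrt(v1 * v2)


if __name__ == "__main__":
    # Hypothetical study: 20% have disease 1, 10% have disease 2, n = 500.
    p1, p2, n = 0.20, 0.10, 500
    # Sensitivity analysis over several guessed values of the overlap.
    for p12 in (0.0, 0.02, 0.05, 0.10):
        cov = overlap_covariance(p1, p2, p12, n)
        cor = overlap_correlation(p1, p2, p12, n)
        print(f"p12 = {p12:.2f}  cov = {cov:.6f}  cor = {cor:.3f}")
```

When p12 equals p1*p2 (the two diseases occur independently), the covariance is zero. The guessed covariances could then be placed in the off-diagonal elements of the sampling variance-covariance matrix, refitting the model for each guess to see how much the conclusions change.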

Alternatively, you could use the 'sandwich' method (cluster-robust inference), treating estimates that come from the same sample as one cluster. This has been discussed extensively on this mailing list in the past (not in the context of overlap in such estimates, but the principle is the same).
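
To illustrate the principle only, here is a minimal Python sketch of a sandwich standard error for a simple inverse-variance pooled estimate. It assumes a common-effect working model and the most basic form of the estimator (often called CR0), with no small-sample adjustment, so it is not what a meta-analysis package would report. The function name and the data are hypothetical.

```python
from collections import defaultdict


def pooled_with_cluster_robust_se(y, v, cluster):
    """Inverse-variance pooled estimate with model-based and sandwich (CR0) SEs.

    y: estimates, v: their sampling variances, cluster: cluster label per estimate
    (estimates that share individuals get the same label).
    """
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Sum the weighted residuals within each cluster before squaring, so that
    # dependence among estimates in the same cluster is reflected in the variance.
    score = defaultdict(float)
    for wi, yi, ci in zip(w, y, cluster):
        score[ci] += wi * (yi - mu)

    var_model = 1.0 / sum_w  # assumes all estimates are independent
    var_robust = sum(s * s for s in score.values()) / sum_w ** 2
    return mu, var_model ** 0.5, var_robust ** 0.5


if __name__ == "__main__":
    # Hypothetical data: the first two prevalences come from the same sample.
    y = [0.20, 0.10, 0.30]
    v = [0.01, 0.01, 0.01]
    cluster = ["study1", "study1", "study2"]
    mu, se_model, se_robust = pooled_with_cluster_robust_se(y, v, cluster)
    print(f"estimate = {mu:.3f}  model SE = {se_model:.4f}  robust SE = {se_robust:.4f}")
```

The appeal of this approach is that it needs only the cluster membership, not the x's. With few clusters, the unadjusted estimator sketched here tends to understate the uncertainty, so a small-sample correction is advisable in practice.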

Best,
Wolfgang