[R-meta] Question on effect sizes

Sat, Jan 22, 2022 1:50 PM

Hi Lukas and list,

first of all, thanks for your time and the suggestions, and apologies for 
not making my point as clear as I should have. I have already contacted 
all authors and the pharmaceutical companies, but the latter are rather 
reluctant to disclose results for all kinds of reasons, and the former 
sometimes no longer have access, since some results are 20 or more years 
old. But let me delineate my problem:

We have a measure of disease severity, which consists of several items 
that are summed. In most of the studies I have a sum score 
group_mean(x) - so x1+x2 - with group_sd(x). But it may happen that 
authors provide group_mean(x1) and group_mean(x2) with their respective 
SDs. It is of course easy to get group_mean(x), but I'm wondering what 
the approach would be for sd(x). I thought about the "pooled_sd" with

pooled_sd <- sqrt(((n1-1)*sd_x1^2 + (n2-1)*sd_x2^2) / (n1+n2-2))

but I'm not sure whether that makes sense. So I tried to simulate data 
to get a hunch of how reliable results are (code below), but the mean 
difference between "true" sd and estimated sd is in a few cases 
considerable. So I was wondering if I am missing something/if this is a 
valid approach.
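
To make the formula above concrete, here is a minimal sketch that wraps it in a helper and evaluates it for made-up values. The function name pooled_sd_fun and all numbers are assumptions for illustration only and are not taken from any of the studies.

```r
# Pooled SD of two sets of values, as in the formula above.
# All inputs used below are hypothetical illustration values.
pooled_sd_fun <- function(n1, n2, sd_x1, sd_x2) {
  sqrt(((n1 - 1) * sd_x1^2 + (n2 - 1) * sd_x2^2) / (n1 + n2 - 2))
}

# Hypothetical example: both items reported for 100 patients each
sd_pooled_example <- pooled_sd_fun(n1 = 100, n2 = 100, sd_x1 = 2, sd_x2 = 3)
print(sd_pooled_example)
```

With equal n, this is the root mean square of the two SDs, so the result always lies between sd_x1 and sd_x2.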

I would be delighted if you or someone else could guide me with some 
advice.

All the best,

David


Code:

## Test for simulation of compound SD

# General
set.seed(1234)
# Draw n values rescaled to have exactly the given sample mean and SD
rnorm2      <- function(n, mean, sd) { mean + sd * scale(rnorm(n)) }
nsim        <- 500
group_size  <- c(100, 100)

# Simulate two known datasets
means_x1    <- runif(nsim, 0, 5) # values are between 0 and 5
sd_x1       <- runif(nsim, 0, 4)

means_x2    <- runif(nsim, 0, 5)
sd_x2       <- runif(nsim, 0, 4)

x1          <- matrix(data = NA, nrow = group_size[1], ncol = nsim)
x2          <- matrix(data = NA, nrow = group_size[2], ncol = nsim)
for (i in 1:nsim) {
    x1[, i] <- rnorm2(group_size[1], means_x1[i], sd_x1[i])
    x2[, i] <- rnorm2(group_size[2], means_x2[i], sd_x2[i])
}

# Trivial; for estimation differences see also
# plot(apply(rbind(x1, x2), 2, mean) - apply(rbind(means_x1, means_x2), 2, mean))
mean_sum    <- apply(rbind(x1, x2), 2, mean)
sd_sum      <- apply(rbind(x1, x2), 2, sd) # "ground truth"

sd_estimate <- rep(NA, nsim) # according to rnorm2
for (i in 1:nsim) {
    sd_estimate[i] <- sqrt(((group_size[1] - 1) * sd_x1[i]^2 +
        (group_size[2] - 1) * sd_x2[i]^2) / (group_size[1] + group_size[2] - 2))
}
results <- data.frame(x = sd_sum, y = sd_estimate, z = sd_sum - sd_estimate)
plot(results$z)


On 21.01.22 at 17:25, Lukasz Stasielowicz wrote:

Hi,

a couple of ideas that may be obvious to you but the provided 
description is rather short, so I don't know whether you have thought 
about the following points:

1. Did you try to contact the authors of the studies? Maybe they will 
be willing to provide the missing statistics or the data set. The 
willingness varies obviously between researchers (and research areas) 
but it is often worth the effort.

One could contact the corresponding author and ask for the statistics 
or the data set (offering the choice can increase the success rate). 
If you don't receive an answer within several days (e.g. one week), 
then you can try to contact the other authors. Recently I used this 
strategy for two different meta-analyses and approximately 80% - 90% 
of the research teams wrote back. Obviously, not all of them could 
provide answers or data (hard drive failure etc.) but approximately 
30% - 50% of the authors provided additional information.

2. If you have already explored the first strategy and the relevant 
information is still missing, then one could try to reconstruct it. It 
is something that you were referring to but the description is rather 
short, so I cannot infer what is meant by pooled SD etc.
One could try to rearrange the formulas to compute the missing 
information manually, but if there are two unknowns (e.g. SD and M for 
one group are missing) then it is not possible.
Nevertheless, one could try to make some guesstimates (e.g. are the 
SDs for both groups in other studies similar? If yes, then one could 
make a corresponding guesstimate for the missing information) in order to 
impute the data.
One could even make several guesstimates and run these different 
scenarios to test the robustness of the findings. Another sensitivity 
analysis would be to compare meta-analytic results based on studies 
without missing information with the scenarios based on guesstimates.
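
The scenario testing described in point 2 could be sketched as follows for the sum-score case. This is only an illustration under stated assumptions: it uses the standard identity sd(x1 + x2) = sqrt(sd1^2 + sd2^2 + 2*r*sd1*sd2) for two items measured on the same participants, and both the item SDs and the candidate correlations r are made-up values (r is usually not reported and would have to be guessed or borrowed from other studies).

```r
# Hypothetical item SDs from one study (illustration values only)
sd_x1 <- 2
sd_x2 <- 3

# Assumed correlations between the two items; these are guesstimates
r_scenarios <- c(0, 0.25, 0.5, 0.75)

# SD of the sum score x = x1 + x2 under each assumed correlation
sd_sum_scenarios <- sqrt(sd_x1^2 + sd_x2^2 + 2 * r_scenarios * sd_x1 * sd_x2)

scenarios <- data.frame(r = r_scenarios, sd_sum = sd_sum_scenarios)
print(scenarios)
```

Each row could then be used in turn as the imputed SD to check whether the meta-analytic conclusions change across scenarios.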

3. It is probably obvious to you but dropping the studies with missing 
information is also a possibility. However, it could bias the results 
(if the dropped studies differ significantly from the included studies).


Hope it helps!

Best wishes,

PD Dr. David Pedrosa
Leitender Oberarzt der Klinik für Neurologie,
Leiter der Sektion Bewegungsstörungen, Universitätsklinikum Gießen und 
Marburg

Tel.: (+49) 6421-58 65299 Fax: (+49) 6421-58 67055

Adresse: Baldingerstr., 35043 Marburg
Web: https://www.ukgm.de/ugm_2/deu/umr_neu/index.html
