
[R-meta] Question on effect sizes

Hi Lukas and list,

first of all, thanks for your time and the suggestions, and apologies for 
not making my point as clear as I should have. I have already contacted 
all authors and the pharmaceutical companies, but the latter are 
reluctant to disclose results for all kinds of reasons and the former 
sometimes no longer have access, since some results are 20 or more years 
old. But let me delineate my problem:

We have a measure of disease severity, which consists of several items 
that are summed into a total score. In most of the studies I have the 
sum score group_mean(x), with x = x1+x2, together with group_sd(x). But 
it may happen that authors provide group_mean(x1) and group_mean(x2) 
with their respective SDs. It is of course easy to get group_mean(x), 
but I'm wondering what the approach would be for sd(x). I thought about 
the "pooled_sd" with

pooled_sd <- sqrt(((n1-1)*sd_x1^2 + (n2-1)*sd_x2^2) / (n1+n2-2))
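
To make the formula concrete, here is a minimal R sketch with made-up numbers (n1, n2, sd_x1 and sd_x2 below are hypothetical and not taken from any study). With equal group sizes, the result is simply the square root of the mean of the two variances:

```r
# Hypothetical summary statistics, for illustration only
n1    <- 100
n2    <- 100
sd_x1 <- 2
sd_x2 <- 3

# Pooled SD: weighted average of the two variances, then square root
pooled_sd <- sqrt(((n1 - 1) * sd_x1^2 + (n2 - 1) * sd_x2^2) / (n1 + n2 - 2))
print(pooled_sd)
```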

but I'm not sure whether that makes sense. So I tried to simulate data 
to get a sense of how reliable the results are (code below), but the 
difference between the "true" SD and the estimated SD is in a few cases 
considerable. So I was wondering whether I am missing something and 
whether this is a valid approach.

I would be delighted if you or someone else could guide me with some 
advice.

All the best,

David


Code:

## Test for simulation of compound SD

# General
set.seed(1234)
# rnorm2() returns n values with exactly the requested sample mean and SD
rnorm2     <- function(n, mean, sd) { mean + sd * scale(rnorm(n)) }
nsim       <- 500
group_size <- c(100, 100)

# Simulate two known datasets
means_x1   <- runif(nsim, 0, 5) # values are between 0 and 5
sd_x1      <- runif(nsim, 0, 4)

means_x2   <- runif(nsim, 0, 5)
sd_x2      <- runif(nsim, 0, 4)

x1         <- matrix(data = NA, nrow = group_size[1], ncol = nsim)
x2         <- matrix(data = NA, nrow = group_size[2], ncol = nsim)
for (i in seq_len(nsim)) {
    x1[, i] <- rnorm2(group_size[1], means_x1[i], sd_x1[i])
    x2[, i] <- rnorm2(group_size[2], means_x2[i], sd_x2[i])
}

# Trivial; for estimation differences see also
# plot(apply(rbind(x1, x2), 2, mean) - apply(rbind(means_x1, means_x2), 2, mean))
mean_sum   <- apply(rbind(x1, x2), 2, mean)
sd_sum     <- apply(rbind(x1, x2), 2, sd) # "ground truth"

sd_estimate <- rep(NA, nsim) # estimate from the SDs passed to rnorm2
for (i in seq_len(nsim)) {
    sd_estimate[i] <- sqrt(((group_size[1] - 1) * sd_x1[i]^2 +
        (group_size[2] - 1) * sd_x2[i]^2) / (group_size[1] + group_size[2] - 2))
}
results <- data.frame(x = sd_sum, y = sd_estimate, z = sd_sum - sd_estimate)
plot(results$z)
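
A further check that might be informative, sketched under the assumption that the sum score is the per-subject sum x1 + x2 (the simulation above stacks the two items with rbind instead). All numbers are hypothetical and the two items are simulated as independent, which real items may well not be. The sketch compares the SD of the summed score with the pooled SD and with the general identity var(x1 + x2) = var(x1) + var(x2) + 2*cov(x1, x2):

```r
# Sketch only: hypothetical values, items simulated as independent
set.seed(1234)
n  <- 100
x1 <- rnorm(n, mean = 2, sd = 2)
x2 <- rnorm(n, mean = 3, sd = 3)

sd_true     <- sd(x1 + x2)  # SD of the per-subject sum score
sd_pooled   <- sqrt(((n - 1) * sd(x1)^2 + (n - 1) * sd(x2)^2) / (2 * n - 2))
# Variance of a sum: var(x1) + var(x2) + 2 * cov(x1, x2)
sd_identity <- sqrt(var(x1) + var(x2) + 2 * cov(x1, x2))

print(c(true = sd_true, pooled = sd_pooled, identity = sd_identity))
```

The identity reproduces sd(x1 + x2) exactly, but it needs the covariance (or correlation) between the items, which may not be available from published summaries.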


On 21.01.22 at 17:25, Lukasz Stasielowicz wrote: