This doesn't look correct to me. The step of centering the predictor is
essential for the code that I posted to work. If you want to be able to
predict at arbitrary points, then you'll need to take into account the
covariance between the random effects, so calculating
Var(u_0j + u_1j * (a) + v_ij) = Var(u_0j) + a^2 * Var(u_1j) + 2 * a *
Cov(u_0j, u_1j) + Var(v_ij)
James
On Thu, Mar 21, 2024 at 6:04?PM Zac Robinson <zacrobinson2015 at gmail.com>
wrote:
James,
Thank you very much for your reply - it was extremely helpful!
I have taken your code, modified it a bit, and attempted to create a
function that works alongside emmprep / emmeans. From my tests it seems to
get the same results as your manual calculations but let me know if I'm
missing something - only thing I wondered is if "rho" and "phi" need to be
incorporated directly, as well as the covariances you mentioned
Thanks again,
Zac
# load packages and data
library(metafor)
library(tidyverse)
library(emmeans)
data("dat.konstantopoulos2011")
df <-
dat.konstantopoulos2011 %>%
mutate(
factor=sample(c("A","B"),nrow(.),replace=TRUE)
)
# create function
get.pi.emmeans.rma.mv <- function(model, mm){
tmp <- summary(mm)
tmp <- tmp[ , ]
test.stat <- stats::qt(mm at misc$level+(1-mm at misc$level)/2, tmp$df[[1]])
n_g <- colnames(model$mf.g)
n_g2 <- ifelse(is.null(n_g),0,n_g[n_g!="intrcpt"&n_g!="outer"])
l_g <- ifelse(is.null(n_g),0,mm at linfct[,n_g2])
v_g <- ifelse(is.null(n_g),0,diag(model$G)%>%as.data.frame()%>%.[n_g2,])
n_h <- colnames(model$mf.h)
n_h2 <- ifelse(is.null(n_h),0,n_h[n_h!="intrcpt"&n_h!="outer"])
l_h <- ifelse(is.null(n_h),0,mm at linfct[,n_h2])
v_h <- ifelse(is.null(n_h),0,diag(model$H)%>%as.data.frame()%>%.[n_h2,])
V_b <- tmp$SE^2
V_ri <- sum(model$sigma2 + model$tau2[1] + model$gamma2[1])
V_g <- l_g*v_g
V_h <- l_h*v_h
PI <- test.stat * sqrt(V_b + V_ri + V_g + V_h)
tmp$lower.PI <- tmp$emmean - PI
tmp$upper.PI <- tmp$emmean + PI
return(tmp)
}
# RANDOM INTERCEPT
--------------------------------------------------------
# fit random intercept model
m1 <- rma.mv(
yi ~ year + factor, V = vi,
random = list(~1|district, ~1|interaction(district,school)),
data = df, method = "REML"
)
# Prediction interval at year = 1989 (for Taylor) and factor = A
predict(m1, newmods = cbind(1989, 0), level = 95)
# Using function
emmprep(m1,
at=list(year=1989,
factor="A"))%>%
emmeans(~year+factor,
weights = "prop")%>%
get.pi.emmeans.rma.mv(model=m1) #same
# RANDOM SLOPE (x1)
-------------------------------------------------------
# fit single random slope model
m2 <- rma.mv(
yi ~ year + factor, vi,
random = list(~year|district, ~1|interaction(district,school)),
data = df, method = "REML", struct = "GEN",control =
list(optimizer="optim")
)
# Calculate prediction interval for year = 1989, factor = A
vec <- c(1, 1989, 0)
pred <- sum(m2$beta * vec)
Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
V_RE <- sum(m2$sigma2) + m2$tau2[1] + m2$tau2[2]*1989
S <- sqrt(Vb + V_RE)
z_crit <- qnorm((1 + 0.95) / 2)
pred + c(-1, 1) * z_crit * S
# Using function
emmprep(m2,
at=list(year=1989,
factor="A"))%>%
emmeans(~year+factor,
weights = "prop")%>%
get.pi.emmeans.rma.mv(model=m2) # same
On Thu, Mar 21, 2024 at 1:00?PM James Pustejovsky <jepusto at gmail.com>
wrote:
It looks like you are trying to generate a prediction interval for a new
effect size given a specific set of predictors, based on a model with
random slopes terms. A model like look like the following:
Y_ij = b0 + b1 X_ij + u_0j + u_1j X_ij + v_ij + e_ij,
where (u_0j, u_1j) are the study-level random effects with
variance-covariance matrix T, v_ij is a within-study random effect with
variance omega^2, and e_ij is the sampling error with known variance V_ij.
Your goal might be to predict a new effect size for a study with X_ij = a,
or the distribution of
delta_ij = b0 + b1 * (a) + u_0j + u_1j * (a) + v_ij
The tricky bit with these models (and probably the reason that PIs are
not currently implemented in metafor) is that the study-level random
effects terms can be correlated and you need to take the variances and
covariances into account to get a prediction interval.
My understanding is that metafor calculates prediction intervals by
taking the predicted value given the fixed effects terms, plus or minus a
critical value (call it z_p) times a quantity S, where S^2 is the sum of
the variance of the predicted value and the variance of the random effects
at the predicted value. For the example above:
PI = [bhat0 + bhat1 * (a)] +/- z_p * S, where S = sqrt[ Var(bhat0 +
bhat1 * (a)) + Var(u_0j + u_1j * (a) + v_ij) ]
You can calculate the first term directly from the beta estimates that
come out of metafor. You can calculate the first component of S by using
the vcov matrix that comes out of metafor. And you can calculate the third
term from the estimated variance components.
To make the calculations simpler, it would be convenient to re-center
the predictor at the value for which you're trying to get a prediction
interval. In other words, if you set X'_ij = X_ij - a, then you're
predicting at the value X'_ij = 0, and the prediction interval simplifies to
PI = bhat0 +/- z_p * sqrt[ Var(bhat0) + Var(u_0j + v_ij)]
Example code using Konstantopoulos2011 below.
James
# load packages and data
library(metafor)
library(tidyverse)
data("dat.konstantopoulos2011")
df <-
dat.konstantopoulos2011 %>%
mutate(
factor=sample(c("A","B"),nrow(.),replace=TRUE)
)
# RANDOM INTERCEPT
--------------------------------------------------------
# fit random intercept model
m1 <- rma.mv(
yi ~ year + factor, V = vi,
random = list(~1|study, ~1|district, ~1|school),
data = df, method = "REML"
)
# Prediction interval at year = 1989 (for Taylor) and factor = A
predict(m1, newmods = cbind(1989, 0), level = 80)
# Same thing, manual calcualtion
vec <- c(1, 1989, 0)
pred <- sum(m1$beta * vec)
Vb <- as.numeric(t(vec) %*% vcov(m1) %*% vec)
V_RE <- sum(m1$sigma2)
S <- sqrt(Vb + V_RE)
z_crit <- qnorm((1 + 0.8) / 2)
pred + c(-1, 1) * z_crit * S
# RANDOM SLOPE (x1)
-------------------------------------------------------
# fit single random slope model
# Re-center the year predictor
df$year_c <- df$year - 1989
m2 <- rma.mv(
yi ~ year_c + factor, vi,
random = list(~ year_c | study, ~1|district, ~1|school),
data = df, method = "REML", struct = "GEN"
)
# Does not return prediction interval
predict(m2, newmods = cbind(0, 0), level = 80)
# Calculate prediction interval for year_c = 0, factor = A
vec <- c(1, 0, 0)
pred <- sum(m2$beta * vec)
Vb <- as.numeric(t(vec) %*% vcov(m2) %*% vec)
V_RE <- sum(m2$sigma2) + m2$tau2[1] # Using only the random intercept at
the study level
S <- sqrt(Vb + V_RE)
z_crit <- qnorm((1 + 0.8) / 2)
pred + c(-1, 1) * z_crit * S
On Thu, Mar 21, 2024 at 11:19?AM Zac Robinson via R-sig-meta-analysis <
r-sig-meta-analysis at r-project.org> wrote:
Wolfgang,
Apologies for the incorrect method to reply - hopefully got it right
this
time. If I'm understanding things correctly - the coding can
(potentially)
end up giving me what I'm after, but I'm just specifying the sigma / sd
for
the prediction interval incorrectly (i.e., the variances can't just be
summed).
Thus, I guess my follow up question is - how would you recommend going
about this calculation? The closest thing I can find in the literature
with
a similar variance calculation is in the context of R2 from Johnson et
al.,
where they calculate the "mean random effects variance" which seems
like it
could be useful here:
https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.12225
Let me know what you think and thank you for your time and expertise,
Zac
On Thu, Mar 14, 2024 at 9:23?AM Viechtbauer, Wolfgang (NP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Dear Zac,
Please respond to the mailing list, not just to me personally.
Just summing up all the variances isn't correct - the correct
of the prediction interval for such a random-effects structure is
unfortunately more complex.
Best,
Wolfgang
-----Original Message-----
From: Zac Robinson <zacrobinson2015 at gmail.com>
Sent: Monday, March 11, 2024 16:07
To: Viechtbauer, Wolfgang (NP) <
wolfgang.viechtbauer at maastrichtuniversity.nl>
Subject: Re: [R-meta] Prediction Intervals for estimates of random
model with predict.rma()
Hey Wolfgang,
Thank you very much for the reply. I actually have played around
bit more and think I've found a solution. I may not be specifying
argument perfectly for a random slope model - but I think in concept
Looks like emmeans / predict() have cross compatibility that allows
prediction intervals when sigma is specified
Let me know what you think,
Zac
# SETUP
-------------------------------------------------------------------
#load packages and data
library(metafor)
library(tidyverse)
library(emmeans)
data("dat.konstantopoulos2011")
df=dat.konstantopoulos2011%>%
as.data.frame()%>%
mutate(factor=sample(c("A","B"),56,replace=TRUE))
# RANDOM SLOPE (x1)
-------------------------------------------------------
#fit single random slope model
m2= rma.mv(yi ~ year+factor, vi,
random = list(~year|study, ~1|district, ~1|school),
data = df, method = "REML",struct = "GEN",
control = list(optimizer="bobyqa"))
#extract estimates with emmmprep + emmeans + predict
est.ci=emmprep(m2,
at=list(year=1976:2000))%>%
emmeans(~year,
weights = "prop",
sigma=sqrt(sum(
m2$sigma2,
m2$tau2,
m2$gamma2
)))
pi= est.ci%>%
predict(interval="prediction")%>%
select(lower.PL,upper.PL)
preds=cbind(est.ci,pi)
preds # prediction intervals are now added
ggplot(data = preds,
aes(x=year,
y=emmean))+theme_classic()+
geom_point(data=df,
aes(y=yi,
size=1/vi))+
geom_ribbon(aes(ymin=lower.PL,
ymax=upper.PL),
linetype="dashed",
alpha=0,
color="black")+
geom_ribbon(aes(ymin=asymp.LCL,
ymax=asymp.UCL),
alpha=0.25)+
geom_line()
On Mon, Mar 11, 2024 at 2:06?AM Viechtbauer, Wolfgang (NP)
<mailto:wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Dear Zac,
predict() currently does not provide predicition intervals for
models with a GEN random effects structure. Not sure when I will get
implementing that.
Best,
Wolfgang
-----Original Message-----
From: R-sig-meta-analysis <mailto:
r-sig-meta-analysis-bounces at r-project.org>
Of Zac Robinson via R-sig-meta-analysis
Sent: Sunday, March 10, 2024 16:26
To: mailto:r-sig-meta-analysis at r-project.org
Cc: Zac Robinson <mailto:zacrobinson2015 at gmail.com>
Subject: [R-meta] Prediction Intervals for estimates of random
model with predict.rma()
Dear All,
I am working on extracting the adjusted effects from a
multilevel meta-regression. One issue I've run into is getting the
appropriate prediction intervals on my estimates when the model
includes random slopes (specifically when struct = "GEN" is
what I gather from the help files, this comes down to
specifying the `tau2.levels` and `gamma2.levels` arguments within
predict.rma() - however, I am unsure how to get this to work.
In the code below I've tried to demonstrate my issue and explore
solutions I'm aware of through the orchaRd package). My end goal
adjusting for both continuous and categorical moderators so
solution stays within the emmprep() / emmeans() framework (which
proportional weights and setting non-focal continuous variables
easy), but I realize this may not be possible.
Thank you very much in advance!
Zac
# SETUP
-------------------------------------------------------------------
#load packages and data
library(metafor)
library(tidyverse)
library(orchaRd)
library(emmeans)
data("dat.konstantopoulos2011")
df=dat.konstantopoulos2011%>%
as.data.frame()%>%
mutate(factor=sample(c("A","B"),nrow(df),replace=TRUE))
# RANDOM INTERCEPT
--------------------------------------------------------
#fit random intercept model
m1=http://rma.mv(yi ~ year+factor, vi,
random = list(~1|study, ~1|district, ~1|school),
data = df, method = "REML")
#extract estimates with predict.rma
predict(m1,
newmods =cbind(rep(c(1976:2000),2),
rep(c(0,1),each=25)),
addx = TRUE) # works fine
#extract estimates with emmprep
emmprep(m1,
at=list(year=1976:2000))%>%
emmeans(~year,
weights = "prop")%>%
pred_interval_esmeans(model=m1,
mm=.,
mod="year") # works fine
#extract estimates with orchaRd
mod_results(m1,
mod = "year",
group = "study") # works fine
# RANDOM SLOPE (x1)
-------------------------------------------------------
#fit single random slope model
m2=http://rma.mv(yi ~ year+factor, vi,
random = list(~year|study, ~1|district, ~1|school),
data = df, method = "REML",struct = "GEN",
control = list(optimizer="bobyqa"))
#extract estimates with predict.rma
predict(m2, newmods =cbind(rep(c(1976:2000),2),
rep(c(0,1),each=25)),
addx = TRUE) # no prediction intervals
predict(m2, newmods =cbind(rep(c(1976:2000),2),
rep(c(0,1),each=25)),
addx = TRUE,
tau2.levels = levels(year), # try adding tau2.levels
) # still no prediction intervals
#extract estimates with emmprep
emmprep(m2,
at=list(year=1976:2000))%>%
emmeans(~year,
weights = "prop")%>%
pred_interval_esmeans(model=m2,
mm=.,
mod="year") # works but gives an error
#extract estimates with orchaRd
mod_results(m2,
mod = "year",
group = "study") # works but gives an error
# RANDOM SLOPE (x2)
-------------------------------------------------------
#fit double random slope model
m3=http://rma.mv(yi ~ year+factor, vi,
random = list(~year|study, ~year|district, ~1|school),
data = df, method = "REML",struct = c("GEN","GEN"),
control = list(optimizer="bobyqa"))
#extract estimates with predict.rma
predict(m3, newmods =cbind(rep(c(1976:2000),2),
rep(c(0,1),each=25)),
addx = TRUE) # no prediction intervals
predict(m3, newmods =cbind(rep(c(1976:2000),2),
rep(c(0,1),each=25)),
addx = TRUE,
tau2.levels = levels(year),# try adding tau2.levels
gamma2.levels = levels(year)# try adding gamma2.levels
) # still no prediction intervals
#extract estimates with emmprep
emmprep(m3,
at=list(year=1976:2000))%>%
emmeans(~year,
weights = "prop")%>%
pred_interval_esmeans(model=m3,
mm=.,
mod="year") # works but gives an error
#extract estimates with orchaRd
mod_results(m3,
mod = "year",
group = "study") # works but gives an error
[[alternative HTML version deleted]]