R-experts: In 2002, Hans Van Houwelingen et al. published a tutorial on how to do meta-regression in Statistics in Medicine. They used the classic BCG dataset of Colditz to demonstrate correct methodology and computed the results using PROC MIXED in SAS. In trying to duplicate the results presented in this paper, I have discovered that I can reproduce certain items with lmer but not others. I was hoping that someone might point out how I could correctly program R code to arrive at the correct solution. I have placed the paper and the datasets used below at the following website for easy access to any helpers: http://www.duke.edu/~bi6 I start by loading the data into R. ----- bcg <- read.csv('bcg.csv') bcg.long <- read.csv('bcg-long.csv') bcg$study <- paste(bcg$author, bcg$year) ----- I then perform standard meta-analysis using two different R functions: (1) the metabin function (meta package) to pool odds ratios with standard inverse variance techniques using the "wide" bcg dataset and (2) the lmer function (lme4 package) to perform a multilevel meta- analysis using the "long" dataset. The only reason that the lmer function is possible here is because the outcome is binary (disease .vs. no disease) and I could create a long dataset. A 3rd option is the mima function, but that is not presented here since I am interested in using lmer to extend to situations where there are study level (level 2) and individual level (level 1) predictors, something mima cannot currently handle. ----- library(meta) # Fixed and random effects models, no covariates f0 <- metabin(bcg[,3], bcg[,4], bcg[,5], bcg[,6], sm='OR', method='Inverse') summary(f0) library(lme4) # Fixed effects model, no covariates f1 <- lmer(tb ~ bcg + (1 | study), family=binomial, data=bcg.long) summary(f1) exp(fixef(f1)[2]) # OR exp(f1 at fixef[2] - (1.96*sqrt(vcov(f1)[2,2]))) # lci exp(f1 at fixef[2] + (1.96*sqrt(vcov(f1)[2,2]))) # uci # Random effects model, no covariates. f2 <- lmer(tb ~ bcg + (bcg | study), family=binomial, data=bcg.long) # Random effects, no covariates summary(f2) exp(fixef(f2)[2]) # OR exp(f2 at fixef[2] - (1.96*sqrt(vcov(f2)[2,2]))) # lci exp(f2 at fixef[2] + (1.96*sqrt(vcov(f2)[2,2]))) # uci as.numeric(summary(f2)@REmat[2,3]) # Tau ----- So far these results look good and compare favorably to those obtained by Van Houwelingen. It is also obvious that although metabin and lmer give similar results, computational time is much longer with lmer than meta since it must use the long version of the dataset. The problem comes when two covariates are added to model f2, latitude and year of publication. ----- # Random effects model, 1 covariate f3 <- lmer(tb ~ bcg + latitude + (bcg | study), family=binomial, data=bcg.long) summary(f3) exp(fixef(f3)) # OR # Random effects model, 1 covariate f4 <- lmer(tb ~ bcg + year + (bcg | study), family=binomial, data=bcg.long) summary(f4) exp(fixef(f4)) # OR ----- I assumed, incorrectly, that models f3 and f4 would reproduce the results of Van Houwelingen in sections 5.2.1 and 5.2.2. They do not seem to do so. I would be very appreciative if someone pointed out my error with models f3 and f4 and why they do not seem to be correct. Incidentally, other sources (ex: Egger/Altman book on systematic reviews) report results on this dataset similar to Van Houwelingen, so I think that my code is definitely the problem. Thanks, Brant Inman
Duplicating meta-regression results from PROC MIXED with lmer
8 messages · Rolf Turner, Ken Beath, Brant Inman +1 more
On 6/05/2009, at 3:00 PM, Brant Inman wrote:
R-experts: In 2002, Hans Van Houwelingen et al. published a tutorial on how to do meta-regression in Statistics in Medicine. They used the classic BCG dataset of Colditz to demonstrate correct methodology and computed the results using PROC MIXED in SAS. In trying to duplicate the results presented in this paper, I have discovered that I can reproduce certain items with lmer but not others. I was hoping that someone might point out how I could correctly program R code to arrive at the correct solution.
<snip>
There appears to be a tacit assertion here that the results from PROC
MIXED in The-Package-That-Must-Not-Be-Named are the correct results.
This assertion is very likely to bring the wrath of Doug Bates down
upon your head. An outcome to be devoutly avoided! :-)
cheers,
Rolf Turner
######################################################################
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}
Rolf, I actually don't believe that this is a SAS vs R issue since I have 3 sources that report the same results. I know that STATA, SAS and the mima function from R can all be used to give the correct results. The question is related more to how I can get similar results with lmer. Currently, the code I provided generates VERY different results, especially related to between study variance estimation (tau) and regression coefficient standard errors. Basically, I think I got it all wrong in my coding of the models with covariates (f3 and f4), but I can't understand why. Brant
On May 5, 2009, at 11:43 PM, Rolf Turner wrote:
On 6/05/2009, at 3:00 PM, Brant Inman wrote:
R-experts: In 2002, Hans Van Houwelingen et al. published a tutorial on how to do meta-regression in Statistics in Medicine. They used the classic BCG dataset of Colditz to demonstrate correct methodology and computed the results using PROC MIXED in SAS. In trying to duplicate the results presented in this paper, I have discovered that I can reproduce certain items with lmer but not others. I was hoping that someone might point out how I could correctly program R code to arrive at the correct solution.
<snip> There appears to be a tacit assertion here that the results from PROC MIXED in The-Package-That-Must-Not-Be-Named are the correct results. This assertion is very likely to bring the wrath of Doug Bates down upon your head. An outcome to be devoutly avoided! :-) cheers, Rolf Turner ###################################################################### Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshal www.marshalsoftware.com ######################################################################
On 06/05/2009, at 8:43 PM, Brant Inman wrote:
Rolf, I actually don't believe that this is a SAS vs R issue since I have 3 sources that report the same results. I know that STATA, SAS and the mima function from R can all be used to give the correct results. The question is related more to how I can get similar results with lmer. Currently, the code I provided generates VERY different results, especially related to between study variance estimation (tau) and regression coefficient standard errors. Basically, I think I got it all wrong in my coding of the models with covariates (f3 and f4), but I can't understand why. Brant
This is closer but maybe not quite equivalent. It also avoids using
the huge data file.
Ken
bcg <- read.csv("bcg.csv")
bcg$study <- paste(bcg$author, bcg$year)
# this is a bit messy but requires almost no thought
bcgnames <- bcg[,c(1,2,7:10)]
bcgvacc <- bcg[,3:4]
names(bcgvacc) <- c("tb","n")
bcgnovacc <- bcg[,5:6]
names(bcgnovacc) <- c("tb","n")
newbcg <- cbind(rbind(bcgnames,bcgnames),rbind(bcgvacc,bcgnovacc))
newbcg$bcg <- factor(rep(c("yes","no"),each=13))
newbcg$notb <- newbcg$n-newbcg$tb
newbcg$latitude <- newbcg$latitude-mean(newbcg$latitude)
newbcg$year <- newbcg$year-mean(newbcg$year)
library(meta)
# Fixed and random effects models, no covariates
f0 <- metabin(bcg[,3], bcg[,4], bcg[,5], bcg[,6], sm='OR',
method='Inverse')
summary(f0)
library(lme4)
# Fixed effects model, no covariates
f1 <- lmer(cbind(tb,notb) ~ bcg + (1 | study), family=binomial,
data=newbcg)
summary(f1)
exp(fixef(f1)[2]) # OR
exp(f1 at fixef[2] - (1.96*sqrt(vcov(f1)[2,2]))) # lci
exp(f1 at fixef[2] + (1.96*sqrt(vcov(f1)[2,2]))) # uci
# Random effects model, no covariates.
f2 <- lmer(cbind(tb,notb) ~ bcg + (bcg | study), family=binomial,
data=newbcg) # Random effects, no covariates
summary(f2)
exp(fixef(f2)[2]) # OR
exp(f2 at fixef[2] - (1.96*sqrt(vcov(f2)[2,2]))) # lci
exp(f2 at fixef[2] + (1.96*sqrt(vcov(f2)[2,2]))) # uci
as.numeric(summary(f2)@REmat[2,3]) # Tau
# Random effects model, 1 covariate
f3 <- lmer(cbind(tb,notb) ~ latitude*bcg + (bcg | study),
family=binomial, data=newbcg)
summary(f3)
exp(fixef(f3)) # OR
# Random effects model, 1 covariate
f4 <- lmer(cbind(tb,notb) ~ latitude*bcg + year*bcg + (bcg | study),
family=binomial, data=newbcg)
summary(f4)
exp(fixef(f4)) # OR
Ken,
Thanks very much for the great advice. You essentially suggest the
following:
1) Add interaction term b/t covariates and treatment arm variable
2) Center continuous covariates
3) Use a wide dataset (improves computation time dramatically over a
long dataset)
The estimates obtained with your method would seem at first glance to
compare favorably with the published results. However, on closer
analysis I wonder if the interpretation of your model is the same as
Van Houwelingen's.
YOUR MODEL, WHERE YEAR AND LATITUDE ARE ACTUALLY CENTERED VARIABLES
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ bcg * latitude + bcg * year + (bcg | study)
Data: newbcg
AIC BIC logLik deviance
105.6 116.9 -43.8 87.6
Random effects:
Groups Name Variance Std.Dev. Corr
study (Intercept) 0.60620909 0.778594
bcgyes 0.00045554 0.021343 1.000
Number of obs: 26, groups: study, 13
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.116305 0.221649 -18.571 < 2e-16 ***
bcgyes -0.733330 0.049815 -14.721 < 2e-16 ***
latitude 0.024016 0.021007 1.143 0.25294
year -0.099948 0.027955 -3.575 0.00035 ***
bcgyes:latitude -0.034332 0.003991 -8.601 < 2e-16 ***
bcgyes:year -0.001489 0.005811 -0.256 0.79781
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.?
0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) bcgyes latitd year bcgys:l
bcgyes 0.046
latitude -0.004 0.006
year -0.015 0.014 0.658
bcgyes:lttd 0.002 0.156 0.088 0.059
bcgyes:year 0.020 -0.234 0.049 0.078 0.706
VAN HOUWELINGEN's MODEL RESULTS (I BELIEVE THAT HE DID NOT CENTER THE
VARIABLES):
5.2.4. Regression on latitude and year. When both covariates latitude
and year are put into
the model, the residual between-study variance becomes only 0.002,
corresponding with an
explained variance of 99.3 per cent, only slightly more than by
latitude alone. The regression
coe??cients for the intercept, latitude and year are, respectively,
0.494 (standard error = 0:529),
?0:034 (standard error = 0:004) and ?0:001 (standard error = 0:006).
After taking into account centering, do you believe that your
regression equation is interpreted in the same way as his?
Brant
On May 6, 2009, at 8:02 AM, Ken Beath wrote:
On 06/05/2009, at 8:43 PM, Brant Inman wrote:
Rolf, I actually don't believe that this is a SAS vs R issue since I have 3 sources that report the same results. I know that STATA, SAS and the mima function from R can all be used to give the correct results. The question is related more to how I can get similar results with lmer. Currently, the code I provided generates VERY different results, especially related to between study variance estimation (tau) and regression coefficient standard errors. Basically, I think I got it all wrong in my coding of the models with covariates (f3 and f4), but I can't understand why. Brant
This is closer but maybe not quite equivalent. It also avoids using
the huge data file.
Ken
bcg <- read.csv("bcg.csv")
bcg$study <- paste(bcg$author, bcg$year)
# this is a bit messy but requires almost no thought
bcgnames <- bcg[,c(1,2,7:10)]
bcgvacc <- bcg[,3:4]
names(bcgvacc) <- c("tb","n")
bcgnovacc <- bcg[,5:6]
names(bcgnovacc) <- c("tb","n")
newbcg <- cbind(rbind(bcgnames,bcgnames),rbind(bcgvacc,bcgnovacc))
newbcg$bcg <- factor(rep(c("yes","no"),each=13))
newbcg$notb <- newbcg$n-newbcg$tb
newbcg$latitude <- newbcg$latitude-mean(newbcg$latitude)
newbcg$year <- newbcg$year-mean(newbcg$year)
library(meta)
# Fixed and random effects models, no covariates
f0 <- metabin(bcg[,3], bcg[,4], bcg[,5], bcg[,6], sm='OR',
method='Inverse')
summary(f0)
library(lme4)
# Fixed effects model, no covariates
f1 <- lmer(cbind(tb,notb) ~ bcg + (1 | study), family=binomial,
data=newbcg)
summary(f1)
exp(fixef(f1)[2]) # OR
exp(f1 at fixef[2] - (1.96*sqrt(vcov(f1)[2,2]))) # lci
exp(f1 at fixef[2] + (1.96*sqrt(vcov(f1)[2,2]))) # uci
# Random effects model, no covariates.
f2 <- lmer(cbind(tb,notb) ~ bcg + (bcg | study), family=binomial,
data=newbcg) # Random effects, no covariates
summary(f2)
exp(fixef(f2)[2]) # OR
exp(f2 at fixef[2] - (1.96*sqrt(vcov(f2)[2,2]))) # lci
exp(f2 at fixef[2] + (1.96*sqrt(vcov(f2)[2,2]))) # uci
as.numeric(summary(f2)@REmat[2,3]) # Tau
# Random effects model, 1 covariate
f3 <- lmer(cbind(tb,notb) ~ latitude*bcg + (bcg | study),
family=binomial, data=newbcg)
summary(f3)
exp(fixef(f3)) # OR
# Random effects model, 1 covariate
f4 <- lmer(cbind(tb,notb) ~ latitude*bcg + year*bcg + (bcg | study),
family=binomial, data=newbcg)
summary(f4)
exp(fixef(f4)) # OR
I have not followed the discussion fully, but I have a hunch that may be useful to examine. Suppose you want to meta-analyze k 2x2 table data using the odds ratio as the outcome measure. One approach is to calculate the log odds ratio with the corresponding sampling variances (or to be precise: estimates thereof) for those k tables and apply one of the usual meta-analytic models. Moderators can be added and van Houwelingen, Arends, and Stijnen (2002) nicely demonstrate how those types of models can be fitted with SAS. An alternative approach is not to rely on the normal approximation and instead formulate a logistic regression model. An "equivalent" fixed-effects model then should include *dummy variables for the k tables* and a dummy variable for the group variable (e.g., bcg vaccinated = 1; not bcg vaccinated = 0). A constant term should also be in the model. If you want to add a moderator variable to this model, then one should add *the interaction between the group variable and the moderator variable* to the model (but NOT the main effect for the moderator variable, since then the model would be overparameterized). If one wants a random-effects model, one should STILL add the *dummy variables for the k tables* and the dummy variable for the group variable, plus a random effect along with that group dummy variable (so that we get a random treatment effect). Again, moderators are included via an interaction term between the moderator variable and the group variable. These could be considered the "equivalent" models to the usual meta-anaytic fixed-, random-, and mixed-effects models. As far as I can tell, Brant, there are no dummies in your model for the tables. Give that a try. And then, when you throw in moderators, just include the interaction between the moderator and the bcg variable. I'll be interested in what you find! Best,
Wolfgang Viechtbauer ?Department of Methodology and Statistics ?University of Maastricht, The Netherlands ?http://www.wvbauer.com/
Wolfgang,
Thanks for the helpful comments. To be clear, I am going to
demonstrate the models you suggest, along with results so that anyone
interested can provide insight without having to execute the code on
their PCs.
Quick look at the famous BCG data, in the format that was suggested in
the last post:
author studyid latitude year
startyear study tb n bcg notb
1 Aronson 1 10.5384615 -18.230769 1935
Aronson 1948 4 123 yes 119
2 Ferguson 2 21.5384615 -17.230769 1933
Ferguson 1949 6 306 yes 300
3 Stein 6 10.5384615 -13.230769
1935 Stein 1953 180 1541 yes 1361
4 Rosenthal 3 8.5384615 -6.230769 1941
Rosenthal 1960 3 231 yes 228
5 Rosenthal 10 8.5384615 -5.230769 1937
Rosenthal 1961 17 1716 yes 1699
6 Coetzee 9 -6.4615385 1.769231 1965
Coetzee 1968 29 7499 yes 7470
7 Comstock-Webster 12 -0.4615385 2.769231 1947 Comstock-
Webster 1969 5 2498 yes 2493
8 Frimont-Moller 5 -20.4615385 6.769231 1950 Frimont-
Moller 1973 33 5069 yes 5036
9 Vandeviere 7 -14.4615385 6.769231 1965
Vandeviere 1973 8 2545 yes 2537
10 Comstock School 11 -15.4615385 7.769231 1947 Comstock
School 1974 186 50634 yes 50448
11 Comstock Comm 13 -0.4615385 9.769231 1950
Comstock Comm 1976 27 16913 yes 16886
12 Hart 4 18.5384615 10.769231
1950 Hart 1977 62 13598 yes 13536
13 Madras 8 -20.4615385 13.769231 1968
Madras 1980 505 88391 yes 87886
14 Aronson 1 10.5384615 -18.230769 1935
Aronson 1948 11 139 no 128
15 Ferguson 2 21.5384615 -17.230769 1933
Ferguson 1949 29 303 no 274
16 Stein 6 10.5384615 -13.230769
1935 Stein 1953 372 1451 no 1079
17 Rosenthal 3 8.5384615 -6.230769 1941
Rosenthal 1960 11 220 no 209
18 Rosenthal 10 8.5384615 -5.230769 1937
Rosenthal 1961 65 1665 no 1600
19 Coetzee 9 -6.4615385 1.769231 1965
Coetzee 1968 45 7277 no 7232
20 Comstock-Webster 12 -0.4615385 2.769231 1947 Comstock-
Webster 1969 3 2341 no 2338
21 Frimont-Moller 5 -20.4615385 6.769231 1950 Frimont-
Moller 1973 47 5808 no 5761
22 Vandeviere 7 -14.4615385 6.769231 1965
Vandeviere 1973 10 629 no 619
23 Comstock School 11 -15.4615385 7.769231 1947 Comstock
School 1974 141 27338 no 27197
24 Comstock Comm 13 -0.4615385 9.769231 1950
Comstock Comm 1976 29 17854 no 17825
25 Hart 4 18.5384615 10.769231
1950 Hart 1977 248 12867 no 12619
26 Madras 8 -20.4615385 13.769231 1968
Madras 1980 499 88391 no 87892
---------------------------------------------------------------------------
### Fixed effects model, no moderators
> y <- cbind(newbcg$tb, newbcg$notb)
> lmer(y ~ bcg + (1 | study), family=binomial, data=newbcg)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ bcg + (1 | study)
Data: newbcg
AIC BIC logLik deviance
260.4 264.2 -127.2 254.4
Random effects:
Groups Name Variance Std.Dev.
study (Intercept) 1.9885 1.4101
Number of obs: 26, groups: study, 13
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.20441 0.39456 -10.66 <2e-16 ***
bcgyes -0.47747 0.04123 -11.58 <2e-16 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr)
bcgyes -0.045
This model appears to give correct results for the fixed effects
case. Note that the variable that identifies the 2x2 table, "study",
is assigned a random effect but is not a fixed effect. The group
dummy variable that identifies the treatment arm "bcg" is a fixed
effect.
---------------------------------------------------------------------------
# Alternative method of coding the fixed effects model suggested by
Wolfgang. Result for treatment effect "bcg" is similar.
> lmer(y ~ bcg + study + (1 | study), family=binomial, data=newbcg)
eneralized linear mixed model fit by the Laplace approximation
Formula: y ~ bcg + study + (1 | study)
Data: newbcg
AIC BIC logLik deviance
207.0 225.8 -88.48 177.0
Random effects:
Groups Name Variance Std.Dev.
study (Intercept) 0 0
Number of obs: 26, groups: study, 13
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.60205 0.26674 -9.755 < 2e-16 ***
bcgyes -0.47726 0.04123 -11.575 < 2e-16 ***
studyCoetzee 1968 -2.47540 0.29070 -8.515 < 2e-16 ***
studyComstock Comm 1976 -3.62322 0.29801 -12.158 < 2e-16 ***
studyComstock School 1974 -2.58468 0.27210 -9.499 < 2e-16 ***
studyComstock-Webster 1969 -3.58318 0.44288 -8.091 5.93e-16 ***
studyFerguson 1949 0.01952 0.31832 0.061 0.95109
studyFrimont-Moller 1973 -2.10792 0.28899 -7.294 3.01e-13 ***
studyHart 1977 -1.61561 0.27237 -5.932 3.00e-09 ***
studyMadras 1980 -2.35244 0.26818 -8.772 < 2e-16 ***
studyRosenthal 1960 -0.62094 0.38048 -1.632 0.10268
studyRosenthal 1961 -0.87730 0.28885 -3.037 0.00239 **
studyStein 1953 1.34371 0.27050 4.968 6.78e-07 ***
studyVandeviere 1973 -2.20159 0.35640 -6.177 6.52e-10 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) bcgyes sC1968 sCC197 sCS197 sC-W19 sF1949 sF-M19
sH1977 sM1980 sR1960 sR1961 sS1953
bcgyes -0.057
stdyCtz1968 -0.914 -0.003
stdyCmC1976 -0.892 -0.001 0.819
stdyCmS1974 -0.976 -0.026 0.897 0.875
stdyC-W1969 -0.600 -0.003 0.551 0.537 0.589
stdyFrg1949 -0.835 -0.004 0.766 0.748 0.819 0.503
stdyF-M1973 -0.920 0.002 0.844 0.823 0.902 0.554 0.771
stdyHrt1977 -0.976 -0.005 0.896 0.874 0.957 0.588 0.818 0.901
stdyMdr1980 -0.991 -0.003 0.910 0.887 0.972 0.597 0.831 0.915
0.971
stdyRsn1960 -0.699 -0.004 0.641 0.625 0.685 0.421 0.586 0.645
0.684 0.695
stdyRsn1961 -0.920 -0.004 0.845 0.824 0.902 0.554 0.771 0.850
0.901 0.916 0.645
stdyStn1953 -0.982 -0.011 0.902 0.880 0.964 0.592 0.824 0.907
0.963 0.978 0.689 0.908
stdyVnd1973 -0.744 -0.040 0.685 0.668 0.732 0.449 0.625 0.688
0.731 0.742 0.523 0.689 0.736
---------------------------------------------------------------------------
# Random effects model, no moderators
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ bcg + (bcg | study)
Data: newbcg
AIC BIC logLik deviance
119.0 125.3 -54.52 109.0
Random effects:
Groups Name Variance Std.Dev. Corr
study (Intercept) 2.43772 1.56132
bcgyes 0.33610 0.57974 -0.739
Number of obs: 26, groups: study, 13
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.1234 0.4372 -9.431 < 2e-16 ***
bcgyes -0.7417 0.1818 -4.079 4.52e-05 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr)
bcgyes -0.684
This model appears to give the correct results.
---------------------------------------------------------------------------
# Alternative way of coding random effects model suggested by
Wolfgang. Main difference: the estimate of between study variance
(tau^2) is smaller in this version of the model.
> lmer(y ~ bcg + study + (bcg | study), family=binomial, data=newbcg)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ bcg + study + (bcg | study)
Data: newbcg
AIC BIC logLik deviance
76.68 98.06 -21.34 42.68
Random effects:
Groups Name Variance Std.Dev. Corr
study (Intercept) 0.0000 0.00000
bcgyes 0.2797 0.52887 NaN
Number of obs: 26, groups: study, 13
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.4917 0.2934 -8.493 < 2e-16 ***
bcgyes -0.7639 0.1715 -4.455 8.39e-06 ***
studyCoetzee 1968 -2.5697 0.3257 -7.889 3.04e-15 ***
studyComstock Comm 1976 -3.8602 0.3387 -11.397 < 2e-16 ***
studyComstock School 1974 -2.7602 0.3045 -9.064 < 2e-16 ***
studyComstock-Webster 1969 -3.7729 0.4973 -7.586 3.30e-14 ***
studyFerguson 1949 0.1661 0.3499 0.475 0.6351
studyFrimont-Moller 1973 -2.2833 0.3241 -7.045 1.85e-12 ***
studyHart 1977 -1.4475 0.3001 -4.824 1.41e-06 ***
studyMadras 1980 -2.6741 0.2966 -9.015 < 2e-16 ***
studyRosenthal 1960 -0.5553 0.4162 -1.334 0.1821
studyRosenthal 1961 -0.7421 0.3184 -2.331 0.0198 *
studyStein 1953 1.4243 0.2992 4.760 1.93e-06 ***
studyVandeviere 1973 -1.8349 0.4230 -4.338 1.44e-05 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) bcgyes sC1968 sCC197 sCS197 sC-W19 sF1949 sF-M19
sH1977 sM1980 sR1960 sR1961 sS1953
bcgyes -0.095
stdyCtz1968 -0.897 0.052
stdyCmC1976 -0.862 0.036 0.775
stdyCmS1974 -0.962 0.078 0.864 0.830
stdyC-W1969 -0.581 -0.041 0.524 0.505 0.560
stdyFrg1949 -0.834 0.036 0.750 0.721 0.803 0.489
stdyF-M1973 -0.902 0.053 0.811 0.779 0.868 0.527 0.754
stdyHrt1977 -0.977 0.085 0.877 0.842 0.940 0.568 0.815 0.882
stdyMdr1980 -0.989 0.090 0.887 0.852 0.951 0.574 0.825 0.892
0.966
stdyRsn1960 -0.699 0.000 0.629 0.605 0.673 0.412 0.586 0.632
0.683 0.691
stdyRsn1961 -0.919 0.062 0.826 0.793 0.884 0.536 0.768 0.830
0.898 0.909 0.644
stdyStn1953 -0.980 0.086 0.880 0.845 0.943 0.570 0.818 0.884
0.957 0.969 0.685 0.901
stdyVnd1973 -0.684 -0.033 0.617 0.594 0.660 0.407 0.575 0.620
0.669 0.677 0.485 0.631 0.671
---------------------------------------------------------------------------
# Random effects model, adding two centered continuous moderator
variables
> lmer(y ~ bcg*latitude + bcg*year +(bcg | study), family=binomial,
data=newbcg)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ bcg * latitude + bcg * year + (bcg | study)
Data: newbcg
AIC BIC logLik deviance
105.6 116.9 -43.8 87.6
Random effects:
Groups Name Variance Std.Dev. Corr
study (Intercept) 0.60620909 0.778594
bcgyes 0.00045554 0.021343 1.000
Number of obs: 26, groups: study, 13
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.116305 0.221649 -18.571 < 2e-16 ***
bcgyes -0.733330 0.049815 -14.721 < 2e-16 ***
latitude 0.024016 0.021007 1.143 0.25294
year -0.099948 0.027955 -3.575 0.00035 ***
bcgyes:latitude -0.034332 0.003991 -8.601 < 2e-16 ***
bcgyes:year -0.001489 0.005811 -0.256 0.79781
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) bcgyes latitd year bcgys:l
bcgyes 0.046
latitude -0.004 0.006
year -0.015 0.014 0.658
bcgyes:lttd 0.002 0.156 0.088 0.059
bcgyes:year 0.020 -0.234 0.049 0.078 0.706
##
The Van Houwelingen results using non-centered moderators for a
similar model are:
tau^2 = 0.002
intercept: 0.494 (se = 0.592)
latitude: -0.034 (se = 0.004)
year: -0.001 (se = 0.006)
Note that there are significant differences in these estimates and
those obtained in my model.
---------------------------------------------------------------------------
# Wolfgang's suggestion for the fixed effects model with moderators.
> lmer(y ~ bcg*latitude + bcg*year + study + (bcg | study),
family=binomial, data=newbcg)
Error in mer_finalize(ans) : Downdated X'X is not positive definite, 1.
In addition: Warning message:
In model.matrix.default(mt, mf, contrasts) :
variable 'study' converted to a factor
Obviously a quite different result.
---------------------------------------------------------------------------
I hope these details help the smart people out there figure out what I
have done wrong!
Brant
On May 7, 2009, at 10:39 AM, Viechtbauer Wolfgang (STAT) wrote:
I have not followed the discussion fully, but I have a hunch that may be useful to examine. Suppose you want to meta-analyze k 2x2 table data using the odds ratio as the outcome measure. One approach is to calculate the log odds ratio with the corresponding sampling variances (or to be precise: estimates thereof) for those k tables and apply one of the usual meta-analytic models. Moderators can be added and van Houwelingen, Arends, and Stijnen (2002) nicely demonstrate how those types of models can be fitted with SAS. An alternative approach is not to rely on the normal approximation and instead formulate a logistic regression model. An "equivalent" fixed-effects model then should include *dummy variables for the k tables* and a dummy variable for the group variable (e.g., bcg vaccinated = 1; not bcg vaccinated = 0). A constant term should also be in the model. If you want to add a moderator variable to this model, then one should add *the interaction between the group variable and the moderator variable* to the model (but NOT the main effect for the moderator variable, since then the model would be overparameterized). If one wants a random-effects model, one should STILL add the *dummy variables for the k tables* and the dummy variable for the group variable, plus a random effect along with that group dummy variable (so that we get a random treatment effect). Again, moderators are included via an interaction term between the moderator variable and the group variable. These could be considered the "equivalent" models to the usual meta- anaytic fixed-, random-, and mixed-effects models. As far as I can tell, Brant, there are no dummies in your model for the tables. Give that a try. And then, when you throw in moderators, just include the interaction between the moderator and the bcg variable. I'll be interested in what you find! Best, -- Wolfgang Viechtbauer Department of Methodology and Statistics University of Maastricht, The Netherlands http://www.wvbauer.com/
I now had a chance to run these models myself.
Here are the data that I used (study is a factor and the first column is the # of TB cases and the second the # of non-TB cases).
group ablati syear study
[1,] 4 119 1 44 48 1
[2,] 6 300 1 55 49 2
[3,] 3 228 1 42 60 3
[4,] 62 13536 1 52 77 4
[5,] 33 5036 1 13 73 5
[6,] 180 1361 1 44 53 6
[7,] 8 2537 1 19 73 7
[8,] 505 87886 1 13 80 8
[9,] 29 7470 1 27 68 9
[10,] 17 1699 1 42 61 10
[11,] 186 50448 1 18 74 11
[12,] 5 2493 1 33 69 12
[13,] 27 16886 1 33 76 13
[14,] 11 128 0 44 48 1
[15,] 29 274 0 55 49 2
[16,] 11 209 0 42 60 3
[17,] 248 12619 0 52 77 4
[18,] 47 5761 0 13 73 5
[19,] 372 1079 0 44 53 6
[20,] 10 619 0 19 73 7
[21,] 499 87892 0 13 80 8
[22,] 45 7232 0 27 68 9
[23,] 65 1600 0 42 61 10
[24,] 141 27197 0 18 74 11
[25,] 3 2338 0 33 69 12
[26,] 29 17825 0 33 76 13
If you want results that are essentially those from the paper, you should use:
lmer(y ~ group + ablati:group + syear:group + study + (group - 1| study), family=binomial)
The estimate of tau^2 is then essentially zero and:
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.426859 0.269031 -9.021 < 2e-16 ***
group 0.548365 0.493131 1.112 0.26613
study2 0.131173 0.320457 0.409 0.68230
...
study13 -3.716989 0.301366 -12.334 < 2e-16 ***
group:ablati -0.034185 0.003948 -8.659 < 2e-16 ***
group:syear -0.001770 0.005753 -0.308 0.75838
This matches up quite nicely with the results from the "usual" approach.
An alternative would be to add study as a random instead of a fixed effect. Then the main effects for absolute latitude and study year can also be added to the model:
lmer(y ~ group + ablati + ablati:group + syear + syear:group + (group | study), family=binomial)
Then the estimate of tau^2 is 0.00045553, still pretty much zero, and:
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.699707 2.386607 0.712 0.47635
group 0.514024 0.497863 1.032 0.30186
ablati 0.024016 0.021007 1.143 0.25294
syear -0.099948 0.027955 -3.575 0.00035 ***
group:ablati -0.034332 0.003991 -8.601 < 2e-16 ***
group:syear -0.001488 0.005811 -0.256 0.79787
which is still quite close.
Best,
Wolfgang Viechtbauer ?Department of Methodology and Statistics ?University of Maastricht, The Netherlands ?http://www.wvbauer.com/