package metafor: error when setting 'col' and 'at' for a forest plot

6 messages · Viechtbauer Wolfgang (STAT), Brian Z Ring, John Sorkin +2 more

#
Dear Brian,

At the moment, the various forest() functions are not meant to accept 'col' as an argument. While it is indeed possible to specify a 'col' argument, it will be passed on via the ... argument to further functions within forest() and this is where things can go awry. 

To give a more technical explanation, let's look at your example. First, note that the highest 'at' value you specified is log(10), which is approximately 2.30. However, the highest upper CI bound in your dataset is 2.9. Therefore, the forest() function wants to draw a right-pointed arrow for that study using the polygon() function. The code within forest() looks something like this:

polygon(<lots of other stuff>, col="black", ...)

When you specify a 'col' argument, it is passed on to polygon() via ... -- but col="black" is already hard-coded in forest() and so you get the "formal argument "col" matched by multiple actual arguments" error.
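The mechanism can be reproduced without metafor. The sketch below uses two made-up functions, draw_arrow() and inner(), as stand-ins for forest() and polygon(); neither is part of any package.

```r
# Hypothetical stand-ins: inner() plays the role of polygon(),
# draw_arrow() plays the role of the arrow-drawing step in forest().
inner <- function(x, col, ...) col

draw_arrow <- function(...) {
  # 'col' is hard-coded here, and anything in ... is forwarded as well
  inner(1, col = "black", ...)
}

draw_arrow()                      # works: only one 'col' reaches inner()

# Supplying 'col' through ... means inner() receives it twice
msg <- tryCatch(draw_arrow(col = "red"),
                error = function(e) conditionMessage(e))
msg
```

The captured message is the same "matched by multiple actual arguments" error described above.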

The issue of changing colors in the forest() functions has come up before:

https://stat.ethz.ch/pipermail/r-help/2011-August/287788.html

In fact, the problems associated with allowing the user to specify colors in high-level plotting functions, such as forest(), were nicely discussed in a recent article by Paul Murrell in the R Journal:

http://journal.r-project.org/archive/2012-2/RJournal_2012-2_Murrell.pdf

I may add functionality to the forest() functions to handle a 'col' argument for specifying row colors, but I will have to evaluate carefully how well that will work. Also, I would then need to change the current 'col' argument in the forest.rma() function -- breaking backwards compatibility :/

Back to your example. You can actually get it to work if you use:

forest(forest$OR, ci.lb=forest$Low, ci.ub=forest$High,  at=log(c(.05, .25, 1, 20)), slab=forest$SNP, atransf=exp)

in which case the values of 'at' encompass the CI bounds of all studies. Then the forest() function does not use polygon(), but segments() and here 'col' is not hard-coded. One (probably) unintended consequence of using 'col' this way is that the x axis label is also given a different color. 
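A quick way to check in advance whether any interval would be cut off is to compare the range of 'at' with the CI bounds. This is only a sketch: covers_all() is a made-up helper, the lower tick values in the first call are assumed (only the maximum of log(10) is known from the thread), and it assumes the axis limits are taken from range(at).

```r
# CI bounds copied from the data in this thread (the NA row is omitted)
ci_ub <- c(1.21, 2.11, 2.9, 1.12, 1.15, 0.92, 1.14, 1.15, 1.55, 1.25, 0.85)
ci_lb <- c(0.87, 1.21, 1.42, 0.84, 0.79, 0.5, 0.76, 0.75, 0.79, 0.76, 0.41)

# Hypothetical helper: TRUE when every interval lies inside range(at)
covers_all <- function(at, lb, ub) {
  min(at) <= min(lb) && max(at) >= max(ub)
}

# Largest tick at log(10), about 2.30: the bound of 2.9 falls outside
covers_all(log(c(.05, .25, 1, 10)), ci_lb, ci_ub)

# Largest tick at log(20), about 3.00: all bounds fall inside
covers_all(log(c(.05, .25, 1, 20)), ci_lb, ci_ub)
```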

A few other notes:

1) Calling your data frame 'forest' is probably not a good idea -- since that is the name of a function.

2) Your data appear to be (raw -- not log transformed) odds ratios and the corresponding CI bounds. So, you should not be using the atransf argument -- or you would be transforming your odds ratios to exp(ORs). Probably closer to what you want is:

dat <- read.table(header=TRUE, text="
SNP   Group   High   Low   OR
rs1137101   A   1.21   0.87   1.03
rs1137101   B   2.11   1.21   1.6
rs1137101   C   2.9   1.42   2.03
rs1042522   A   1.12   0.84   0.97
rs1042522   B   1.15   0.79   0.95
rs1042522   C   0.92   0.5   0.7
rs1625895   A   1.14   0.76   0.93
rs1625895   B   1.15   0.75   0.93
rs1625895   C   NA   NA   NA
ACEI/D   A   1.55   0.79   1.11
ACEI/D   B   1.25   0.76   0.98
ACEI/D   C   0.85   0.41   0.59
")

forest(dat$OR, ci.lb=dat$Low, ci.ub=dat$High, col=c(1,2,3), at=c(.25, 1, 2, 3), slab=dat$SNP, refline=1, xlab=" ", xlim=c(-1.5,5.5))
mtext("Odds Ratio", side=1, line=2.5, adj=0.45)

where I manually add the x axis label, so that it is in black. Again, though, using 'col' this way is technically not intended (even though it does work here).
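If a log-scaled axis is preferred, one possible alternative (a sketch under assumptions, not necessarily what was intended here) is to take logs of the odds ratios and bounds first and let atransf=exp back-transform the tick labels. Only the first four rows of the data are used, the tick positions are made up, and the plotting call is guarded so the rest runs without metafor installed.

```r
# First four rows of the data from this thread
dat <- data.frame(
  SNP  = c("rs1137101", "rs1137101", "rs1137101", "rs1042522"),
  OR   = c(1.03, 1.6, 2.03, 0.97),
  Low  = c(0.87, 1.21, 1.42, 0.84),
  High = c(1.21, 2.11, 2.9, 1.12)
)

# Move estimates and bounds to the log scale
log_or <- log(dat$OR)
log_lb <- log(dat$Low)
log_ub <- log(dat$High)

# Tick positions on the log scale (values chosen for illustration);
# exp() of these gives the labels that atransf=exp would print
at_log <- log(c(0.25, 0.5, 1, 2, 4))
exp(at_log)

# On the log scale the reference line for "no effect" is 0, not 1
if (requireNamespace("metafor", quietly = TRUE)) {
  metafor::forest(log_or, ci.lb = log_lb, ci.ub = log_ub,
                  at = at_log, atransf = exp, slab = dat$SNP, refline = 0)
}
```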

I hope this helps!

Best,
Wolfgang

--   
Wolfgang Viechtbauer, Ph.D., Statistician   
Department of Psychiatry and Psychology   
School for Mental Health and Neuroscience   
Faculty of Health, Medicine, and Life Sciences   
Maastricht University, P.O. Box 616 (VIJV1)   
6200 MD Maastricht, The Netherlands   
+31 (43) 388-4170 | http://www.wvbauer.com
#
Brilliant, this works very well. Thank you for explaining this so clearly,
all your points are well taken.

Sincerely,
Brian

Brian Z Ring PhD
Professor, Director
Institute of Personalized and Genomic Medicine
College of Life Science
Huazhong University of Science and Technology
Wuhan, China

-----Original Message-----
From: Viechtbauer Wolfgang (STAT)
[mailto:wolfgang.viechtbauer at maastrichtuniversity.nl] 
Sent: Monday, January 07, 2013 6:16 PM
To: Brian Z Ring; 'David Winsemius'; r-help at r-project.org
Subject: RE: [R] package metafor: error when setting 'col' and 'at' for a
forest plot

#
Windows 7
R 2.12.1
I am trying to write a function (see sample code below) that will take the output of a t-test and produce results suitable for a table.
I have two questions:
(1) You will note that the name of the outcome variable, which is "value" in the input, is replaced by the string "outcome by class" in the data frame produced by my function. How can I make my function put the name of the variable being analyzed, i.e. "value", in the output data frame?
(2) How can I pass the entire input data to the function so that my call to the function does not have to be in its current ugly form, i.e.
Table1(data$value,data$sex)
and instead could just be
Table1(value,sex,data=data)
 
 
x <- data.frame(value=rnorm(20)  ,sex=rep("Male",  20))
y <- data.frame(value=rnorm(20,4),sex=rep("Female",20))
data <- rbind(x,y)
temp <- t.test(value~sex,data=data)
temp
v<-data.frame(dep=temp$data.name,
           female=temp$estimate[1],male=temp$estimate[2],
           p=temp$p.value,
           CILow=temp$conf.int[1],CIHigh=temp$conf.int[2])
row.names(v) <- NULL
 
 
 
Table1 <- function(outcome, class) {
  temp <- t.test(outcome ~ class)
  mydf <- data.frame(dep=temp$data.name,
                     female=temp$estimate[1], male=temp$estimate[2],
                     p=temp$p.value,
                     CILow=temp$conf.int[1], CIHigh=temp$conf.int[2])
  row.names(mydf) <- NULL
  mydf
}
Table1(data$value,data$sex)
 
 
 
Thank you,
John
 
 
 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
#
John:

R2.12 ??? Time to update.

1. ?t.test and note the last entry in the Value section.

2.  ?with
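Both hints can be tried on simulated data like John's. This is a sketch: the seed and the names tt, resp and p_inside are arbitrary, and all.vars() is just one of several ways to pull the response name out of a formula.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
data <- rbind(data.frame(value = rnorm(20),    sex = rep("Male",   20)),
              data.frame(value = rnorm(20, 4), sex = rep("Female", 20)))

# Hint 1: data.name is built from the variable names in the call, so
# passing a formula plus 'data' keeps the real names
tt <- t.test(value ~ sex, data = data)
tt$data.name

# The response name alone can be taken from the formula
resp <- all.vars(value ~ sex)[1]
resp

# Hint 2: with() evaluates an expression inside the data frame, so the
# columns can be referred to without the data$ prefix
p_inside <- with(data, t.test(value ~ sex))$p.value
```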

-- Bert
On Mon, Jan 7, 2013 at 6:44 AM, John Sorkin <JSorkin at grecc.umaryland.edu> wrote:
#
I've just realized that you're swapping female and male in the creation 
of the results data frame.
It should be

Table2 <- function(formula, data) {
     dname <- rownames(attr(terms(formula), "factors"))[1]
     temp <- t.test(formula, data)
     mydf <- data.frame(dep=dname,
              female=temp$estimate[2], male=temp$estimate[1],
              p=temp$p.value,
              CILow=temp$conf.int[1],CIHigh=temp$conf.int[2])
     row.names(mydf) <- NULL
     mydf
}
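Which group comes first in temp$estimate depends on the order of the factor levels, and that in turn depends on how the factor was created (for example, the default of stringsAsFactors changed in R 4.0.0). A variant that picks the estimates by name rather than by position avoids the question. This is only a sketch: Table3 is a made-up name, and it assumes the groups are labelled exactly "Female" and "Male".

```r
Table3 <- function(formula, data) {
  dname <- all.vars(formula)[1]          # name of the response variable
  temp  <- t.test(formula, data = data)
  est   <- temp$estimate
  # Names look like "mean in group Female"; keep only the group label
  names(est) <- sub("^mean in group ", "", names(est))
  data.frame(dep    = dname,
             female = est[["Female"]],   # selected by name, not position
             male   = est[["Male"]],
             p      = temp$p.value,
             CILow  = temp$conf.int[1],
             CIHigh = temp$conf.int[2])
}

set.seed(1)  # arbitrary seed
d <- rbind(data.frame(value = rnorm(20),    sex = rep("Male",   20)),
           data.frame(value = rnorm(20, 4), sex = rep("Female", 20)))
res <- Table3(value ~ sex, data = d)
res
```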


Rui Barradas

On 07-01-2013 15:39, Rui Barradas wrote: