bootstrapped cox regression (rms package)
11 messages · Eric Claus, Mark Lamias, Yihui Xie +1 more
Quite a few people have had this problem, but since I'm unable to reproduce it, I'm not exactly sure how to fix it either. A few references that might be helpful to you:

http://stackoverflow.com/q/12448507/559676
https://github.com/yihui/knitr/issues/413

It is very likely to be a pure LaTeX problem. Letting MiKTeX install the missing LaTeX packages on the fly might solve the problem.

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Thu, Nov 29, 2012 at 10:53 AM, Mark Lamias <mlamias at yahoo.com> wrote:
R Users,
I just upgraded R from R-2.15.0 to R-2.15.2 and installed the latest versions of LyX and MiKTeX on Windows 7 Ultimate (64-bit). Prior to the upgrade, I was using LyX with knitr to generate a document with no problems. However, after the upgrade, and using the same LyX document, I receive the following error when I attempt to compile the document:
\end{verbatim}
The control sequence at the end of the top line
of your error message was never \def'ed. If you have
misspelled it (e.g., `\hobx'), type `I' and the correct
spelling (e.g., `I\hbox'). Otherwise just continue,
and I'll forget about whatever was undefined.
I have determined that the error is caused when printing the anova results from the anova statement in my R source code, but can't seem to resolve the issue. Here is an example code chunk that creates the error:
<<NonCP1, fig.width=6, fig.height=4, out.width='.8\\linewidth' ,par=FALSE>>=
#Read in data
y=c( 67, 73, 83, 89, 65, 91, 87, 86, 155, 127, 147, 212, 108, 100, 90, 153, 140, 142, 121, 150, 33, 8, 46, 54 )
temp=as.factor(c(rep(seq(360, 380, 10), each=4), rep(seq(380, 360, -10), each=4)))
coat=as.factor(rep(seq(1, 4), 6))
replicate=as.factor(rep(seq(1, 6), each=4))
#Obtain Factorial/Incorrect Model
o=lm(y~temp*coat)
ano=anova(o)
ano
@
Removing the ano=anova(o) or ano lines in the code chunk allows the document to compile with no problem. Does anyone else have this problem or did I do something wrong when I migrated to the newer versions?
Thanks in advance for any help!
Sincerely yours,
Mark J. Lamias
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
That is very helpful! Just to continue debugging, can you save the two versions of the tex files produced from LyX with different versions of R and do a diff on them? It sounds like something has changed from R 2.15.0 to 2.15.2.

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Thu, Nov 29, 2012 at 1:26 PM, Mark Lamias <mlamias at yahoo.com> wrote:
Thanks, Yihui!

Luckily I kept R-2.15.0 and left it untouched (so I can continue to use that for now). If it helps any, I was able to go back into LyX and change the path to point to R-2.15.0, and I also changed the Windows PATH environment variable to point to the old version. After doing this, LyX worked fine with no problem on the code below. Changing the paths back to the new version, R-2.15.2, generates the error below.

If anyone else has any idea how to resolve this, either through R or a LyX/LaTeX fix, I'd be all ears.

Thanks again for your response, Yihui!

Sincerely yours,
Mark J. Lamias
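Yihui's suggestion to diff the two .tex files can also be followed from within R. The following is only a minimal sketch: the two temporary files and their contents are made-up stand-ins for the files LyX exports under R 2.15.0 and R 2.15.2, and tex_diff is a hypothetical helper, not part of knitr or LyX.

```r
# Made-up stand-ins for the .tex files produced under the two R versions
old_tex <- tempfile(fileext = ".tex")
new_tex <- tempfile(fileext = ".tex")
writeLines(c("\\begin{verbatim}", "Df Sum Sq", "\\end{verbatim}"), old_tex)
writeLines(c("\\begin{verbatim}", "Df Sum Sq ---", "\\end{verbatim}"), new_tex)

# Hypothetical helper: report lines that occur in only one of the two files
tex_diff <- function(file_a, file_b) {
  a <- readLines(file_a, warn = FALSE)
  b <- readLines(file_b, warn = FALSE)
  list(only_in_a = setdiff(a, b), only_in_b = setdiff(b, a))
}

d <- tex_diff(old_tex, new_tex)
print(d)
```

Because setdiff() ignores line order and repeated lines, this only narrows down which lines changed; a dedicated diff tool gives the full picture.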
Eric, the output you showed for anova(out) is not correct. anova.rms does not produce such output. Please give us the correct script that obtained those results and let us know if you are overriding the anova command somehow.

To your point, make sure that SPSS does not use the bootstrap to obtain a new point estimate of beta but rather uses the original Cox beta coefficients in the test.

Frank

Eric Claus wrote:
Hi,
I am trying to convert a colleague from using SPSS to R, but am having
trouble generating a result that is similar enough to a bootstrapped cox
regression analysis that was run in SPSS. I tried unsuccessfully with
bootcens, but have had some success with the bootcov function in the rms
package, which at least generates confidence intervals similar to what is
observed in SPSS. However, the p-values associated with each predictor in
the model are not really close in many instances.
Here is the code I am using:
formula=Surv(months, recidivate) ~ fac1 + fac2 + fac3 + fac4 + fac5 + fac6 +
  fac7 + fac8
fit=cph(formula, data=temp, x=T, y=T)
validate(fit, method="boot", B=9999, bw=F, type="residual", sls=0.05,
aics=0,force=NULL, estimates=TRUE, pr=FALSE)
out=bootcov(fit, B=9999, pr=F, coef.reps=T, loglik=F)
for (i in 1:8) {
print(quantile(out$boot.Coef[,i], c(.025, .975)))
}
anova(out)
variable  low CI       high CI      p-value
fac1      -8.919692    20.800878    .5917
fac2      -8.683579    3.091100     .6381
fac3      -1.848428    2.193492     .9312
fac4      -0.17575426  0.08333277   .8246
fac5      -3.1488578   0.5166171    .2946
fac6      -0.03621405  0.07241772   .5600
fac7      -0.62847922  0.08566296   .3433
fac8      -0.01553286  0.20909384   .5756
The results from SPSS I am trying to match (or come close to matching) are
the following:
variable  low CI   high CI  p-value
fac1      -8.474   20.020   .456
fac2      -8.206   3.093    .524
fac3      -1.829   2.087    .900
fac4      -.173    .083     .749
fac5      -2.945   .450     .143
fac6      -.035    .070     .306
fac7      -.626    .092     .189
fac8      -.017    .203     .247
Sorry if this is a really basic question. I have searched for several
hours for an explanation, but cannot find anything that explains why the
p-values would be different despite similar confidence intervals.
Thanks in advance,
Eric
----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/bootstrapped-cox-regression-rms-package-tp4651306p4651344.html Sent from the R help mailing list archive at Nabble.com.
Hi, Yihui,

Attached is an HTML diff report of the two files. The left pane contains the R-2.15.0 file.

Thanks.
--Mark
From: Yihui Xie <xie at yihui.name>
To: Mark Lamias <mlamias at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Thursday, November 29, 2012 2:43 PM
Subject: Re: [R] knitr error with Lyx
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DifferencesReport.htm
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20121129/b9f0c312/attachment.pl>
Hi Frank,

Below is the actual output from the anova(out) command. I had copied in the p-values from the previous output of anova(out) and the confidence intervals from print(quantile(out$boot.Coef[,i], c(.025, .975))) to illustrate that the confidence intervals were similar to SPSS while the p-values were not.

Actual output from anova.rms(out):

Wald Statistics    Response: Surv(months, recidivate)

Factor  Chi-Square  d.f.  P
fac1    0.27        1     0.6055
fac2    0.20        1     0.6514
fac3    0.01        1     0.9338
fac4    0.05        1     0.8311
fac5    1.06        1     0.3036
fac6    0.33        1     0.5647
fac7    0.81        1     0.3670
fac8    0.30        1     0.5832
TOTAL   1.48        8     0.9930

Regarding your second question, it looks like SPSS is using the original estimate of the Cox beta coefficients in the test (i.e., a new point estimate is not generated for the statistical test).

Thanks again,
Eric
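As a rough check on the table above, the P column can be approximately recovered from the printed Chi-Square and d.f. values using the upper tail of a chi-square distribution. This sketch assumes the statistics are ordinary Wald chi-squares; the column names in the data frame are made up for illustration. Because the printed chi-squares are rounded to two decimals, the recomputed p-values agree with the reported ones only to about two decimal places.

```r
# Values copied from the anova(out) output above (rounded as printed)
wald <- data.frame(
  term       = c(paste0("fac", 1:8), "TOTAL"),
  chisq      = c(0.27, 0.20, 0.01, 0.05, 1.06, 0.33, 0.81, 0.30, 1.48),
  df         = c(rep(1, 8), 8),
  p_reported = c(0.6055, 0.6514, 0.9338, 0.8311, 0.3036,
                 0.5647, 0.3670, 0.5832, 0.9930)
)

# Upper-tail chi-square probability for each Wald statistic
wald$p_recomputed <- pchisq(wald$chisq, df = wald$df, lower.tail = FALSE)
print(wald)
```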
Thanks Eric. It would be good to show your entire script next time, as stated in the posting guide.

Regarding matching with SPSS, please describe the bootstrapping algorithm used there. In rms I do the unconditional bootstrap, i.e., I sample with replacement from the rows of the raw data. Also make sure that SPSS ran a large number of bootstrap replications.

Frank
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
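The unconditional bootstrap Frank describes (resampling whole rows with replacement) can be sketched without rms. This is only an illustration under stated assumptions: it uses survival::coxph rather than cph, simulated data in place of coxdata.sav, hypothetical variable names (x1, x2, event), and a small B for speed. It is not the bootcov implementation.

```r
library(survival)  # recommended package shipped with standard R installs

set.seed(1)
n <- 200
# Simulated stand-in data (hypothetical; not the data from this thread)
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
t_event <- rexp(n, rate = exp(0.5 * dat$x1))  # x1 has a true effect, x2 none
t_cens  <- rexp(n, rate = 0.3)
dat$months <- pmin(t_event, t_cens)
dat$event  <- as.integer(t_event <= t_cens)

fit <- coxph(Surv(months, event) ~ x1 + x2, data = dat)

B <- 200
boot_coef <- t(replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)  # resample rows of the raw data
  coef(coxph(Surv(months, event) ~ x1 + x2, data = dat[idx, ]))
}))

# Percentile intervals: quantiles of the bootstrap coefficients
ci <- t(apply(boot_coef, 2, quantile, probs = c(0.025, 0.975)))

# Wald tests: original coefficients, bootstrap estimate of their variance
v_boot <- cov(boot_coef)
chisq_boot <- coef(fit)^2 / diag(v_boot)
p_boot <- pchisq(chisq_boot, df = 1, lower.tail = FALSE)

print(cbind(ci, chisq = chisq_boot, p = p_boot))
```

The percentile interval depends on the quantiles of the bootstrap distribution, while the Wald p-value depends only on its variance and the original estimate. Two programs could therefore agree closely on the intervals and still report different p-values if they test the coefficient differently, which may be part of what is seen here.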
Hi Frank,
My apologies for not posting the entire script - I have repasted it below.
library(rms)
library(foreign)
temp=read.spss('coxdata.sav', to.data.frame=T)
formula=Surv(months, recidivate) ~ fac1 + fac2 + fac3 + fac4 + fac5 + fac6 +
fac7 + fac8
fit=cph(formula, data=temp, x=T, y=T)
val.out=validate(fit, method="boot", B=9999, bw=F, type="residual",
sls=0.05, aics=0,force=NULL, estimates=TRUE, pr=FALSE)
out=bootcov(fit, B=9999, pr=F, coef.reps=T, loglik=F)
anova(out)
Factor  Chi-Square  d.f.  P
fac1    0.27        1     0.6055
fac2    0.20        1     0.6514
fac3    0.01        1     0.9338
fac4    0.05        1     0.8311
fac5    1.06        1     0.3036
fac6    0.33        1     0.5647
fac7    0.81        1     0.3670
fac8    0.30        1     0.5832
TOTAL   1.48        8     0.9930
for (i in 1:8) {
print(quantile(out$boot.Coef[,i], c(.025, .975)))
}
2.5% 97.5%
-9.236751 20.772061
2.5% 97.5%
-8.841030 3.094755
2.5% 97.5%
-1.834436 2.161983
2.5% 97.5%
-0.1800666 0.0871867
2.5% 97.5%
-3.2129636 0.4783566
2.5% 97.5%
-0.04157389 0.07130994
2.5% 97.5%
-0.6415962 0.1001843
2.5% 97.5%
-0.01529467 0.21055259
Again, the SPSS output I am trying to match is here:
variable  low CI   high CI  p-value
fac1      -8.474   20.020   .456
fac2      -8.206   3.093    .524
fac3      -1.829   2.087    .900
fac4      -.173    .083     .749
fac5      -2.945   .450     .143
fac6      -.035    .070     .306
fac7      -.626    .092     .189
fac8      -.017    .203     .247
Looking through the SPSS syntax, my colleague is using SIMPLE resampling,
which samples with replacement from the original data set. 9999
bootstrap replications are used, the same number I used in the
bootcov command. What is not clear in the SPSS output is how the
p-values are generated from the distribution of parameter estimates; SPSS
appears to be testing the parameter estimate from the original Cox
regression, but the method of testing that parameter is not clear.
Eric
It will be crucial to know the details of the test statistic and P-value calculations from SPSS. It's also worth running anova on both the bootcov and the original fits to see if SPSS is ignoring the bootstrap when computing the covariance matrix.

Frank
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
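Frank's suggested check can be sketched with the same rms calls used earlier in the thread (cph, bootcov, anova), here on simulated data. The data frame and its column names are hypothetical, B is kept small for speed, and the code is skipped if rms is not installed.

```r
if (requireNamespace("rms", quietly = TRUE)) {
  library(survival)
  library(rms)

  set.seed(2)
  n <- 200
  # Simulated stand-in data (hypothetical; not the data from this thread)
  dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  t_event <- rexp(n, rate = exp(0.5 * dat$x1))
  t_cens  <- rexp(n, rate = 0.3)
  dat$months <- pmin(t_event, t_cens)
  dat$event  <- as.integer(t_event <= t_cens)

  fit <- cph(Surv(months, event) ~ x1 + x2, data = dat, x = TRUE, y = TRUE)
  out <- bootcov(fit, B = 200, pr = FALSE, coef.reps = TRUE, loglik = FALSE)

  a_orig <- anova(fit)  # Wald tests from the original fit
  a_boot <- anova(out)  # Wald tests from the bootcov fit
  print(a_orig)
  print(a_boot)
}
```

If the SPSS p-values track the first table more closely than the second, that would be consistent with SPSS ignoring the bootstrap when computing the covariance matrix; it would not by itself identify the test SPSS uses.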