1. The computations "behind the scenes" produce the variance of the cumulative hazard. This is true for both an ordinary Kaplan-Meier and a Cox model. Transformations to other scales are done using a simple first-order Taylor series (delta method) approximation.

   H = cumulative hazard = -log(S); S = survival
   var(H) = var(log(S)) = the starting point
   S = exp(-H), so var(S) is approx [deriv of exp(-x)]^2 * var(H) = S^2 var(H)
   log(-log(S)) = log(H), so var(log(-log(S))) is approx [deriv of log(x)]^2 * var(H) = (1/H^2) var(H)

2. At the time it was written, summary.survfit was used only for printing out the survival curve at selected times, and the audience for the printout wanted std(S). True, that was 20 years ago, but I don't recall anyone ever asking for summary to do anything else. Your request is not a bad idea. Note however that the primary impact of using the log(S) or S or log(-log(S)) scale is on the confidence intervals, and they do appear per request in the summary output.

Terry T.
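
A minimal sketch of the conversions in R, assuming the survival package is installed and that the std.err component of a survfit object holds the standard error of the cumulative hazard, as ?survfit.object describes. The log(-log(S)) line uses log(-log(S)) = log(H), whose derivative is 1/H. The simulated data mirror the example in the original question; the names sf, H, se_H, se_S and se_loglog are illustrative only.

```r
library(survival)

# Simulated data, as in the original question (seed added for reproducibility)
set.seed(1)
time   <- sample(1:15, 100, replace = TRUE)
status <- as.numeric(runif(100) < 0.2)
x      <- rnorm(100, 10, 2)
fit    <- coxph(Surv(time, status) ~ x)

# Predicted curves for the first five covariate values
sf <- survfit(fit, newdata = data.frame(x = x[1:5]), conf.type = "log")

H    <- -log(sf$surv)   # cumulative hazard, one column per new subject
se_H <- sf$std.err      # assumed: standard error of H (see ?survfit.object)

# First-order Taylor (delta method) approximations
se_S      <- sf$surv * se_H   # sd(S) is approx S * sd(H)
se_loglog <- se_H / H         # sd(log(-log(S))) is approx sd(H) / H

# Rows where H is 0 (before the first event) give NaN on the log(-log) scale
row10 <- max(which(sf$time <= 10))
se_loglog[row10, ]
```

Under the same assumption, se_S[row10, ] should agree with summary(sf, times = 10)$std.err, which is a quick way to check that the starting scale is the one you expect.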
On 06/28/2014 05:00 AM, r-help-request at r-project.org wrote:
Message: 9
Date: Fri, 27 Jun 2014 12:39:29 -0700
From: array chip<arrayprofile at yahoo.com>
To:"r-help at r-project.org" <r-help at r-project.org>
Subject: [R] standard error of survfit.coxph()
Message-ID:
<1403897969.91269.YahooMailNeo at web122906.mail.ne1.yahoo.com>
Content-Type: text/plain
Hi, can anyone help me to understand the standard errors printed in the output of survfit.coxph()?
library(survival)
time <- sample(1:15, 100, replace=TRUE)
status <- as.numeric(runif(100, 0, 1) < 0.2)
x <- rnorm(100, 10, 2)
fit <- coxph(Surv(time, status) ~ x)
### method 1
survfit(fit, newdata=data.frame(time=time,status=status,x=x)[1:5,], conf.type='log')$std.err
             [,1]        [,2]        [,3]        [,4]       [,5]
 [1,] 0.000000000 0.000000000 0.000000000 0.000000000 0.00000000
 [2,] 0.008627644 0.008567253 0.008773699 0.009354788 0.01481819
 [3,] 0.008627644 0.008567253 0.008773699 0.009354788 0.01481819
 [4,] 0.013800603 0.013767977 0.013889971 0.014379928 0.02353371
 [5,] 0.013800603 0.013767977 0.013889971 0.014379928 0.02353371
 [6,] 0.013800603 0.013767977 0.013889971 0.014379928 0.02353371
 [7,] 0.030226811 0.030423883 0.029806263 0.028918817 0.05191161
 [8,] 0.030226811 0.030423883 0.029806263 0.028918817 0.05191161
 [9,] 0.036852571 0.037159980 0.036186931 0.034645002 0.06485394
[10,] 0.044181716 0.044621159 0.043221145 0.040872939 0.07931028
[11,] 0.044181716 0.044621159 0.043221145 0.040872939 0.07931028
[12,] 0.055452631 0.056018832 0.054236881 0.051586391 0.10800413
[13,] 0.070665160 0.071363749 0.069208056 0.066655730 0.14976433
[14,] 0.124140400 0.125564637 0.121281571 0.118002021 0.30971860
[15,] 0.173132357 0.175309455 0.168821266 0.164860523 0.46393111
survfit(fit, newdata=data.frame(time=time,status=status,x=x)[1:5,], conf.type='log')$time
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
### method 2
summary(survfit(fit, newdata=data.frame(time=time,status=status,x=x)[1:5,], conf.type='log'), times=10)$std.err
              1          2          3          4          5
[1,] 0.04061384 0.04106186 0.03963184 0.03715246 0.06867532
From the help pages ?survfit.object and ?summary.survfit, the standard error in the output of method 1 (survfit()) is for the cumulative hazard, i.e. -log(survival), while the standard error in the output of method 2 (summary.survfit()) is for survival itself, regardless of which value you choose for "conf.type" ('log', 'log-log' or 'plain'). This explains why the standard error differs between method 1 (10th row) and method 2.
My question is how do I get standard error estimates for log(-log(survival))?
Thanks!
John