val.surv

On Aug 21, 2011, at 7:03 AM, Salvo Mac wrote:

            
Salvo;

I'm not sure where the error is coming from (although I will share my  
speculation below). I modified your code somewhat. I was under the  
impression that read.csv had sep="," hardcoded and that read.csv( ...  
sep="\t") would throw an error. The as.data.frame around either of the  
read.* statements is completely unnecessary:

Input code:
train <- read.table("~/train.txt", header=TRUE, sep="\t")
test <- read.table("~/test.txt", header=TRUE, sep="\t")
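
As a quick check of the read.csv point above, here is a minimal sketch using a toy tab-delimited file (the data and the temporary file are made up for illustration, they are not the poster's files). In base R, sep="," is only the default for read.csv, so passing sep="\t" should be accepted, and both readers already return a data frame:

```r
# Write a small tab-delimited file to a temporary location (toy data only).
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(time = c(5, 12, 9), event = c(1, 0, 1), age = c(61, 54, 70))
write.table(toy, tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# read.csv is a wrapper around read.table; its sep argument defaults to ","
# but can be overridden.
via_csv   <- read.csv(tmp, sep = "\t")
via_table <- read.table(tmp, header = TRUE, sep = "\t")

# Both results are already data frames, so as.data.frame() adds nothing.
is.data.frame(via_csv)
isTRUE(all.equal(via_csv, via_table))
```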

I thought the call with newdata=test should succeed, but it does throw  
the error that you report:

  f.1<-cph(Surv(time,event)~age, x=T, y=T,surv=T, data=train)
  val.surv(f.1, newdata=test, u=10)
Error in val.surv(f.1, newdata = test, u = 10) :
   dims [product 210] do not match the length of object [314]
In addition: Warning message:
In est.surv + S[, 1] :
   longer object length is not a multiple of shorter object length

I tried sampling from test with replacement to generate a data frame  
with 314 rows (the object length reported in the error message), and  
with that object I do not get the same errors:

 > extend <- test[sample(1:nrow(test), 314, replace=TRUE), ]
 >  f.1<-cph(Surv(time,event)~age, x=T, y=T,surv=T, data=train)
 > val.surv(f.1, newdata=extend, u=10)

Validation of Predicted Survival at Time= 10 	n= 314 , events= 64

hare fit:

dim A/D   loglik       AIC        penalty
                                 min    max
   1 Add   -485.24    976.22  271.33     Inf
   2 Del   -349.57    710.64    5.67  271.33
   3 Del   -346.74    710.72    0.96    5.67
   4 Del   -346.26    715.51    0.01    0.96
   5 Add   -346.25    721.25    0.00    0.01

the present optimal number of dimensions is 2.
penalty(AIC) was 5.75, the default (BIC), would have been 5.75.

   dim1           dim2           beta        SE         Wald
Constant                             -10       0.55  -18.14
Time        43                      0.15      0.015   10.30

Function used to transform predictions:
function (p)  log(-log(p))

Mean absolute error in predicted probabilities: 0.0331
0.9 Quantile of absolute errors               : 0.0613
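
For reference, the resampling step used above can be sketched on a toy data frame. This assumes test has 210 rows, which is inferred from the error message rather than stated in the thread, and the column values are simulated. Because 314 draws are taken from 210 rows, some rows are necessarily repeated, so this is only a way to make the lengths agree for diagnosis, not a proper validation sample:

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Simulated stand-in for test (210 rows is an assumption).
toy_test <- data.frame(time  = rexp(210, 0.1),
                       event = rbinom(210, 1, 0.3),
                       age   = round(rnorm(210, 60, 10)))

# Draw 314 row indices with replacement, then subset.
idx    <- sample(seq_len(nrow(toy_test)), 314, replace = TRUE)
extend <- toy_test[idx, ]

nrow(extend)              # number of rows now matches the 314 in the error
anyDuplicated(idx) > 0    # TRUE: repeated rows are unavoidable here
length(unique(idx))       # number of distinct original rows actually used
```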

I have also tried looking at the code and setting  
options(error=utils::recover) to see if I can identify the point where  
the length mismatch is being generated (but I am NOT an ace  
debugger). I can see that est.surv is created. I can also get the  
predict(fit, newdata, type = "lp") call to run with train and give  
sensible numbers. You did not create a datadist (or at least did not  
say so). I tried that when I saw that "lim" was dependent on datadist  
limits.

ddT <- datadist(train); options(datadist="ddT")

Seeing that usehare was set to TRUE, I submitted the code section that  
was intended for that situation to a browser session, and the first  
line throws the same error that I see at the console:

Enter a frame number, or 0 to exit

1: val.surv(f.1, newdata = test, u = 10)

Selection: 1
Called from: top level
Browse[1]> i <- !is.na(est.surv + S[, 1] + S[, 2])
Error during wrapup: dims [product 210] do not match the length of  
object [314]

As I read this, that line is supposed to create an index of valid rows  
in newdata and the fitted values, but it fails when the number of rows  
in newdata does not match that of the fit.

At this point one option might be trying to generate a structure that  
matches the val.surv output by going through the code and building it  
up bit by bit:

  w <- structure(list(harefit = f, p = est.surv, actual = actual,
             pseq = pseq, actualseq = actualseq, u = u, fun = fun,
             n = nrow(S), d = sum(S[, 2]), units = units),
             class = "val.survh")
  return(w)

But a much better option would be to report the error to Frank  
Harrell. I'm copying him since I think your .txt files probably  
reached the list as well as my mailbox.