predicting expected number of events using a coxph model
On 2012-07-02 14:14, peter dalgaard wrote:
On Jul 2, 2012, at 19:27 , agittens wrote:
Peter Dalgaard-2 wrote
I fit a coxph model:

    coxphfit <- coxph(Surv(sampledLifetime, !sampledCensoredQ) ~ curpbc6 + prevpbc6, sampledTimeSeries)

Now I'm trying to predict the expected number of events using a new dataset. The documentation suggests that

    coxPred <- predict(coxphfit, newdata = testTimeSeries, type="expected")

will do what I want, but I get the error

    Error in model.frame.default(data = testTimeSeries, formula = Surv(sampledLifetime, :
      variable lengths differ (found for 'curpbc6')

when I do this. The dataframes sampledTimeSeries and testTimeSeries were constructed by taking rows from a larger dataframe, so they have the same data. What am I doing incorrectly?
Most likely referring to a variable not in testTimeSeries. (I kind of suspect that unlike predict.lm, predict.coxph does not ignore the left hand side of formulas. Does testTimeSeries contain a sampledLifetime column?)
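A minimal sketch of how one might check this, assuming (as the original call suggests) that the response variables lived in the workspace rather than in the data frames. All data below are simulated stand-ins for the poster's objects. When a variable named in the formula is absent from newdata, model.frame looks it up in the formula's environment instead, which would explain a length mismatch against the new data frame.

```r
library(survival)

# Simulated stand-ins for the poster's objects (assumption: the response
# variables sit in the workspace, not in either data frame).
set.seed(1)
n <- 40
sampledLifetime   <- rexp(n)
sampledCensoredQ  <- runif(n) < 0.3
sampledTimeSeries <- data.frame(curpbc6 = rnorm(n), prevpbc6 = rnorm(n))
testTimeSeries    <- data.frame(curpbc6 = rnorm(10), prevpbc6 = rnorm(10))

coxphfit <- coxph(Surv(sampledLifetime, !sampledCensoredQ) ~ curpbc6 + prevpbc6,
                  data = sampledTimeSeries)

# Variables used by the model formula that the new data frame does not supply.
missing_vars <- setdiff(all.vars(formula(coxphfit)), names(testTimeSeries))
print(missing_vars)

# With the response missing from newdata, the 40 workspace values get paired
# with 10 rows of covariates, so the prediction is expected to fail.
res <- try(predict(coxphfit, newdata = testTimeSeries, type = "expected"),
           silent = TRUE)
print(inherits(res, "try-error"))
```

If missing_vars is non-empty, adding those columns to the new data frame is the first thing to try.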
No, I did not have the lifetime and censored data in the dataframe. Per your idea, I put the sampledLifetime and sampledCensoredQ variables in the dataframe sampledTimeSeries and left the rest of the code the same. Now when I try with the new data set,

    coxPred <- predict(coxphfit, newdata = testTimeSeries, type="expected")

I get different errors. If I use testTimeSeries without the lifetime and censor indicator columns (which shouldn't be required for prediction), then I get the same error as before.
I gather that type="expected" requires a follow-up time which the routine needs to get from somewhere. Presumably those columns _are_ required.
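As an illustration of that point, here is a sketch using the lung data shipped with the survival package rather than the poster's data. The train/test split and the choice of covariates are arbitrary assumptions. The new data frame keeps the time and status columns, so the routine has a follow-up time for each subject.

```r
library(survival)

# Arbitrary split of the lung data into a fitting part and a prediction part.
train <- lung[1:150, ]
test  <- lung[151:228, ]

fit <- coxph(Surv(time, status) ~ age + sex, data = train)

# 'test' still contains time and status, which type = "expected" uses as the
# follow-up time (and event indicator) for each new subject.
exp_events <- predict(fit, newdata = test, type = "expected")

# The linear predictor, by contrast, needs only the covariates.
lp <- predict(fit, newdata = test[, c("age", "sex")], type = "lp")

# Total expected events next to the observed count (status == 2 means death).
print(c(expected = sum(exp_events), observed = sum(test$status == 2)))
```

Behaviour may differ in older versions of the survival package, so this is worth checking against the installed version.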
If I put in these columns, then I get the error

    Error in predict.coxph(coxphfit, newdata = testTimeSeries, type = "expected", :
      object 'x' not found
Do you have an "x" somewhere in your model specification? Otherwise, I'm out of clues. Perhaps try a traceback() or options(error=recover) and see where the error comes from.
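In an interactive session, calling traceback() right after the error is the quickest route. For a script, a rough equivalent is to record the call stack when the error is signalled. The sketch below uses two hypothetical functions, inner() and outer(), as stand-ins for the nested calls inside predict.coxph.

```r
# Hypothetical stand-ins for a failure buried inside nested calls.
inner <- function() stop("object 'x' not found")
outer <- function() inner()

calls <- NULL
result <- try(
  # The calling handler runs while the failing frames are still on the stack,
  # so sys.calls() records the whole chain before try() unwinds it.
  withCallingHandlers(outer(), error = function(e) calls <<- sys.calls()),
  silent = TRUE
)

# Show the recorded stack, outermost call first.
print(calls)

# Interactive alternatives:
#   traceback()               # after the error, prints the call stack
#   options(error = recover)  # opens a frame browser at the point of failure
```

The recorded stack shows which function actually looked up the missing object.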
As I read the OP's previous response, s/he has not necessarily added sampledLifetime and sampledCensoredQ to *testTimeSeries* but perhaps only to *sampledTimeSeries*. (But I don't know why the error refers to "x".) It is also not clear from the original post just where sampledLifetime and sampledCensoredQ resided in the original call to coxph. Perhaps we could have a look at

    str(sampledTimeSeries)
    str(testTimeSeries)

?

Peter Ehlers