Convergence issues when using ns splines (pkg: spline) in Cox model (coxph) even when changing coxph.control
Thanks to David for pointing this out. The "time dependent covariates" vignette in the
survival package has a section on time dependent coefficients that talks directly about
this issue. In short, the following model is simply wrong:
coxph(Surv(time, status) ~ trt + prior + karno + I(karno * log(time)), data=veteran)
People try this often as a way to create the time dependent covariate Karnofsky * log(t),
which is often put forwards as a way to deal with non-proportional hazards. To do this
correctly you have to use the tt() functionality in coxph to move the computation out of
the model statement:
coxph(Surv(time, status) ~ trt + prior + karno + tt(karno), data=veteran,
tt = function(x, time, ...) x*log(time))
BTW the following SAS code is also wrong:
proc phreg data=veteran;
model time * status(0) = trt + prior + karno* time;
SAS does the right thing, however, if you move the computation off the model line.
model time * status(0) = trt + karno + zzz;
zzz = karno * time;
The quote "SAS does it but R fails" comes at me moderately often in this context. The
reason is that SAS won't LET you put a phrase like "log(time)" into the model statement,
so people end up doing the right thing, but by accident.
Terry T.
On 03/30/2016 05:28 PM, G?ran Brostr?m wrote:
On 2016-03-30 23:06, David Winsemius wrote:
On Mar 29, 2016, at 1:47 PM, Jennifer Wu, Miss <jennifer.wu2 at mail.mcgill.ca> wrote: Hi, I am currently using R v3.2.3 and on Windows 10 OS 64Bit. I am having convergence issues when I use coxph with a interaction term (glarg*bca_py) and interaction term with the restricted cubic spline (glarg*bca_time_ns). I use survival and spline package to create the Cox model and cubic splines respectively. Without the interaction term and/or spline, I have no convergence problem. I read some forums about changing the iterations and I have but it did not work. I was just wondering if I am using the inter.max and outer.max appropriately. I read the survival manual, other R-help and stackoverflow pages and it suggested changing the iterations but it doesn't specify what is the max I can go. I ran something similar in SAS and did not run into a convergence problem. This is my code: bca_time_ns <- ns(ins_ca$bca_py, knots=3, Boundary.knots=range(2,5,10)) test <- ins_ca$glarg*ins_ca$bca_py test1 <- ins_ca$glarg*bca_time_ns
In your `coxph` call the variable 'bca_py' is the survival time and
Right David: I didn't notice that the 'missing main effect' in fact was part of the survival object! And as you say: Time to rethink the whole model. G?ran
yet here you are constructing not just one but two interactions (one of which is a vector but the other one a matrix) between 'glarg' and your survival times. Is this some sort of effort to identify a violation of proportionality over the course of a study? Brostr?m sagely points out that these interactions are not in the data-object and subsequent efforts to refer to them may be confounded by the multiple environments from which data would be coming into the model. Better to have everything come in from the data-object. The fact that SAS did not have a problem with this rather self-referential or circular model may be a poor reflection on SAS rather than on the survival package. Unlike Therneau or Brostr?m who asked for data, I suggest the problem lies with the model construction and you should be reading what Therneau has written about identification of non-proportionality and identification of time dependence of effects. See Chapter 6 of his "Modeling Survival Data".