Hi all, Sorry if this has been answered already, but I couldn't find it in the archives or general internet. Is it possible to implement the gompertz distribution as survreg.distribution to use with survreg of the survival library? I haven't found anything and recent attempts from my side weren't succefull so far. I know that other packages like 'eha' and 'flexsurv' offer functions similar to survreg with gompertz support. However, due to the run-time environment were this needs to be running in the end, I can't use these packages :( Same questions for the gompertz-makeham distribution. Many thanks! Matthias
survreg & gompertz
3 messages · Matthias Ziehm, David Winsemius
On Nov 15, 2012, at 5:38 AM, Matthias Ziehm wrote:
Hi all, Sorry if this has been answered already, but I couldn't find it in the archives or general internet.
A Markmail/Rhelp search on: gompertz survreg ...brings this link to a reply by Terry Therneau. Seems to address everything you asked and a bit more http://markmail.org/search/?q=list%3Aorg.r-project.r-help+gompertz+survreg#query:list%3Aorg.r-project.r-help%20gompertz%20survreg+page:1+mid:6xdlsmo272oa7zkw+state:results (Depending on how your mailer breaks URLs you may need to paste it back together.)
Is it possible to implement the gompertz distribution as survreg.distribution to use with survreg of the survival library? I haven't found anything and recent attempts from my side weren't succefull so far. I know that other packages like 'eha' and 'flexsurv' offer functions similar to survreg with gompertz support. However, due to the run-time environment were this needs to be running in the end, I can't use these packages :( Same questions for the gompertz-makeham distribution. Many thanks! Matthias
David Winsemius, MD Alameda, CA, USA
On 15/11/12 21:22, David Winsemius wrote:
On Nov 15, 2012, at 5:38 AM, Matthias Ziehm wrote:
Hi all, Sorry if this has been answered already, but I couldn't find it in the archives or general internet.
A Markmail/Rhelp search on: gompertz survreg ...brings this link to a reply by Terry Therneau. Seems to address everything you asked and a bit more http://markmail.org/search/?q=list%3Aorg.r-project.r-help+gompertz+survreg#query:list%3Aorg.r-project.r-help%20gompertz%20survreg+page:1+mid:6xdlsmo272oa7zkw+state:results (Depending on how your mailer breaks URLs you may need to paste it back together.)
Thanks! I hadn't read this thread because of the missleading title. However, now I've got follow up questions, on that explanation:
On Feb 5, 2010 at 8:48:08 am, Terry Therneau wrote:
Subject: Re: [R] Using coxph with Gompertz-distributed survival data. ... the note below talks about how to do so approximately with survreg. It's a note to myself of something to add to the survival package documentation, not yet done, and to my embarassment the file has a time stamp in 1996. Ah well. Terry Therneau My document "A Package for Survival Analysis in S" contains statements about how to fit Gompertz and Rayleigh distributions with the survreg routine. Nicholas Brouard, in a recent query to this group, quite correctly states that "Therneau's documentation is a little elliptic for people not so familiar with extreme value theory". I've spent the last day trying to work out concrete examples of the fits. Let me start by saying that I now think my paper's remarks were overly optimistic. This note will try to indicate why. I will use some "TeX" notation below: \alpha, \beta, etc for Greek letters. Hazard functions: Weibull: p*(\lambda)^p * t^(p-1) Extreme value: (1/ \sigma) * exp( (t- \eta)/ \sigma) Rayleigh: a + bt Gompertz: b * c^t Makeham: a + b* c^t The Makeham hazard seems to fit human mortality experience beyond infancy quite well, where "a" is a constant mortality which is independent of the health of the subject (accidents, homicide, etc) and the second term models the Gompertz assumption that "the average exhaustion of a man's power to avoid death to is such that at theendof equal infinitely small itervals of time he lost equal portions of his remaining power to oppose destruction which he had at the commencement of these intervals". For older ages "a" is a neglible portion of the death rate and the Gompertz model holds. The fitting routine depends on the decomposition Y = \eta + \sigma W, where \eta = \beta_0 + \beta_1 * X_1 + \beta_2 * X_2 + ... is the fitted linear predictor and W is a distribution in "standard" form. For instance, if the response time t is Weibull, then y = log(t) follows this with \eta = log(\lambda) \sigma = 1/p Clearly 1. The Wiebull distribution with p=2 (sigma=.5) is the same as a Rayleigh distribution with a=0. It is not, however, the most general form of a Rayleigh. 2. The (least) extreme value and Gompertz distributions have the same hazard function, with \sigma = 1/ log(c), and exp(-\eta/ \sigma) = b.
If I do the math correctly from the above given extreme value hazard to Gompertz hazard. It needs to b = 1/ \sigma * exp(-\eta/ \sigma) Other wise the 1/ \sigma of the Extreme value hazard is missing, isn't it?
It would appear that the Gompertz can be fit with an identity link function combined with the extreme value distribution. However, this ignores a boundary restriction. If f(x; \eta, \sigma) is the extreme value distribution with paramters \eta and \sigma, then the definition of the Gompertz densitiy is g(x; \eta, \sigma) = 0 x< 0 g(x; \eta, \sigma) = c f(x; \eta, \sigma) x>=0 where c= exp(exp(-\eta / \sigma)) is the necessary constant so that g integrates to 1.
Here I got really lost were the addition double exp suddenly comes from and how it fits in. Given the above I would have thought that: g(x,b,c) = f(x, \eta=-1/log(c)*log(b*1/log(c)), \sigma= 1/log(c) ) for x>=0 Can anyone clarify these thinga to me, please? Matthias
If \eta / \sigma is far from 1, then the correction term will be minimal and survreg should give a reasonable answer. If not, the distribution can't be fit, nor can it be made to easily conform to the general fitting scheme of the program. The Makeham distribution falls into the gamma family (equation 2.3 of Kalbfleisch and Prentice, Survival Analysis), but with the same range restriction problem. In summary, the Gompertz is a truncated form of the extreme value distribution (Johnson, Kotz and Blakrishnan, Contiuous Univariate Distri- butions, section 22.8). If one ignores the truncation, i.e., assume that negative time values are possible, then it can be fit with survreg. My original note seems to have been compounded of 3 errors: the -1 arises from confusing the maximal extreme distribution (most common in theory books) with the minimal extreme distribution (used in survreg), the log() term was a typing mistake, and I never noticed the range restriction. This is one of the few topics in the report without a worked example as part of my test library (the Examples section of the package). The replacement document, currently in early draft, is intended to have a worked example for every claim and the code for that example in the appendix. This will, hopefully, cure any other mistakes of this sort.
Is it possible to implement the gompertz distribution as survreg.distribution to use with survreg of the survival library? I haven't found anything and recent attempts from my side weren't succefull so far. I know that other packages like 'eha' and 'flexsurv' offer functions similar to survreg with gompertz support. However, due to the run-time environment were this needs to be running in the end, I can't use these packages :( Same questions for the gompertz-makeham distribution. Many thanks! Matthias