Skip to content
Prev 277312 / 398506 Next

Fit continuous distribution to truncated empirical values

David,

here is the smallest dataset

# Bid	Price	Survival	Censored
0.03	0.029	1	1
0.03	0.029	11	1
0.03	0.029	10	1
0.03	0.029	9	1
0.03	0.029	8	1
0.03	0.029	7	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	1	1
0.03	0.029	1	1
0.03	0.029	9	1
0.03	0.029	8	1
0.03	0.029	7	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	16	1
0.03	0.029	15	1
0.03	0.029	14	1
0.03	0.029	13	1
0.03	0.029	12	1
0.03	0.029	11	1
0.03	0.029	10	1
0.03	0.029	9	1
0.03	0.029	8	1
0.03	0.029	7	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	7	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	30	1
0.03	0.029	29	1
0.03	0.029	28	1
0.03	0.029	27	1
0.03	0.029	26	1
0.03	0.029	25	1
0.03	0.029	24	1
0.03	0.029	23	1
0.03	0.029	22	1
0.03	0.029	21	1
0.03	0.029	20	1
0.03	0.029	19	1
0.03	0.029	18	1
0.03	0.029	17	1
0.03	0.029	16	1
0.03	0.029	15	1
0.03	0.029	14	1
0.03	0.029	13	1
0.03	0.029	12	1
0.03	0.029	11	1
0.03	0.029	10	1
0.03	0.029	9	1
0.03	0.029	8	1
0.03	0.029	7	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	8	1
0.03	0.029	7	1
0.03	0.029	6	1
0.03	0.029	5	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.029	4	1
0.03	0.029	3	1
0.03	0.029	2	1
0.03	0.029	1	1
0.03	0.027	71	0
0.03	0.027	70	0
0.03	0.027	69	0
0.03	0.027	68	0
0.03	0.027	67	0
0.03	0.027	66	0
0.03	0.027	65	0
0.03	0.027	64	0
0.03	0.027	63	0
0.03	0.027	62	0
0.03	0.027	61	0
0.03	0.027	60	0
0.03	0.027	59	0
0.03	0.027	58	0
0.03	0.027	57	0
0.03	0.027	56	0
0.03	0.027	55	0
0.03	0.027	54	0
0.03	0.027	53	0
0.03	0.027	52	0
0.03	0.027	51	0
0.03	0.027	50	0
0.03	0.027	49	0
0.03	0.027	48	0
0.03	0.027	47	0
0.03	0.027	46	0
0.03	0.027	45	0
0.03	0.027	44	0
0.03	0.027	43	0
0.03	0.027	42	0
0.03	0.027	41	0
0.03	0.027	40	0
0.03	0.027	39	0
0.03	0.027	38	0
0.03	0.027	37	0
0.03	0.027	36	0
0.03	0.027	35	0
0.03	0.027	34	0
0.03	0.027	33	0
0.03	0.027	32	0
0.03	0.027	31	0
0.03	0.027	30	0
0.03	0.027	29	0
0.03	0.027	28	0
0.03	0.027	27	0
0.03	0.027	26	0
0.03	0.027	25	0
0.03	0.027	24	0
0.03	0.027	23	0
0.03	0.027	22	0
0.03	0.027	21	0
0.03	0.027	20	0
0.03	0.027	19	0
0.03	0.027	18	0
0.03	0.027	17	0
0.03	0.027	16	0
0.03	0.027	15	0
0.03	0.027	14	0
0.03	0.027	13	0
0.03	0.027	12	0
0.03	0.027	11	0
0.03	0.027	10	0
0.03	0.027	9	0
0.03	0.027	8	0
0.03	0.027	7	0
0.03	0.027	6	0
0.03	0.027	5	0
0.03	0.027	4	0
0.03	0.027	3	0
0.03	0.027	2	0
0.03	0.027	1	0


This is column # 3
1 11 10  9  8  7  6  5  4  3  2  1  1  1  9  8  7  6  5  4  3  2  1 16 15 14 
13 12 11 10  9  8  7  6  5  4  3  2  1  6  5  4  3  2  1  7  6  5  4  3  2  1
5  4  3  2  1  4  3  2  1  6  5  4  3  2  1 30 29 28 27 26 25 24 23 22 21 20
19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  8  7  6  5  4  3  2
1  4  3  2  1 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51
50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25
24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1


This is the script I have used for fitting the dataset to a Weibull distribution


my_plot <- function() {
	require(plotrix) # for plotCI
	require(MASS) # for fitdistr
	require(survival) # for survival analysis
	require(fitdistrplus) # for fitdist


	lifetimes=read.table("lifetime.txt", header=F, comment.char="#")
	s = Surv(lifetimes$V3, lifetimes$V4)
	mfit = survfit(s~V1, data=lifetimes, conf.type="plain")
	
	cdf=ecdf(lifetimes$V3)
	empirical = numeric(24)
	T = seq(1,24)
	for (i in T) empirical[i] = cdf(i)

	fit = fitdist(lifetimes$V3, "weibull")
	plot(T, 1- pweibull(T, fit$estimate[1], fit$estimate[2]), type="l", col="red", lwd=2, lty=1, xlim=c(1,24), xlab="Time [hours]", ylab="S(t)")
	lines(T, 1 - empirical, col="blue", lwd=2, lty=2)
	plotCI(T, mfit$surv[T], ui=mfit$upper[T], li=mfit$lower[T], col="green", lwd=1, lty=3, add=T)
	legend("bottomleft", c("Weibull fit", "1 - Empirical CDF", "Empirical S(t)"), col=c("red", "blue", "green"), horiz=F, lty=1:3, lwd=2)

}


my_plot()




About your questions
1 - What do you mean?, I guess I don't know the answer -- that's why I have posted my question on this mailing list.
2 - As I wrote in my first email, I am interested only in the first portion of the survival times (24 hours). Hence, I'd prefer to have a "good" fit over the interval of interest rather than a "reasonable" fit over the whole data.
3 - I went though those tutorials, but couldn't find an answer. That's probably due to the fact that those tutorials are "general", as you pointed out, while my question is rather specific.

Cheers,
Michele
On Nov 14, 2011, at 5:59 PM, David Winsemius wrote: