
glm-logistic on discrete-time methods with individual and aggregated data

2 messages · Camarda, Carlo Giovanni, Spencer Graves

4 days later
You've found a region of infinite extent over which the likelihood 
function is for all practical purposes flat.  This means that the 
maximum likelihood estimates (MLEs) are not unique.  To see this 
consider the following properties of your datiINDa:

 > with(datiINDa, table(statusINDa, timesINDa))
           timesINDa
statusINDa  1  2  3  4
          0 10  8  6  4
          1  0  2  2  2
 > sapply(datiINDa, class)
  timesINDa statusINDa
   "factor"  "numeric"

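The table already shows where the trouble comes from. Since timesINDa is a factor, each time level gets its own fitted probability, and the MLE of that probability is just the observed proportion of 1s. A minimal sketch, with the counts copied from the table (the names n0, n1, p_hat and logit_hat are mine, not from the original post):

```r
# Counts copied from the cross-tabulation above.
n0 <- c(10, 8, 6, 4)          # statusINDa = 0 at times 1..4
n1 <- c(0, 2, 2, 2)           # statusINDa = 1 at times 1..4

p_hat <- n1 / (n0 + n1)       # observed proportion of 1s per time level
logit_hat <- qlogis(p_hat)    # log(p / (1 - p)); qlogis(0) is -Inf

print(rbind(p_hat, logit_hat))
```

The logit for time 1 is not finite, while the logits for times 2 to 4 are ordinary numbers.
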
	  You are estimating 4 parameters: an intercept plus one parameter for 
each of the other three levels of the factor "timesINDa".  The first 
level occurs only with statusINDa = 0, never with statusINDa = 1.  
Therefore, the theoretical MLE puts the logit for that level at -Inf.  
With the default treatment contrasts the first level is the baseline, so 
the intercept goes to -Inf and the other three coefficients go to +Inf 
to compensate.  However, glm doesn't bother pushing it that far, and 
stops while the parameters are still finite and only moderately large.  
To understand this better, first modify your example to store the glm 
fitted object as follows:

fit.a <- glm(statusINDa ~ timesINDa, family=binomial, data=datiINDa)

	  Then apply "predict" to that object:

predict(fit.a, type="response")

	  The result is that the 10 cases with timesINDa = 1 all have 
Pr{statusINDa = 1} = 3e-9, which glm treats as essentially 0, so it 
stops iterating.
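
Since datiINDa itself is not shown in this message, here is a sketch that rebuilds an equivalent data set from the table of counts and looks at the coefficients. The names dat, times and status are hypothetical, and the exact numbers may differ slightly from the ones quoted above depending on your version of R:

```r
# Hypothetical reconstruction of the individual-level data from the table
# of counts; the names dat, times and status are assumptions.
n0 <- c(10, 8, 6, 4)   # status 0 at times 1..4
n1 <- c(0, 2, 2, 2)    # status 1 at times 1..4
dat <- data.frame(
  times  = factor(rep(rep(1:4, 2), c(n0, n1))),
  status = rep(c(0, 1), c(sum(n0), sum(n1)))
)

fit <- glm(status ~ times, family = binomial, data = dat)
b <- coef(fit)

# Large negative intercept, large positive contrasts ...
print(b)

# ... but intercept + contrast gives the ordinary logits for times 2 to 4.
level_logits <- b[1] + b[-1]
observed <- qlogis(n1[-1] / (n0[-1] + n1[-1]))
print(rbind(fitted = level_logits, observed = observed))

# Fitted probability for the time-1 cases: tiny, but not exactly zero.
p1 <- unique(signif(predict(fit, type = "response")[dat$times == "1"], 3))
print(p1)
```

Only the sums of the intercept and each contrast are pinned down by the data; the individual values are wherever the iterations happened to stop.
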

	  Now let's do the same with your weighted version:

fit.wa <- glm(statusAGGa ~ timesAGGa, family=binomial, data=datiAGGa,
weights=weightAGGa)
sort(predict(fit.wa, type="response"))
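
The layout of datiAGGa is not shown in this message, so the following is only a sketch under an assumed layout: one row per non-empty (time, status) cell, with the cell count as a frequency weight. The names ind, agg, weight and finite_part are hypothetical. It compares the individual and aggregated fits on the quantities the data actually determine:

```r
n0 <- c(10, 8, 6, 4)   # status 0 at times 1..4
n1 <- c(0, 2, 2, 2)    # status 1 at times 1..4

# Individual layout: one row per case.
ind <- data.frame(times  = factor(rep(rep(1:4, 2), c(n0, n1))),
                  status = rep(c(0, 1), c(sum(n0), sum(n1))))

# Assumed aggregated layout: one row per non-empty (times, status) cell,
# with the cell count as a frequency weight.
agg <- data.frame(times  = factor(c(1:4, 2:4)),
                  status = c(0, 0, 0, 0, 1, 1, 1),
                  weight = c(n0, n1[-1]))

fit_ind <- glm(status ~ times, family = binomial, data = ind)
fit_agg <- glm(status ~ times, family = binomial, data = agg,
               weights = weight)

# Contrasts among times 2, 3 and 4, plus the residual deviance.
finite_part <- function(fit) {
  b <- coef(fit)
  c(t3_vs_t2 = unname(b["times3"] - b["times2"]),
    t4_vs_t2 = unname(b["times4"] - b["times2"]),
    deviance = deviance(fit))
}
print(rbind(individual = finite_part(fit_ind),
            aggregated = finite_part(fit_agg)))

# The intercepts need not agree: each is where that run stopped.
print(c(individual = unname(coef(fit_ind)[1]),
        aggregated = unname(coef(fit_agg)[1])))
```

Under this assumed layout the two fits should agree on the contrasts among the later time levels and on the deviance, even if the raw coefficient tables look different.
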

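	  Those 10 cases now have Pr{statusAGGa = 1} = 5.4e-9, again 
essentially 0.  This is essentially the issue of "complete separation" 
(strictly speaking "quasi-complete" separation, since only the first 
level of the factor is perfectly predicted).  We can request more 
precision as follows: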

 > fit.a3 <- glm(statusINDa ~ timesINDa, family=binomial, data=datiINDa,
+           control=glm.control(epsilon=1e-13,
+             maxit=250))
Warning message:
fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = 
Y, weights = weights, start = start, etastart = etastart,

	  In this case, we get a warning.  For more on this, try 
RSiteSearch("complete separation with logistic regression").
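
To see the flat likelihood directly, one can compare the default fit with the stricter one. This is a sketch on data rebuilt from the table of counts; the names dat, fit_default, fit_strict and summary_row are hypothetical, and the strict fit may or may not emit the warning shown above depending on your version of R:

```r
n0 <- c(10, 8, 6, 4)   # status 0 at times 1..4
n1 <- c(0, 2, 2, 2)    # status 1 at times 1..4
dat <- data.frame(times  = factor(rep(rep(1:4, 2), c(n0, n1))),
                  status = rep(c(0, 1), c(sum(n0), sum(n1))))

fit_default <- glm(status ~ times, family = binomial, data = dat)

# Same model, tighter convergence tolerance and more iterations allowed.
fit_strict <- suppressWarnings(
  glm(status ~ times, family = binomial, data = dat,
      control = glm.control(epsilon = 1e-13, maxit = 250))
)

summary_row <- function(fit) {
  c(intercept  = unname(coef(fit)[1]),
    p_time1    = min(fitted(fit)),
    deviance   = deviance(fit),
    iterations = fit$iter)
}
print(rbind(default = summary_row(fit_default),
            strict  = summary_row(fit_strict)))
```

The strict fit runs longer and pushes the intercept further toward -Inf, while the deviance barely moves. That is what a flat likelihood looks like in practice.
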

	  Very interesting, isn't it?
	  Hope this helps.
	  Spencer Graves
Camarda, Carlo Giovanni wrote: