Is it possible to use glm() with 30 observations?
The issue is not 30 observations but whether it is possible to
perfectly separate the two possible outcomes. Consider the following:
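# Three observations; no cut point on x separates the 0s from the 1s
tst.glm <- data.frame(x = 1:3, y = c(0, 1, 0))
glm(y ~ x, family = binomial, data = tst.glm)
# 1000 observations; y is 0 for x <= 500 and 1 for x > 500 (perfect separation)
tst2.glm <- data.frame(x = 1:1000, y = rep(0:1, each = 500))
glm(y ~ x, family = binomial, data = tst2.glm)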
glm() fits y~x to tst.glm without complaint but issues warnings for
tst2.glm, where x perfectly separates the two outcomes. This is
related to what is called the Hauck-Donner effect, and
RSiteSearch("Hauck-Donner") just now produced 8 hits. For more
information, look for "Hauck-Donner" in the index of Venables, W. N.
and Ripley, B. D. (2002) _Modern Applied Statistics with S._ New
York: Springer. (If you don't already have this book, I recommend you
give serious consideration to purchasing a copy. It is excellent on
many issues relating to statistical analysis and R.)
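To check your own 30 observations, something along the following lines may help. It is only a rough sketch for a single numeric predictor: is_separated() and fit_with_warnings() are names made up for this illustration, not functions from R or any package, and the check will not detect separation by a combination of several predictors.

```r
# Hypothetical helper (not part of R): TRUE when some cut point on x puts
# all y == 0 strictly on one side and all y == 1 strictly on the other.
is_separated <- function(x, y) {
  max(x[y == 0]) < min(x[y == 1]) || max(x[y == 1]) < min(x[y == 0])
}

# Hypothetical helper: fit the logistic model and collect any warning
# messages in a character vector instead of printing them.
fit_with_warnings <- function(formula, data) {
  msgs <- character(0)
  fit <- withCallingHandlers(
    glm(formula, family = binomial, data = data),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(fit = fit, warnings = msgs)
}

tst.glm  <- data.frame(x = 1:3, y = c(0, 1, 0))
tst2.glm <- data.frame(x = 1:1000, y = rep(0:1, each = 500))

res1 <- fit_with_warnings(y ~ x, tst.glm)
res2 <- fit_with_warnings(y ~ x, tst2.glm)

is_separated(tst.glm$x, tst.glm$y)    # no cut point separates the 0s from the 1s
is_separated(tst2.glm$x, tst2.glm$y)  # x <= 500 vs. x > 500 separates them
res1$warnings                         # warnings collected for tst.glm
res2$warnings                         # warnings collected for tst2.glm
```

If the check comes back TRUE for your data, the warning most likely reflects the structure of the data rather than the sample size as such, and the coefficient estimates and standard errors printed by summary() should be treated with caution.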
Spencer Graves
Kerry Bush wrote:
I have a very simple problem. When using glm to fit a binary logistic
regression model, sometimes I receive the following warning:

Warning messages:
1: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,

What does this output tell me? Since I only have 30 observations, I
assume this is a small sample problem. Is it possible to fit this
model in R with only 30 observations? Could any expert provide
suggestions to avoid the warning?
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915