glm predict on new data
On 4/6/2011 2:17 PM, dirknbr wrote:
I am aware this has been asked before but I could not find a resolution. I am doing a logit

lg <- glm(y[1:200] ~ x[1:200,1], family=binomial)
glm (and most modeling functions) are designed to work with data frames, not raw vectors.
Then I want to predict a new set

pred <- predict(lg, x[201:250,1], type="response")

But I get varying error messages or warnings about the different number of rows. I have tried data/newdata and also wrapping in data.frame() but cannot get it to work.
I'll make up some data, show the way you approached it, show where it went wrong, and then show how it works more easily.

# data like what I think you had:
y <- rbinom(200, 1, prob=.8)
x <- data.frame(x=rnorm(250))

# your glm call:
lg <- glm(y[1:200]~x[1:200,1],family=binomial)

# take a look at print(lg). Notice that your independent variable
# name is "x[1:200, 1]", which is what you would need to match in
# a call to predict.

# Make data.frames of the given and testing data.
# (Rows 201:250 are the new data; row 200 is already in the fit.)
DF <- data.frame(y=y, x=x[1:200,1])
DF.new <- data.frame(x=x[201:250,1])
# Notice DF.new has the same name (x) as DF.

lg <- glm(y~x, data=DF, family=binomial)
pred <- predict(lg, newdata=DF.new, type="response")
summary(pred)
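To make the row-count problem concrete, here is a small sketch that puts the two approaches side by side. It is only an illustration on simulated data: the seed, the column name z, and the object names lg.raw, bad.new and dat are made up for this example. As far as I understand predict(), when newdata has no column matching the stored term name, the term is re-evaluated from the workspace, so you get predictions for the 200 fitted rows together with a warning about the number of rows.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
y <- rbinom(200, 1, prob = 0.8)
x <- data.frame(x = rnorm(250))

# Pitfall: fit on raw indexed vectors. The predictor is stored under the
# literal name "x[1:200, 1]".
lg.raw <- glm(y[1:200] ~ x[1:200, 1], family = binomial)
coef.names <- names(coef(lg.raw))

# newdata has no column with that name, so x[1:200, 1] is looked up in the
# workspace instead; the row-count warning is suppressed here on purpose.
bad.new <- data.frame(z = x[201:250, 1])  # 'z' is a made-up column name
pred.raw <- suppressWarnings(
  predict(lg.raw, newdata = bad.new, type = "response")
)
n.raw <- length(pred.raw)  # one prediction per training row, not per new row

# Alternative: keep everything in one data frame and select rows from it.
dat <- data.frame(y = c(y, rep(NA, 50)), x = x$x)
lg <- glm(y ~ x, data = dat[1:200, ], family = binomial)
pred <- predict(lg, newdata = dat[201:250, ], type = "response")
n.new <- length(pred)  # one prediction per new row
```

Comparing n.raw with n.new shows the difference: the first call silently ignores the new rows, while the second returns one fitted probability for each of the 50 held-out rows.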
Help would be appreciated. Dirk.
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University