warning with glm.predict, wrong number of data rows
carol white <wht_crl <at> yahoo.com> writes:
Hi, I split a data set into two partitions (80 and 42), use the first as the
training set in glm and the second as
testing set in glm predict. But when I call glm.predict, I get the warning
message:
Warning message: 'newdata' had 42 rows but variable(s) found have 80 rows
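[snip]

The warning correctly diagnoses the problem. The posting guide asks for a 'reproducible example', but you did not give us one. There is one below.

Note what happens when predict() tries to reconstruct the variable 'x[1:4]' as dictated by the formula. How many elements can 'x[1:4]' have when newdata has (say) nrowsNew rows? The subscript is part of the formula, so it is applied again to whatever 'x' is found at prediction time.

Instead of subscripting inside the formula, use plain variable names together with the data argument, and use the subset argument of glm() to select a subset of observations.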
y <- sample(factor(1:2),80,repl=T)
y <- sample(factor(1:2),5,repl=T)
x <- 1:4
fit <- glm( y[1:4] ~ x[1:4], family = binomial)
fit

Call:  glm(formula = y[1:4] ~ x[1:4], family = binomial)

Coefficients:
(Intercept)       x[1:4]
 -1.110e-16    0.000e+00

Degrees of Freedom: 3 Total (i.e. Null);  2 Residual
Null Deviance:      5.545
Residual Deviance:  5.545    AIC: 9.545

predict(fit,newdata=data.frame(x=1:2))
            1             2             3             4
-1.110223e-16 -1.110223e-16            NA            NA
Warning message:
'newdata' had 2 rows but variable(s) found have 4 rows

predict(fit,newdata=data.frame(x=1:5))
            1             2             3             4
-1.110223e-16 -1.110223e-16 -1.110223e-16 -1.110223e-16
Warning message:
'newdata' had 5 rows but variable(s) found have 4 rows

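For contrast, here is a minimal sketch of a fit that predicts without the warning. It assumes the data live in one data frame with 122 rows, split 80/42 as in the question. The data frame 'dat', its columns 'x' and 'y', the simulated values, and the seed are all made up for illustration and are not the original poster's data. The point is only that the formula uses plain variable names, so predict() can rebuild 'x' from newdata with whatever number of rows newdata has.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical data: 122 rows, one numeric predictor, one two-level response
dat <- data.frame(x = rnorm(122))
dat$y <- factor(rbinom(122, 1, plogis(0.5 * dat$x)))

train <- dat[1:80, ]    # first partition, 80 rows
test  <- dat[81:122, ]  # second partition, 42 rows

# Plain variable names in the formula; the rows are chosen via 'data'
fit <- glm(y ~ x, family = binomial, data = train)
p <- predict(fit, newdata = test, type = "response")
length(p)  # one prediction per row of 'test'

# Same model, selecting the training rows with the 'subset' argument instead
fit2 <- glm(y ~ x, family = binomial, data = dat, subset = 1:80)
p2 <- predict(fit2, newdata = dat[81:122, ], type = "response")
all.equal(p, p2)  # the two approaches should agree
```

Either form should work; 'subset' just saves creating the separate training data frame.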
HTH, Chuck [rest deleted]