predict (PR#2686)
This is intentional. The coding for factors is based on the full set of levels, and should be comparable for different prediction sets. If you are using factors with fictitious levels the fix is obvious: improve the design.
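As a rough illustration of the point about coding (the factor `f` below is a made-up example, not taken from the report), the design matrix columns follow the declared levels rather than the values that happen to occur:

```r
# Hypothetical factor: 'earwig' is a declared level that never occurs.
f <- factor(c("cat", "dog", "cat"), levels = c("cat", "dog", "earwig"))

# The coding is built from the full set of levels, so the design
# matrix carries a column for the unobserved level as well.
full_cols <- colnames(model.matrix(~ f))

# With unused levels dropped, the coding has one column fewer.
g <- f[, drop = TRUE]
dropped_cols <- colnames(model.matrix(~ g))

print(full_cols)
print(dropped_cols)
```

Two data sets whose factors declare different level sets are therefore coded differently, which is why the level sets are compared at prediction time.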
On Wed, 26 Mar 2003 Mark.Bravington@csiro.au wrote:
# r-bugs@r-project.org `predict' complains about new factor levels, even if the "new" levels are merely levels in the original that didn't occur in the original fit and were sensibly dropped, and that don't occur in the prediction data either. (At least if `drop.unused.levels' was set to TRUE, which is the default.)
Actually, the default is FALSE: see args(model.frame.default). lm and glm call model.frame.default with non-default args.
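One way to check both statements on a given installation is sketched below. It inspects whatever version of R is running, so the result may differ between versions; the variable names are illustrative only.

```r
# Default value of the argument in model.frame.default itself.
default_drop <- formals(stats::model.frame.default)$drop.unused.levels

# lm sets the argument explicitly inside its body; here we only look
# for the argument name in the deparsed source text of lm.
lm_mentions_drop <- any(grepl("drop.unused.levels",
                              deparse(body(stats::lm)),
                              fixed = TRUE))

print(default_drop)
print(lm_mentions_drop)
```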
test> scrunge.data.2 <- data.frame( y=runif( 3), disc=factor( c( 'cat', 'dog',
        'cat'), levels=c( 'cat', 'dog', 'earwig')))
test> lm.predbug.2 <- lm( y~disc, data=scrunge.data.2)
test> predict(lm.predbug.2, newdata=scrunge.data.2)
Error in model.frame.default(object, data, xlev = xlev) :
factor disc has new level(s) earwig
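On a version that shows this error, one possible workaround is to remove the unused level from the data before fitting. This is a sketch and not part of the original report; the names `dat`, `fit` and `pred` are illustrative, and it assumes the unused level is not needed for anything else.

```r
set.seed(1)  # only so the example is reproducible
dat <- data.frame(
  y    = runif(3),
  disc = factor(c("cat", "dog", "cat"), levels = c("cat", "dog", "earwig"))
)

# Re-create the factor so it keeps only the levels that actually occur.
dat$disc <- factor(dat$disc)

# Fit and predict on data whose factor levels now match the fit.
fit  <- lm(y ~ disc, data = dat)
pred <- predict(fit, newdata = dat)
```

Prediction data built separately would need the same treatment, or an explicit `levels=` argument matching the fit.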
A cure for this seems to be to add the commented line below towards the end
of `model.frame.default':
<<...>>
if (length(xlev) > 0) {
    for (nm in names(xlev)) if (!is.null(xl <- xlev[[nm]])) {
        xi <- data[[nm]]
        if (is.null(nxl <- levels(xi)))
            warning(paste("variable", nm, "is not a factor"))
        else {
            xi <- xi[, drop = TRUE]
            nxl <- levels( xi) # MVB: remove droppees
            if (any(m <- is.na(match(nxl, xl))))
                stop(paste("factor", nm, "has new level(s)", nxl[m]))
        }
    }
}
else if (drop.unused.levels) {
<<...>>
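The effect of the added line can be sketched in isolation. The values of `xl` and `xi` below are assumed for illustration, mirroring the example above; this is not the code of `model.frame.default` itself.

```r
# Levels recorded at fit time (assumed: the unused 'earwig' was dropped).
xl <- c("cat", "dog")

# Hypothetical prediction-time factor that still declares 'earwig'.
xi <- factor(c("cat", "dog", "cat"), levels = c("cat", "dog", "earwig"))

# Checking the declared levels: 'earwig' is reported as new.
new_before <- levels(xi)[is.na(match(levels(xi), xl))]

# Dropping unused levels first, then checking only levels that occur.
xi <- xi[, drop = TRUE]
new_after <- levels(xi)[is.na(match(levels(xi), xl))]

print(new_before)
print(new_after)
```

A level that really does occur in the prediction data but not in the fit would survive the drop and still be reported.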
cheers
Mark
*******************************
Mark Bravington
CSIRO (CMIS)
PO Box 1538
Castray Esplanade
Hobart
TAS 7001
phone (61) 3 6232 5118
fax (61) 3 6232 5012
Mark.Bravington@csiro.au
--please do not edit the information below--
Version:
platform = i386-pc-mingw32
arch = i386
os = mingw32
system = i386, mingw32
status =
major = 1
minor = 6.2
year = 2003
month = 01
day = 10
language = R
Windows 2000 Professional (build 2195) Service Pack 3.0
Search Path:
.GlobalEnv, ROOT, package:handy, package:debug, mvb.session.info,
package:mvbutils, package:tcltk, Autoloads, package:base
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595