
Random Forest Reading N/A's, I don't see them

Try randomForest with a small dataset to see how it works:
  > library(randomForest)
  > d <- data.frame(stringsAsFactors=FALSE,
  +                 Num=(1:10)%%9,
  +                 Fac=factor(rep(LETTERS[1:2],each=5)),
  +                 Char=rep(letters[24:26],len=10))
  > randomForest(x=d[,"Char",drop=FALSE], y=d$Num)
  Error in randomForest.default(x = d[, "Char", drop = FALSE], y = d$Num) : 
    NA/NaN/Inf in foreign function call (arg 1)
  In addition: Warning message:
  In data.matrix(x) : NAs introduced by coercion
  > randomForest(x=d[,"Fac",drop=FALSE], y=d$Num)

  Call:
   randomForest(x = d[, "Fac", drop = FALSE], y = d$Num) 
                 Type of random forest: regression
                       Number of trees: 500
  No. of variables tried at each split: 1

            Mean of squared residuals: 9.573558
                      % Var explained: -40.58

randomForest appears to fail if any predictor is a character vector:
unlike most modelling functions, it will not convert character
columns to factors.
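One possible workaround is to convert the character columns to factors yourself before calling randomForest. The following is a minimal sketch using the same toy data as above; the helper variable is_char is my own name, and the final call is guarded so it only runs if the randomForest package is installed.

```r
# Same toy data as in the transcript above.
d <- data.frame(stringsAsFactors = FALSE,
                Num  = (1:10) %% 9,
                Fac  = factor(rep(LETTERS[1:2], each = 5)),
                Char = rep(letters[24:26], length.out = 10))

# Find the character columns (is_char is a hypothetical helper name)
# and replace each of them with a factor.
is_char <- vapply(d, is.character, logical(1))
d[is_char] <- lapply(d[is_char], factor)

str(d)  # Char should now be listed as a factor

# Assumption: randomForest is installed. With Char stored as a factor,
# the x=,y= interface should accept it the same way it accepts Fac.
if (requireNamespace("randomForest", quietly = TRUE)) {
  print(randomForest::randomForest(x = d[, "Char", drop = FALSE], y = d$Num))
}
```

This changes only the storage type of the predictor, not its values, so the levels are the distinct strings in sorted order.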

as.matrix(data.frame) creates a character matrix if not all columns
are numeric or logical, so I suspect you are running into the
no-character-data limitation.  Try leaving off the as.matrix and
passing in the data.frame that it expects:
   randomForest(x=cm3[,-1,drop=FALSE], y=cm3[,1])
(There is no need or use for the data= argument if you use the x=,y=
interface.  It is only there for the formula interface.)
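To see the as.matrix behaviour described above, here is a small base-R sketch. The data frame df and its column names are made up for illustration; they are not the poster's cm3.

```r
# Hypothetical data frame with two numeric columns and one factor column.
df <- data.frame(y = c(1.5, 2.5, 3.5),
                 a = c(10, 20, 30),
                 g = factor(c("u", "v", "u")))

# With a non-numeric column present, as.matrix() falls back to a
# character matrix, so even the numeric columns become strings.
m <- as.matrix(df)
print(typeof(m))

# With only numeric columns the result stays numeric.
m_num <- as.matrix(df[, c("y", "a")])
print(typeof(m_num))
```

Checking typeof() (or str()) on whatever you pass as x= is a quick way to tell whether the predictors have been silently turned into character data.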

If you dislike the no-character-data limitation, discuss it with
the person at the address given by maintainer("randomForest").

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com