How to deal with missing values when using Random Forest
On Feb 25, 2012, at 6:24 PM, kevin123 wrote:
I am using the randomForest package to train and test a model. I aim to predict LengthOfStay.days:
library(randomForest)
model <- randomForest(LengthOfStay.days ~ ., data = training,
                      importance = TRUE,
                      keep.forest = TRUE)

This is a small portion of the data frame:

data(training)
   LengthOfStay.days CharlsonIndex.numeric DSFS.months
1                  0                   0.0         8.5
6                  0                   0.0         3.5
7                  0                   0.0         0.5
8                  0                   0.0         0.5
9                  0                   0.0         1.5
11                 0                   1.5         NaN

Error message:

Error in na.fail.default(list(LengthOfStay.days = c(0, 0, 0, 0, 0, 0, :
  missing values in object
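What part of that error message is unclear? Have you looked at the randomForest help page? It tells you that the default na.action is na.fail, which stops with an error as soon as the data contain any missing values (NaN counts as missing).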
I would greatly appreciate any help
It would seem that the way forward is to remove the cases with missing values or to impute values.
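
A minimal sketch of both routes, in case it helps. The data frame below is a made-up stand-in that only borrows the column names from the post, so the values are not the poster's data. na.omit is base R; na.roughfix and rfImpute come with the randomForest package. Which of these is appropriate depends on why the values are missing, so treat this as an illustration rather than a recommendation.

```r
library(randomForest)

set.seed(1)
# Toy stand-in for `training` (made-up values, column names from the post)
n <- 100
training <- data.frame(
  LengthOfStay.days     = rpois(n, 3),
  CharlsonIndex.numeric = sample(c(0, 1.5, 3.5), n, replace = TRUE),
  DSFS.months           = sample(seq(0.5, 11.5, by = 1), n, replace = TRUE)
)
training$DSFS.months[c(5, 20, 60)] <- NaN  # is.na(NaN) is TRUE, so these count as missing

# See which columns contain missing values, and how many
print(colSums(is.na(training)))

# Route 1: remove the cases with missing values
model_omit <- randomForest(LengthOfStay.days ~ ., data = training,
                           na.action = na.omit,
                           importance = TRUE, keep.forest = TRUE)

# Route 2a: rough imputation (column median for numeric variables,
# most frequent level for factors)
model_fix <- randomForest(LengthOfStay.days ~ ., data = training,
                          na.action = na.roughfix,
                          importance = TRUE, keep.forest = TRUE)

# Route 2b: proximity-based imputation of the predictors;
# the response itself must not contain missing values
training_imp <- rfImpute(LengthOfStay.days ~ ., data = training)
model_imp <- randomForest(LengthOfStay.days ~ ., data = training_imp,
                          importance = TRUE, keep.forest = TRUE)
```

With na.omit the forest is fit only on the complete rows, while the two imputation routes keep every row but fill in the gaps with estimated values.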
David Winsemius, MD
Heritage Laboratories
West Hartford, CT