Help with split data routine and subsequent predict function --caret and klaR pkgs - R-SIG-mixed-models

Tue, Nov 27, 2018 8:57 AM
R=3.5.1
Windows=10
RStudio Version = 1.1.456

Hello, I am following the split-data routine located at:
https://machinelearningmastery.com/how-to-estimate-model-accuracy-in-r-using-the-caret-package/

When I get to the "predictions <- predict(model, x_test)" below I am getting the following error:
#Error in `[[<-.data.frame`(`*tmp*`, i, value = integer(0)) :  replacement has 0 rows, data has 4628

So I checked again for the usual culprit, NAs, but there are none.

I thought maybe "tryCatch()" might help, but it isn't working for me either. LOL!
#Error: unexpected ')' in "tryCatch({predictions <- predict(model, x_test)}, error = function(e)print(e), warning = function(w))"

Thank you for any insight and direction.

WHP

str(r1a1)
# Classes 'data.table' and 'data.frame': 23141 obs. of  8 variables:
# $ SavingsReversed: num  0 0 0 0 0 0 0 0 0 0 ...
# $ productID      : num  3 3 3 3 3 3 3 3 1 1 ...
# $ ProviderID     : num  113676 113676 113964 113964 114278 ...
# $ ModCnt         : num  0 0 0 0 1 1 1 1 1 1 ...
# $ Editnumber2    : num  0 0 1 1 1 1 1 1 1 1 ...
# $ B2             : num  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
# $ B1a            : num  1 1 3 3 1 1 1 1 1 1 ...
# $ PatientGender2 : num  0 0 1 1 1 1 0 0 0 0 ...
# - attr(*, ".internal.selfref")=<externalptr>

tail(r1a1)
   SavingsReversed productID ProviderID ModCnt Editnumber2 B2 B1a PatientGender2
1:            0.00         3    6266065      0           0  9  26              1
2:           32.61         3    6266065      0           0  9  26              0
3:            0.00         1    6266651      0           1  9  26              1
4:            0.00         3    6270643      2           1  7  26              0
5:            0.00         3    6270643      0           1 -1   3              0
6:            0.00         3    6273280      0           0  9  26              0

#reorg r1a1
r1a2 <- r1a1[,c(5,1,2,3,4,6,7,8)]
str(r1a2)
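
A note on the reordering step: r1a2 is created here, but the split that follows is taken from r1a1, so later positional selections such as [,1] and [,2:8] follow r1a1's column order, where column 1 is SavingsReversed and Editnumber2 sits at position 5. A minimal sketch of selecting by name instead, assuming a plain data.frame; the toy data and its values are invented for illustration and are not the poster's data:

```r
# Hypothetical toy data with the same column order as r1a1 (values invented).
toy <- data.frame(
  SavingsReversed = c(0, 32.61, 0, 0),
  productID       = c(3, 3, 1, 3),
  ProviderID      = c(113676, 113964, 114278, 6266065),
  ModCnt          = c(0, 0, 1, 2),
  Editnumber2     = c(0, 1, 1, 0),
  B2              = c(-1, 9, 9, 7),
  B1a             = c(1, 3, 26, 26),
  PatientGender2  = c(0, 1, 1, 0)
)

# Positional selection depends on column order: column 1 here is
# SavingsReversed, not the outcome.
first_col <- names(toy)[1]

# Selecting by name gives the same result whatever the column order.
outcome    <- "Editnumber2"
predictors <- setdiff(names(toy), outcome)
x <- toy[, predictors, drop = FALSE]   # data.frame of the 7 predictors
y <- toy[[outcome]]                    # plain vector holding the outcome
```

With a data.table rather than a data.frame, the equivalent column selection would be r1a1[, ..predictors] or r1a1[, predictors, with = FALSE], and [[outcome]] returns a vector where [, 1] returns a one-column table.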
#Data Split
# define an 80%/20% train/test split of the dataset
split=0.80
trainIndex <- createDataPartition(r1a1$Editnumber2, p=split, list=FALSE)
str(trainIndex) # abbreviated here
#int [1:18513, 1] 2 3 5 7 8 9 10 11 12 14 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr "Resample1"
data_train <- r1a1[ trainIndex,]
str(data_train) #abbreviated here
#Classes 'data.table' and 'data.frame':18513 obs. of  8 variables:
data_test <-  r1a1[-trainIndex,]
str(data_test)# abbreviated here
#Classes 'data.table' and 'data.frame':4628 obs. of  8 variables:

# train a naive bayes model
# install.packages("klaR")
# library(klaR)
model <- naiveBayes(Editnumber2~., data=data_train)
# make predictions
x_test <- data_test[,2:8]
y_test <- data_test[,1]
predictions <- predict(model, x_test)
#Error in `[[<-.data.frame`(`*tmp*`, i, value = integer(0)) :  replacement has 0 rows, data has 4628
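
Some observations on the steps above, offered as possibilities and not as a confirmed diagnosis of the error: naiveBayes() (lower-case n) is exported by e1071, while klaR's function is NaiveBayes(); the outcome Editnumber2 is numeric, whereas these classifiers expect a factor; and because the split was taken from r1a1, x_test (columns 2:8) still contains Editnumber2 while y_test (column 1) is SavingsReversed. The sketch below shows one consistent version of the workflow under those assumptions. The data are randomly generated stand-ins, the split is a simple base-R random split and not createDataPartition, and the model step runs only if e1071 is installed:

```r
set.seed(1)
n <- 200

# Hypothetical stand-in for r1a1 (invented values, plain data.frame).
toy <- data.frame(
  SavingsReversed = round(runif(n, 0, 50), 2),
  ModCnt          = sample(0:2, n, replace = TRUE),
  Editnumber2     = sample(0:1, n, replace = TRUE),
  B1a             = sample(c(1, 3, 26), n, replace = TRUE)
)

# Treat the outcome as categorical before fitting a classifier.
toy$Editnumber2 <- factor(toy$Editnumber2)

# Simple random 80/20 split (not stratified like createDataPartition).
train_rows <- sample(n, size = round(0.8 * n))
data_train <- toy[train_rows, ]
data_test  <- toy[-train_rows, ]

# Separate predictors and outcome by name, from the same table the
# model was trained on.
x_test <- data_test[, setdiff(names(data_test), "Editnumber2"), drop = FALSE]
y_test <- data_test$Editnumber2

if (requireNamespace("e1071", quietly = TRUE)) {
  model <- e1071::naiveBayes(Editnumber2 ~ ., data = data_train)
  predictions <- predict(model, x_test)   # factor of predicted classes
  print(table(predicted = predictions, actual = y_test))
}
```

If the original error persists after these changes, converting the data.table to a plain data.frame with as.data.frame() before modelling would be another thing to try.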
row.has.na <- apply(r1a1, 1, function(x){any(is.na(x))})
sum(row.has.na) #48
View(row.has.na)
#[1] 0
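
The listing above shows two different counts next to the NA check (48 and 0), so it may be worth re-running it. A minimal sketch of vectorized checks that avoid apply() over rows, which first coerces the whole table to a matrix; the toy data frame is invented for illustration:

```r
# Hypothetical toy data with a few missing values.
toy <- data.frame(a = c(1, NA, 3), b = c(0, 0, NA), c = c(1, 1, 1))

any_missing  <- anyNA(toy)                  # TRUE if any cell is NA
rows_with_na <- sum(!complete.cases(toy))   # rows with at least one NA
na_per_col   <- colSums(is.na(toy))         # NA count per column

# is.na() does not flag Inf or -Inf, so non-finite values are counted
# separately here.
nonfinite_per_col <- sapply(toy, function(col) sum(!is.finite(col)))
```

On the real data, the same calls would be applied to r1a1, data_train and x_test in turn.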

??tryCatch
tryCatch({predictions <- predict(model, x_test)}, error = function(e)print(e), warning = function(w))
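
The "unexpected ')'" message comes from warning = function(w)), where the handler function has no body, so the call fails to parse before predict() is ever reached. A minimal sketch of a well-formed call follows; risky_call() is a hypothetical stand-in for predict(model, x_test) and simply raises an error:

```r
# Hypothetical stand-in for predict(model, x_test) that always fails.
risky_call <- function() stop("replacement has 0 rows, data has 4628")

predictions <- tryCatch(
  risky_call(),
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    NULL   # value returned by tryCatch() when a warning is caught
  },
  error = function(e) {
    message("Error caught: ", conditionMessage(e))
    NULL   # value returned by tryCatch() when an error is caught
  }
)
```

Note that tryCatch() only intercepts the condition and does not remove its cause. Calling traceback() immediately after the uncaught error shows which internal call raised it.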

#NOT RUN
# summarize results
confusionMatrix(Editnumber2, y_test)
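
For the summary step, caret's confusionMatrix(data, reference) takes the predicted classes first and the observed classes second, both as factors with the same levels, so the first argument would be the predictions and not the column name Editnumber2. A base-R sketch of the same comparison using table(); the labels below are invented for illustration:

```r
# Hypothetical predicted and observed labels (invented values).
lv <- c("0", "1")
predictions <- factor(c("0", "1", "1", "0", "1", "0"), levels = lv)
y_test      <- factor(c("0", "1", "0", "0", "1", "1"), levels = lv)

# Cross-tabulate predicted against observed classes.
cm <- table(predicted = predictions, actual = y_test)

# Overall accuracy: correctly classified cases over all cases.
accuracy <- sum(diag(cm)) / sum(cm)
```

With a data.table, data_test[,1] returns a one-column table and not a vector, so the reference would need to be extracted with data_test$Editnumber2 or data_test[["Editnumber2"]] and converted with factor().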




