all the MAE metric values are missing (Error message)

7 messages · Neha gupta, David Winsemius, Jim Lemon +1 more

#
I am using the following code to tune the 4 parameters of the Gradient
Boosting algorithm using simulated annealing (optim). When I run the program,
it stops after a few seconds and displays the error shown below.

Note that the same code works for RF (mtry parameter) and SVM (cost and sigma
parameters), so I guess the problem lies in the 4 parameters of GBM.

Something is wrong; all the MAE metric values are missing:
      RMSE        Rsquared        MAE
 Min.   : NA   Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA   Max.   : NA
 NA's   :1     NA's   :1     NA's   :1

The code is below. If you need the dataset, I can attach it to the email.

library(caret)   # train(), trainControl(), createDataPartition(), createFolds()
library(farff)   # readARFF(); assumed to be the package this function comes from

d <- readARFF("dat.arff")   # data is regression-based

index <- createDataPartition(log10(d$Price), p = .70, list = FALSE)
tr <- d[index, ]
ts <- d[-index, ]

index_2 <- createFolds(log10(tr$Price), returnTrain = TRUE, list = TRUE)
ctrl <- trainControl(method = "cv", index = index_2)

obj <- function(param, maximize = FALSE) {
  mod <- train(log10(Price) ~ ., data = tr,
               method = "gbm",
               preProc = c("center", "scale", "zv"),
               metric = "MAE",
               trControl = ctrl,
               # Here in tuneGrid, when I use the parameters for SVM and RF it
               # works, but for GBM it does not work
               tuneGrid = data.frame(n.trees = 10^(param[1]),
                                     interaction.depth = 10^(param[2]),
                                     shrinkage = 10^(param[3]),
                                     n.minobsinnode = 10^(param[4])))

  if (maximize)
    -getTrainPerf(mod)[, "TrainMAE"] else
      getTrainPerf(mod)[, "TrainMAE"]
}
num_mods <- 50

## Simulated annealing from base R

# I just used some initial points for the 4 parameters of GBM here

san_res <- optim(par = c(10, 1, 0.1, 1), fn = obj, method = "SANN",
                 control = list(maxit = num_mods))
san_res
#
You need to read the Posting Guide (and study the options of the gmail 
interface). This is a plain text mailing list and the server does not 
accept attachments that are anything other than .txt or .pdf files.
#
Hi Neha,
The error message looks suspicious, as it refers to "all the MAEs"
while there is only one NA value in the summary. I would carefully
check the object that you are passing to san_res.
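
One way to make that check concrete is to evaluate the tuneGrid at the
starting point given to optim. This is only a sketch in base R: par0 and
grid0 are names made up here, and nothing in this thread confirms that these
values are what triggers the error.

```r
# Starting values copied from the optim() call in the original post
par0 <- c(10, 1, 0.1, 1)

# Same transformation as in obj(): every parameter becomes 10^param
grid0 <- data.frame(n.trees           = 10^par0[1],
                    interaction.depth = 10^par0[2],
                    shrinkage         = 10^par0[3],
                    n.minobsinnode    = 10^par0[4])
print(grid0)

# With these starting values n.trees evaluates to 1e10 and shrinkage is above 1
grid0$n.trees
grid0$shrinkage > 1
```

Since SANN also proposes non-integer candidate points, it may be worth
printing the grid inside obj() to see which values gbm actually receives.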

Jim
On Mon, Dec 23, 2019 at 4:17 AM Neha gupta <neha.bologna90 at gmail.com> wrote:
#
Hi Jim

The objective function is passed to san_res where we have defined the 4
parameters of gbm and the values are initialized in san_res.

The output variable Price has only three values: 0, 1 and 2 (like categorical
values), so someone told me to try removing the log10 from Price.

I am not sure what to do. I have spent two days on this issue without fixing it.
On Sun, Dec 22, 2019 at 11:22 PM Jim Lemon <drjimlemon at gmail.com> wrote:
#
Hi Neha,
Well, that's a clue to why you are getting NAs:

log10(0)
[1] -Inf

Another possibility is that the values used in the initial calculation
have been read in as factors.
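
Both possibilities can be checked in a few lines. The sketch below uses a
made-up vector holding the three values mentioned above (0, 1, 2) rather than
the real data; price, y and n_bad are illustrative names only.

```r
# Hypothetical response containing the three values mentioned in the thread
price <- c(0, 1, 2, 1, 0, 2)

y <- log10(price)
y                              # the zeros become -Inf

n_bad <- sum(!is.finite(y))    # count of non-finite targets
n_bad

# On the real data the equivalent checks would be
# (d is the data frame from the original post):
# sum(!is.finite(log10(d$Price)))
# class(d$Price)
```

If the count is above zero, the model is being asked to fit a response that
contains infinite values.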

Jim
On Mon, Dec 23, 2019 at 10:55 AM Neha gupta <neha.bologna90 at gmail.com> wrote:
#
Hi Jim,

> Another possibility is that the values used in the initial calculation
> have been read in as factors

Which calculation are you talking about? I did not use factors as variables.

Regards
On Mon, Dec 23, 2019 at 3:12 AM Jim Lemon <drjimlemon at gmail.com> wrote:
#
What Jim is alluding to is that sometimes in the process of reading in 
data a small typo can mean that what was intended to be a numeric 
variable is read in as a factor. So he was suggesting that you double 
check that this has not happened to you.
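
A small illustration of how that can happen, and how to spot it. This is a
sketch with made-up inline data read through read.csv, not the ARFF file from
the original post; the column names and the typo "2o" are invented for the
example, and stringsAsFactors = TRUE is set explicitly because recent R
versions no longer convert strings to factors by default.

```r
# Made-up data: the second Size value contains a typo ("2o" instead of "20")
raw <- "Price,Size\n0,10\n1,2o\n2,30\n"
d_demo <- read.csv(text = raw, stringsAsFactors = TRUE)

# One bad entry is enough to turn the whole column into a factor
col_classes <- sapply(d_demo, class)
col_classes

# Converting via character exposes the offending entry as NA
size_num <- suppressWarnings(as.numeric(as.character(d_demo$Size)))
which(is.na(size_num))         # row(s) holding the typo
```

Running sapply(d, class) or str(d) on the real data frame would show in the
same way whether any column that should be numeric is not.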

Michael
On 23/12/2019 11:45, Neha gupta wrote: