I am using the following code to tune the four parameters of the Gradient
Boosting (GBM) algorithm with simulated annealing (optim). A few seconds
after I start the program, it stops and displays the error shown below.
I point out here that the same code works for RF (mtry parameter) and SVM
(cost and sigma parameters), so I guess the problem is in the four
parameters of GBM.

Something is wrong; all the MAE metric values are missing:
      RMSE        Rsquared        MAE
 Min.   : NA   Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA   Max.   : NA
 NA's   :1     NA's   :1     NA's   :1

The code is below. If you need the dataset, I can attach it to the email.

library(caret)   # createDataPartition(), createFolds(), trainControl(), train()
library(farff)   # readARFF()

d <- readARFF("dat.arff")  # the data are for a regression problem

index <- createDataPartition(log10(d$Price), p = .70, list = FALSE)
tr <- d[index, ]
ts <- d[-index, ]

index_2 <- createFolds(log10(tr$Price), returnTrain = TRUE, list = TRUE)
ctrl <- trainControl(method = "cv", index = index_2)

obj <- function(param, maximize = FALSE) {
  mod <- train(log10(Price) ~ ., data = tr,
               method = "gbm",
               preProc = c("center", "scale", "zv"),
               metric = "MAE",
               trControl = ctrl,
               # With the SVM or RF parameters in tuneGrid this works,
               # but with the GBM parameters it does not work.
               tuneGrid = data.frame(n.trees = 10^(param[1]),
                                     interaction.depth = 10^(param[2]),
                                     shrinkage = 10^(param[3]),
                                     n.minobsinnode = 10^(param[4])))
  if (maximize)
    -getTrainPerf(mod)[, "TrainMAE"] else
    getTrainPerf(mod)[, "TrainMAE"]
}

num_mods <- 50

## Simulated annealing from base R
# I just used some initial values for the 4 GBM parameters here
san_res <- optim(par = c(10, 1, 0.1, 1), fn = obj, method = "SANN",
                 control = list(maxit = num_mods))
san_res
all the MAE metric values are missing (Error message)
7 messages · Neha gupta, David Winsemius, Jim Lemon +1 more
You need to read the Posting Guide (and study the options of the gmail interface). This is a plain text mailing list and the server does not accept attachments that are anything other than .txt or .pdf files.
David.
On 12/22/19 9:15 AM, Neha gupta wrote:
> I am using the following code to tune the 4 parameters of Gradient Boosting
> algorithm using Simulated annealing (optim). When I run the program, after
> few seconds it stops and displays the following error:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi Neha,
The error message looks suspicious, as it refers to "all the MAEs" while there is only one NA value in the summary. I would carefully check the object that you are passing to san_res.

Jim
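
One way to follow this suggestion is to evaluate, outside of train(), the tuning grid that obj() builds from the starting values given to optim(). The sketch below only repeats the arithmetic from the original post in base R and does not fit any model. Reading the resulting values as implausible for gbm is an interpretation, not something confirmed in this thread.

```r
# Starting values passed to optim() in the original post
param <- c(10, 1, 0.1, 1)

# Same transformation that obj() applies before calling train()
grid <- data.frame(n.trees           = 10^(param[1]),
                   interaction.depth = 10^(param[2]),
                   shrinkage         = 10^(param[3]),
                   n.minobsinnode    = 10^(param[4]))

# Inspect what train() would actually receive as tuneGrid
print(grid)

# Flag values that look suspicious for a boosted tree model
# (the thresholds here are illustrative assumptions, not gbm rules)
suspicious <- c(n.trees   = grid$n.trees > 1e5,
                shrinkage = grid$shrinkage > 1)
print(suspicious)
```

At the starting point the grid asks for 10^10 trees and a shrinkage of 10^0.1, which is above 1, so it may be worth checking whether the missing MAE values come from the grid itself rather than from the data.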
Hi Jim,
The objective function is passed to san_res, where we have defined the 4 parameters of gbm, and the values are initialized in san_res. The output variable Price has only three values: 0, 1, 2 (like categorical values), so someone told me to try removing the log10 from the price. I am not sure what to do; I have spent two days on this but have not fixed the issue.
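
To make the description above concrete, here is a minimal sketch of how optim() with method = "SANN" calls an objective function. The function toy_obj and its target vector are hypothetical stand-ins for the caret-based obj(); only the call to optim() mirrors the original post. The sketch records every parameter vector that the optimizer proposes, which shows the kind of values a tuneGrid built from them would have to cope with.

```r
set.seed(1)  # SANN is stochastic; fix the seed for repeatability

seen <- list()  # every parameter vector the optimizer evaluates

# Hypothetical objective: squared distance to an arbitrary target.
# It stands in for the cross-validated MAE returned by obj().
toy_obj <- function(param) {
  seen[[length(seen) + 1]] <<- param
  sum((param - c(2, 0.5, -1, 1))^2)
}

res <- optim(par = c(10, 1, 0.1, 1), fn = toy_obj, method = "SANN",
             control = list(maxit = 50))

# One row per evaluation, one column per parameter
proposals <- do.call(rbind, seen)
print(head(proposals))
print(res$par)
```

The proposals are arbitrary real numbers, so an objective that feeds them into integer-valued settings such as n.trees or interaction.depth would probably need to round or bound them first.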
Hi Neha,
Well, that's a clue to why you are getting NAs:

log10(0)
[1] -Inf

Another possibility is that the values used in the initial calculation have been read in as factors.

Jim
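
The point about log10(0) can be checked on the three Price values mentioned earlier in the thread. This is a small base R sketch; the vector price is made up for illustration and is not the poster's data.

```r
# Illustrative response with the three values reported in the thread
price <- c(0, 1, 2)

log_price <- log10(price)
print(log_price)

# Which transformed values are unusable as a regression target?
bad <- !is.finite(log_price)
print(bad)

# An absolute-error summary that includes the -Inf value is not finite
pred <- rep(0.1, length(price))       # arbitrary constant prediction
mae  <- mean(abs(log_price - pred))
print(mae)

# Count of rows that would need attention before taking log10()
n_nonpositive <- sum(price <= 0)
print(n_nonpositive)
```

Any row with Price equal to 0 makes the transformed target infinite, so a metric averaged over such rows cannot be finite either.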
Hi Jim,

> Another possibility is that the values used in the initial calculation have been read in as factors

Which calculation are you talking about? I did not use factors as variables.

Regards
What Jim is alluding to is that sometimes, in the process of reading in data, a small typo can mean that what was intended to be a numeric variable is read in as a factor. So he was suggesting that you double-check that this has not happened to you.

Michael
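
A check along these lines could look like the following base R sketch. The data frame d here is a made-up example with a Price column stored as a factor; it is not the poster's dataset.

```r
# Made-up data frame in which Price was read in as a factor
d <- data.frame(Price = factor(c("0", "1", "2", "1")))

# Check the storage type of each column
print(sapply(d, class))
print(is.factor(d$Price))

# as.numeric() on a factor returns the internal level codes,
# not the original values
codes <- as.numeric(d$Price)
print(codes)

# Converting through character recovers the original numbers
price_num <- as.numeric(as.character(d$Price))
print(price_num)
```

If a column that should be numeric shows up as a factor, converting it through as.character() first avoids silently replacing the values with level codes.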