Getting error message, "LOOCV is not compatible with `resamples()` since only one resampling estimate is available." - R-help

Tue, Mar 3, 2020 3:29 AM #

Hi, I am using different validation methods for random search and grid
search. The validation methods are 10-fold CV, bootstrap and LOOCV. For
LOOCV, I get an error message when I draw boxplots for all the results.

The error is: LOOCV is not compatible with `resamples()` since only one
resampling estimate is available.

The code is below.

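library(caret)  # createDataPartition(), trainControl(), train(), resamples()
library(farff)  # readARFF()

d <- readARFF("china.arff")
index <- createDataPartition(d$Effort, p = .70, list = FALSE)
tr <- d[index, ]
ts <- d[-index, ]
# 10 sets of training-row indices, reused by the CV and bootstrap controls below
index_2 <- createFolds(tr$Effort, returnTrain = TRUE, list = TRUE)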

# Because `index` is supplied, caret uses the training sets in index_2 as the
# resamples for these four controls.
ct_rand <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        index = index_2, search = "random")
ct_grid <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        index = index_2, search = "grid")

ct_boot1 <- trainControl(method = "boot", number = 100, index = index_2,
                         search = "random")
ct_boot2 <- trainControl(method = "boot", number = 100, index = index_2,
                         search = "grid")

ct_locv <- trainControl(method = "LOOCV", search = "random")
ct_locv2 <- trainControl(method = "LOOCV", search = "grid")

set.seed(30218)
ran_CV <- train(Effort ~ ., data = tr,
                    method = "pls",
                    tuneLength = 15,
                    metric = "MAE",
                    preProc = c("center", "scale", "zv"),
                    trControl = ct_rand)
getTrainPerf(ran_CV)
rn <- predict(ran_CV, newdata = ts)

## ## ## ## ##grid search CV

set.seed(30218)
grid_CV <- train(Effort ~ ., data = tr,
                     method = "pls",
                     metric = "MAE",
                     preProc = c("center", "scale", "zv"),
                     trControl = ct_grid)

getTrainPerf(grid_CV)

set.seed(30218)
ran_boot <- train(Effort ~ ., data = tr,
                method = "pls",
                tuneLength = 15,
                metric = "MAE",
                preProc = c("center", "scale", "zv"),
                trControl = ct_boot1)
getTrainPerf(ran_boot)
rn <- predict(ran_boot, newdata = ts)
##MAE(rn, ts$Effort)


## ## ## ## ##grid search boot

set.seed(30218)
grid_boot <- train(Effort ~ ., data = tr,
                 method = "pls",
                 metric = "MAE",
                 preProc = c("center", "scale", "zv"),
                 trControl = ct_boot2)

getTrainPerf(grid_boot)


set.seed(30218)
ran_locv <- train(Effort ~ ., data = tr,
                method = "pls",
                tuneLength = 15,
                metric = "MAE",
                preProc = c("center", "scale", "zv"),
                trControl = ct_locv)
getTrainPerf(ran_locv)
rn <- predict(ran_locv, newdata = ts)
##MAE(rn, ts$Effort)


## ## ## ## ##grid search LOOCV

set.seed(30218)
grid_locv <- train(Effort ~ ., data = tr,
                 method = "pls",
                 metric = "MAE",
                 preProc = c("center", "scale", "zv"),
                 trControl = ct_locv2)

getTrainPerf(grid_locv)


rValues <- resamples(list(Random_Search_CV = ran_CV,
                          Grid_Search_CV = grid_CV,
                          Random_Search_Boot = ran_boot,
                          Grid_Search_Boot = grid_boot,
                          Random_Search_LOOCV = ran_locv,
                          Grid_Search_LOOCV = grid_locv))

bwplot(rValues,metric="MAE", scales=list(cex=1), col="Green")

Bert Gunter

Tue, Mar 3, 2020 8:07 AM #

2 1/2 suggestions:

1. Provide a small reproducible example with **minimal code**. It can be
difficult to sort through dozens of lines of code, and I, anyway, would be
unwilling to spend time trying to debug/isolate the problem when you have
apparently not made much of an effort to do so yourself. Others may well be
both more knowledgeable and more tolerant, of course.

2. If, **after a suitable wait**, you have not received useful answers,
contact the package maintainer of the package you used **which you have
again failed to identify** (the caret package?). Also check to see whether
the package has its own user support structure. Some do, and this should be
the first point of contact anyway if so.

2 1/2. Post in **plain text** not html, though I don't think it mattered
here.
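
A minimal example of the kind asked for in point 1 might look like the sketch below. It rests on assumptions that are not in the original post: the built-in `mtcars` data stands in for `china.arff`, and `method = "lm"` is swapped in for `"pls"` so that only the caret package is needed. The names `fit_cv`, `fit_loo` and `res` are illustrative.

```r
library(caret)

set.seed(30218)

# One model resampled with 10-fold CV, one with leave-one-out CV.
fit_cv <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                trControl = trainControl(method = "cv", number = 10))
fit_loo <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                 trControl = trainControl(method = "LOOCV"))

# Each model on its own reports its performance.
getTrainPerf(fit_cv)
getTrainPerf(fit_loo)

# Combining them is the step that fails; try() captures the error
# instead of stopping the script.
res <- try(resamples(list(CV = fit_cv, LOOCV = fit_loo)), silent = TRUE)
inherits(res, "try-error")
```

Two short models and one `resamples()` call are enough to show that the failure depends on the resampling method rather than on the data set or the tuning search.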


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

javed khan

Tue, Mar 3, 2020 12:27 PM #

The data is as follows. I have included only the code for 10-fold CV and LOOCV.

library(caret)

d <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15), Language = c(1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1,
1, 1, 3), Hardware = c(1, 2, 3, 1, 2, 4, 4, 2, 1, 1, 1, 5, 6,
1, 1), Duration = c(17, 7, 15, 18, 13, 5, 5, 11, 14, 5, 13, 31,
20, 26, 14), KSLOC = c(253.6, 40.5, 450, 214.4, 449.9, 50, 43,
200, 289, 39, 254.2, 128.6, 161.4, 164.8, 60.2), AdjFP = c(1217.1,
507.3, 2306.8, 788.5, 1337.6, 421.3, 99.9, 993, 1592.9, 240,
1611, 789, 690.9, 1347.5, 1044.3), RAWFP = c(1010, 457, 2284,
881, 1583, 411, 97, 998, 1554, 250, 1603, 724, 705, 1375, 976
), EffortMM = c(287, 82.5, 1107.31, 86.9, 336.3, 84, 23.2, 130.3,
116, 72, 258.7, 230.7, 157, 246.9, 69.9)), class = "data.frame",
row.names = c(NA, -15L))

# The posted data names the response EffortMM, while the code below uses
# Effort; renaming it here is an assumption so that the formula resolves.
names(d)[names(d) == "EffortMM"] <- "Effort"

index <- createDataPartition(d$Effort, p = .70, list = FALSE)
tr <- d[index, ]
ts <- d[-index, ]
index_2 <- createFolds(tr$Effort, returnTrain = TRUE, list = TRUE)

ct_rand <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        index = index_2, search = "random")
ct_grid <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        index = index_2, search = "grid")

ct_locv <- trainControl(method = "LOOCV", search = "random")
ct_locv2 <- trainControl(method = "LOOCV", search = "grid")

## ## ## ## ##Random Search for 10 fold CV

set.seed(30218)
ran_CV <- train(Effort ~ ., data = tr,
                    method = "pls",
                    tuneLength = 15,
                    metric = "MAE",
                    preProc = c("center", "scale", "zv"),
                    trControl = ct_rand)
getTrainPerf(ran_CV)
rn <- predict(ran_CV, newdata = ts)
##MAE(rn, ts$Effort)


## ## ## ## ##grid search  for 10 fold CV

set.seed(30218)
grid_CV <- train(Effort ~ ., data = tr,
                     method = "pls",
                     metric = "MAE",
                     preProc = c("center", "scale", "zv"),
                     trControl = ct_grid)

getTrainPerf(grid_CV)

  ## ## ## ## ##Random Search for LOOCV

set.seed(30218)
ran_locv <- train(Effort ~ ., data = tr,
                method = "pls",
                tuneLength = 15,
                metric = "MAE",
                preProc = c("center", "scale", "zv"),
                trControl = ct_locv)
getTrainPerf(ran_locv)
rn <- predict(ran_locv, newdata = ts)
##MAE(rn, ts$Effort)


 ## ## ## ## ##Grid Search for LOOCV

set.seed(30218)
grid_locv <- train(Effort ~ ., data = tr,
                 method = "pls",
                 metric = "MAE",
                 preProc = c("center", "scale", "zv"),
                 trControl = ct_locv2)

getTrainPerf(grid_locv)

rValues <- resamples(list(Random_Search_CV = ran_CV,
                          Grid_Search_CV = grid_CV,
                          Random_Search_LOOCV = ran_locv,
                          Grid_Search_LOOCV = grid_locv))

bwplot(rValues,metric="MAE", scales=list(cex=1), col="Green")
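
One possible way around the error, sketched here under assumptions rather than taken from the thread, is to compare the models through `getTrainPerf()`, which returns a single row of performance values per model whether or not per-resample results exist. The built-in `mtcars` data and `method = "lm"` are stand-ins for the posted data and the PLS model, and `models` and `perf` are illustrative names.

```r
library(caret)

set.seed(30218)

fit_cv <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                trControl = trainControl(method = "cv", number = 10))
fit_loo <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                 trControl = trainControl(method = "LOOCV"))

models <- list(CV = fit_cv, LOOCV = fit_loo)

# Stack the one-row summaries into a single comparison table.
perf <- do.call(rbind, lapply(models, getTrainPerf))
perf$model <- names(models)
perf[, c("model", "TrainRMSE", "TrainRsquared", "TrainMAE")]

# A dot chart fits better than a boxplot when each model has one value.
dotchart(perf$TrainMAE, labels = perf$model, xlab = "MAE")
```

The models that do keep per-resample results can still go through `resamples()` and `bwplot()` on their own; the table only adds the LOOCV fits to the comparison as single points.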


Bert Gunter

Tue, Mar 3, 2020 12:57 PM #

... and you **still** have not told us what package(s)...

Bert Gunter



javed khan

Tue, Mar 3, 2020 1:57 PM #

I am sorry for that. I am just using the caret package.

Thanks


javed khan

Wed, Mar 4, 2020 2:35 AM #

Following up on my question: LOOCV itself runs and gives me results
(RMSE/MAE values), but when I pass the LOOCV models to
rValues <- resamples(list(...)), it gives the error. All the other methods
work: boot, k-fold CV and even LGOCV (Leave Group Out Cross Validation).

Regards
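
The behaviour described above is consistent with the wording of the error message: leave-one-out holds out a single row at a time, so the held-out predictions are pooled into one MAE/RMSE value, whereas k-fold CV and the bootstrap give one value per resample. The sketch below illustrates that difference by hand in base R. It is not caret's internal code; `mtcars` and `lm()` are stand-ins for the poster's data and PLS model, and all object names are illustrative.

```r
set.seed(30218)
dat <- mtcars
n <- nrow(dat)

# LOOCV: fit n models, each predicting the single row that was left out.
loo_pred <- vapply(seq_len(n), function(i) {
  fit <- lm(mpg ~ wt + hp, data = dat[-i, ])
  unname(predict(fit, newdata = dat[i, , drop = FALSE]))
}, numeric(1))

# Pooling the n held-out predictions yields exactly one MAE value.
loo_mae <- mean(abs(dat$mpg - loo_pred))

# 10-fold CV: one MAE per fold, so there is a distribution to plot.
fold_id <- sample(rep(1:10, length.out = n))
fold_mae <- vapply(1:10, function(k) {
  test <- fold_id == k
  fit <- lm(mpg ~ wt + hp, data = dat[!test, ])
  mean(abs(dat$mpg[test] - predict(fit, newdata = dat[test, ])))
}, numeric(1))

length(loo_mae)   # a single estimate
length(fold_mae)  # one estimate per fold
```

A boxplot needs a distribution of values, which the ten fold-level errors provide and the single pooled leave-one-out error does not.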

On Tue, Mar 3, 2020 at 10:57 PM javed khan <javedbtk111 at gmail.com> wrote:

I am sorry for that... I am just using caret package.

Thanks

On Tue, Mar 3, 2020 at 9:57 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:

... and you **still** have not told us what package(s)...

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Mar 3, 2020 at 12:28 PM javed khan <javedbtk111 at gmail.com> wrote:

The data are as follows. I have included the code for 10-fold CV and LOOCV.

library(caret)

d <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15), Language = c(1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1,
1, 1, 3), Hardware = c(1, 2, 3, 1, 2, 4, 4, 2, 1, 1, 1, 5, 6,
1, 1), Duration = c(17, 7, 15, 18, 13, 5, 5, 11, 14, 5, 13, 31,
20, 26, 14), KSLOC = c(253.6, 40.5, 450, 214.4, 449.9, 50, 43,
200, 289, 39, 254.2, 128.6, 161.4, 164.8, 60.2), AdjFP = c(1217.1,
507.3, 2306.8, 788.5, 1337.6, 421.3, 99.9, 993, 1592.9, 240,
1611, 789, 690.9, 1347.5, 1044.3), RAWFP = c(1010, 457, 2284,
881, 1583, 411, 97, 998, 1554, 250, 1603, 724, 705, 1375, 976
), EffortMM = c(287, 82.5, 1107.31, 86.9, 336.3, 84, 23.2, 130.3,
116, 72, 258.7, 230.7, 157, 246.9, 69.9)), class = "data.frame",
row.names = c(NA,
-15L))

## The response column in this sample is named EffortMM; rename it to
## Effort so that it matches the code below.
names(d)[names(d) == "EffortMM"] <- "Effort"


index <- createDataPartition(d$Effort, p = .70,list = FALSE)
tr <- d[index, ]
ts <- d[-index, ]
index_2 <- createFolds(tr$Effort, returnTrain = TRUE, list = TRUE)

ct_rand <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        index = index_2, search = "random")
ct_grid <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        index = index_2, search = "grid")

ct_locv <- trainControl(method = "LOOCV",  search="random")
ct_locv2 <- trainControl(method = "LOOCV",   search="grid")

 ## ## ## ## ##Random Search for 10 fold CV

set.seed(30218)
ran_CV <- train(Effort ~ ., data = tr,
                    method = "pls",
                    tuneLength = 15,
                    metric = "MAE",
                    preProc = c("center", "scale", "zv"),
                    trControl = ct_rand)
getTrainPerf(ran_CV)
rn <- predict(ran_CV, newdata = ts)
##MAE(rn, ts$Effort)


## ## ## ## ##grid search  for 10 fold CV

set.seed(30218)
grid_CV <- train(Effort ~ ., data = tr,
                     method = "pls",
                     metric = "MAE",
                     preProc = c("center", "scale", "zv"),
                     trControl = ct_grid)

getTrainPerf(grid_CV)

  ## ## ## ## ##Random Search for LOOCV

set.seed(30218)
ran_locv <- train(Effort ~ ., data = tr,
                method = "pls",
                tuneLength = 15,
                metric = "MAE",
                preProc = c("center", "scale", "zv"),
                trControl = ct_locv)
getTrainPerf(ran_locv)
rn <- predict(ran_locv, newdata = ts)
##MAE(rn, ts$Effort)


 ## ## ## ## ##Grid Search for LOOCV

set.seed(30218)
grid_locv <- train(Effort ~ ., data = tr,
                 method = "pls",
                 metric = "MAE",
                 preProc = c("center", "scale", "zv"),
                 trControl = ct_locv2)

getTrainPerf(grid_locv)

rValues <- resamples(list(Random_Search_CV = ran_CV,
                          Grid_Search_CV = grid_CV,
                          Random_Search_LOOCV = ran_locv,
                          Grid_Search_LOOCV = grid_locv))

bwplot(rValues, metric = "MAE", scales = list(cex = 1), col = "Green")
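
One possible way around the failing resamples() call above, offered only as a sketch: keep the LOOCV fits out of resamples() and report their single estimates separately. The helper split_by_loocv() below is hypothetical (it is not part of caret); it assumes only that each fitted object carries its trainControl() settings in $control$method, which is where train objects keep them as far as I know.

```r
# Hypothetical helper (not part of caret): split a named list of fitted
# models by whether they were tuned with LOOCV.
# ASSUMPTION: each model stores its resampling method in `$control$method`.
split_by_loocv <- function(models) {
  is_loocv <- vapply(
    models,
    function(m) identical(m$control$method, "LOOCV"),
    logical(1)
  )
  list(resampled = models[!is_loocv], loocv = models[is_loocv])
}

# Intended use with the objects from this thread (needs caret and the
# fitted models, so it is left commented out here):
# parts <- split_by_loocv(list(Random_Search_CV = ran_CV,
#                              Grid_Search_CV = grid_CV,
#                              Random_Search_LOOCV = ran_locv,
#                              Grid_Search_LOOCV = grid_locv))
# rValues <- resamples(parts$resampled)   # only models with per-resample results
# bwplot(rValues, metric = "MAE")
# lapply(parts$loocv, getTrainPerf)       # single LOOCV estimates, shown separately
```

The LOOCV numbers can then go into a table or be marked as single points next to the boxplots, which keeps it visible that they are one estimate each and not a distribution.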




On Tue, Mar 3, 2020 at 5:07 PM Bert Gunter <bgunter.4567 at gmail.com>
wrote:

2 1/2 suggestions:

1. Provide a small reproducible example with **minimal code**. It can
be difficult to sort through dozens of lines of code, and I, anyway, would
be unwilling to spend time trying to debug/isolate the problem when you
have apparently not made much of an effort to do so yourself. Others may
well be both more knowledgeable and more tolerant, of course.

2. If, **after a suitable wait**, you have not received useful answers,
contact the package maintainer of the package you used **which you have
again failed to identify** (the caret package?). Also check to see whether
the package has its own user support structure. Some do, and this should be
the first point of contact anyway if so.

2 1/2. Post in **plain text**, not HTML, though I don't think it
mattered here.


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

