Dear all,
I am trying to evaluate the influence of several oceanographic
environmental parameters on the presence/absence of a fish species in an
estuary using boosted regression trees. For that I tried the gbm.step
function provided in the package dismo.
Since I have many predictors, I used gbm.simplify to drop the
non-informative predictors and improve the predictive performance of the
models.
But one of my datasets has a very small number of observations, n=44.
Although the function gbm.step appears to run fine on this dataset, when I
apply gbm.simplify to the model, I get the following error:
Error in gbm.fit(x, y, offset = offset, distribution = distribution, w = w, :
  The dataset size is too small or subsampling rate is too large: nTrain*bag.fraction <= n.minobsinnode
Here is a reproducible example using the Anguilla_train dataset:
library(dismo)
data(Anguilla_train)
# reduce the data set to 44 observations
Anguilla_train <- Anguilla_train[245:288, ]
# fit with gbm.step, using bag.fraction = 0.75
model <- gbm.step(data = Anguilla_train,
                  gbm.x = c("SegSumT", "SegTSeas", "SegLowFlow", "DSDist",
                            "DSMaxSlope", "USAvgT", "USRainDays", "USSlope",
                            "USNative", "DSDam", "Method", "LocSed"),
                  gbm.y = "Angaus", family = "bernoulli",
                  tree.complexity = 1, learning.rate = 0.001,
                  bag.fraction = 0.75, n.folds = 5)
# apply gbm.simplify to the fitted model
model.simp <- gbm.simplify(model, n.drops = 3)
When I check the components of my model object:
model$nTrain
#[1] 44
model$bag.fraction
#[1] 0.75
model$n.minobsinnode
# [1] 10
So, if I understand correctly, 44*0.75 = 33 > 10, which is why gbm.step was
able to build the model. I assumed gbm.simplify would reuse the settings
already stored in the model. So why does the error message appear only for
gbm.simplify and not for both functions?
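To make the arithmetic explicit, here is a small sketch of the check as I read it from the error message. The helper too_small is my own (hypothetical) name, and I am assuming the condition is applied exactly as written in the message, which may not be what gbm.fit does internally. The fold sizes are only illustrative, since I do not know which subsets gbm.simplify actually trains on.

```r
# Values reported by the fitted model object above
n_train  <- 44
bag_frac <- 0.75
min_obs  <- 10

# Hypothetical helper: TRUE when the condition quoted in the error text holds
too_small <- function(n, bag, min_obs) n * bag <= min_obs

n_train * bag_frac                     # 33
too_small(n_train, bag_frac, min_obs)  # FALSE, so the full data set passes

# Assumption: an internal cross-validation step would train on k - 1 of k folds,
# so the effective n is smaller than 44. These fold counts are illustrative only.
for (k in c(5, 10)) {
  n_fold_train <- floor(n_train * (k - 1) / k)
  cat(k, "folds:", n_fold_train, "training obs,",
      n_fold_train * bag_frac, "after bagging,",
      "too small:", too_small(n_fold_train, bag_frac, min_obs), "\n")
}
```

Under this literal reading, even the reduced fold sizes stay above n.minobsinnode, which is why the error puzzles me.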
The same error occurs if I set bag.fraction to 1.
I also tried setting n.minobsinnode=5 in the gbm.step call, but the value
stored in the model stays at the default (10). I guess that happens because
gbm.step is built on top of the functions in the gbm package...
Any thoughts will be highly appreciated.
Thanks in advance.
Eva Amorim