Skip to content
Prev 4147 / 7420 Next

Cross validate model with calibrate

Dear Philipp,

I cannot see anything wrong in your approach. The only problem is that RDA may not do very well. It is very common that the calibrated values are outside the original range of the data and if the predictions are not very good, they can also be "unrealistic" values. RDA has no idea what is realistic for given data. 

You get cross-validated values outside data range even in the Dune meadow data you used. Here an example that applies 4-fold cross validation (in line with 3/4 subsetting in your example):

library(vegan)
data(dune, dune.env)
vec <- rep(1:4, length=nrow(dune))
pred <- rep(NA, length=nrow(dune))
set.seed(4711)
take <- sample(vec)
for(i in 1:4) pred[take==i] <- calibrate(rda(dune ~ A1, dune.env, subset = take != i), newdata = subset(dune, take==i))
range(pred)  
## [1]  0.9104486 12.3583826
with(dune.env, range(A1))
## [1]  2.8 11.5

The predicted values extend nearly 2cm below and 1cm above the data values. In this case the predicted values were not negative, but that could happen, too. The tool (RDA) may not work very well in all cases, but RMSE will tell you how good RDA is for the job. If it is bad, do not use it. It seems RDA is very poor in this dune example: RMSE is larger than standard deviation of A1 (but you could almost guess this by looking at the low constrained eigenvalues of the model).

The code above uses the 'subset' argument of vegan::rda and the subset() function of R to make the code a bit easier to write.

Cheers, Jari Oksanen
On 01/11/2013, at 15:53 PM, Philipp01 wrote: