On Sun, Jun 23, 2013 at 5:51 AM, Tomislav Hengl
<hengl at spatial-analyst.net> wrote:
Dear list,
I have a question about the randomForest models. I'm trying to figure out a
way to estimate the prediction variance (spatially) for the randomForest
function (http://cran.r-project.org/web/packages/randomForest/).
If I run a GLM I can also derive the prediction variance using:
library(sp)  # provides the meuse demo data and over()
demo(meuse, echo=FALSE)
meuse.ov <- over(meuse, meuse.grid)
meuse.ov <- cbind(meuse.ov, meuse@data)
omm0 <- glm(log1p(om)~dist+ffreq, meuse.ov, family=gaussian())
om.glm <- predict.glm(omm0, meuse.grid, se.fit=TRUE)
str(om.glm)
List of 3
$ fit : Named num [1:3103] 2.34 2.34 2.32 2.29 2.34 ...
..- attr(*, "names")= chr [1:3103] "1" "2" "3" "4" ...
$ se.fit : Named num [1:3103] 0.0491 0.0491 0.0481 0.046 0.0491 ...
..- attr(*, "names")= chr [1:3103] "1" "2" "3" "4" ...
$ residual.scale: num 0.357
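
For reference, the se.fit values above are standard errors of the fitted mean, so squaring them gives the variance of the fit, and adding residual.scale^2 gives an approximate variance for a new observation under the Gaussian model. A minimal sketch on simulated data; the data frame d and the variable y are illustrative stand-ins, not part of the meuse example:

```r
set.seed(1)
# Simulated stand-in for the meuse data (hypothetical variables)
d <- data.frame(dist = runif(100))
d$y <- 2.5 - 0.8 * d$dist + rnorm(100, sd = 0.3)

m <- glm(y ~ dist, data = d, family = gaussian())
new <- data.frame(dist = seq(0, 1, by = 0.1))
p <- predict(m, new, se.fit = TRUE)

var.fit  <- p$se.fit^2                     # variance of the fitted mean
var.pred <- var.fit + p$residual.scale^2   # approximate variance for a new observation
```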
When I fit a randomForest model, I do not get any estimate of the model
uncertainty (for each pixel) but just the predictions:
meuse.ov <- meuse.ov[-omm0$na.action,]
library(randomForest)
x <- randomForest(log1p(om)~dist+ffreq, meuse.ov)
om.rf <- predict(x, meuse.grid)
str(om.rf)
Named num [1:3103] 2.49 2.49 2.51 2.44 2.49 ...
- attr(*, "names")= chr [1:3103] "1" "2" "3" "4" ...
Does anyone have an idea how to map the prediction variance (i.e. estimated
or propagated error) for the randomForest models spatially?
I've tried deriving a propagated error for the randomForest models (every
fit gives another model due to random component):
l.rfk <- data.frame(om_1 = rep(NA, nrow(meuse.grid)))
for(i in 1:50){
  suppressWarnings(suppressMessages(x <- randomForest(log1p(om)~dist+ffreq, meuse.ov)))
  l.rfk[,paste("om",i,sep="_")] <- predict(x, meuse.grid)
} ## takes ca 1 minute
meuse.grid$om.rfkvar <- om.rfk@predicted$var1.var + apply(l.rfk, 1, var)
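
The refitting idea above can be sketched more compactly with sapply. This is only an illustration on simulated data, assuming the randomForest package is installed; the objects d, new, preds and rf.var are hypothetical names, not from the meuse example. Note that refitting on the same data only captures the variation caused by the algorithm's random component, so it is probably a lower bound on the model uncertainty rather than a full prediction variance.

```r
library(randomForest)
set.seed(1)

# Simulated training data (hypothetical stand-in for meuse.ov)
d <- data.frame(dist  = runif(200),
                ffreq = factor(sample(1:3, 200, replace = TRUE), levels = 1:3))
d$y <- 2.5 - 0.8 * d$dist + rnorm(200, sd = 0.3)

# Simulated prediction locations (hypothetical stand-in for meuse.grid)
new <- data.frame(dist  = runif(20),
                  ffreq = factor(sample(1:3, 20, replace = TRUE), levels = 1:3))

# Refit the forest 10 times; each refit contributes one column of predictions
preds <- sapply(1:10, function(i) {
  predict(randomForest(y ~ dist + ffreq, data = d), new)
})

# Between-fit variance at each prediction location
rf.var <- apply(preds, 1, var)
```

Each row of preds corresponds to one prediction location, so rf.var can be attached to the prediction grid in the same way as the apply(l.rfk, 1, var) term above.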