On 08/28/2014 05:10 PM, Tomislav Hengl wrote:
Dear list,
I'm trying to standardize a procedure for comparing the performance of
competing spatial prediction methods. I know that this has been
discussed in the literature and on various mailing lists, but I
would be interested in any opinion I could get.
I am comparing (see below) two spatial prediction methods
(regression-kriging and inverse distance interpolation) using 5-fold
cross-validation and then testing whether the difference between the two
is significant. I concluded that there are two possible tests for
the final cross-validation residuals:
1. F-test to compare the variances,
2. t-test to compare the mean values.
If you think in terms of accuracy vs. precision, I'd say both tests are
equally important. Ideally you want your method to be precise (low
variance of the residuals) and accurate (mean residual close to zero).
What I usually tend to do is repeated random sub-sampling with 100+ runs.
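The repeated random sub-sampling mentioned above could be sketched roughly as follows. This is only a minimal base-R illustration on simulated data, with lm() standing in for a spatial predictor; the data frame `d`, the helper `run_once`, and the settings `n_runs` and `test_frac` are all hypothetical names chosen for the sketch and are not part of the meuse example.

```r
# Sketch: repeated random sub-sampling validation (base R only).
# All data and names here are hypothetical and simulated for illustration.
set.seed(1)
n <- 150
d <- data.frame(x = runif(n))
d$y <- 1 + 2 * d$x + rnorm(n, sd = 0.5)

n_runs <- 100      # number of random train/test splits
test_frac <- 0.2   # share of points held out in each run

run_once <- function(d, test_frac) {
  # draw a random hold-out set, fit on the rest, predict the hold-out
  test_idx <- sample(nrow(d), size = round(test_frac * nrow(d)))
  fit <- lm(y ~ x, data = d[-test_idx, ])
  res <- d$y[test_idx] - predict(fit, newdata = d[test_idx, ])
  # mean error (bias), residual variance (precision), mean squared error
  c(ME = mean(res), VAR = var(res), MSE = mean(res^2))
}

# one row per run, one column per summary statistic
cv <- t(replicate(n_runs, run_once(d, test_frac)))
colMeans(cv)
```

Keeping the mean error and the residual variance separate per run mirrors the accuracy vs. precision distinction: the mean squared error combines the two, so reporting only MSE or RMSE would hide which component differs between methods.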
Both tests might be important; nevertheless, the F-test ("var.test")
seems to be the more interesting one for answering "is method B
significantly more accurate than method A?". It appears that the second
test ("t.test") only matters if it shows a significant difference, which
would mean that one of the methods systematically over- or
under-estimates the mean value (which should be 0). Did I maybe miss
some important test?
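Since the mean residual of each method should be 0, one way to check for systematic over- or under-estimation is a one-sample t-test of each residual vector against zero. A two-sample test only compares the two means with each other and would not flag two methods that are biased by the same amount. The sketch below is a minimal illustration under that assumption; `res_a` and `res_b` are simulated, hypothetical residuals (method B is given a deliberate bias), not the meuse results.

```r
# Sketch: test each method's residuals against zero (bias check).
# Simulated, hypothetical residuals; method B has a deliberate bias of 0.2.
set.seed(42)
res_a <- rnorm(153, mean = 0, sd = 0.4)
res_b <- rnorm(153, mean = 0.2, sd = 0.4)

# one-sample t-tests of H0: mean residual = 0
bias_a <- t.test(res_a, mu = 0)
bias_b <- t.test(res_b, mu = 0)

# p-values per method; a small value suggests systematic bias
c(A = bias_a$p.value, B = bias_b$p.value)
```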
Thank you!
R> library(GSIF)
R> library(gstat)
R> library(sp)
R> set.seed(2419)
R> demo(meuse, echo=FALSE)
R> omm1 <- fit.gstatModel(meuse, log1p(om)~dist+soil, meuse.grid)
Fitting a linear model...
Fitting a 2D variogram...
Saving an object of class 'gstatModel'...
R> rk1 <- predict(omm1, meuse.grid)
R> meuse.s <- meuse[!is.na(meuse$om),]
R> ok1 <- krige.cv(log1p(om)~1, meuse.s, nfold=5)
R> var.test(ok1$residual, rk1@validation$residual, alternative = "greater")
F test to compare two variances
data: ok1$residual and rk1@validation$residual
F = 1.2283, num df = 152, denom df = 152, p-value = 0.103
alternative hypothesis: true ratio of variances is greater than 1
95 percent confidence interval:
0.9398662 Inf
sample estimates:
ratio of variances
1.228322
R> ## No significant difference
R> t.test(ok1$residual, rk1@validation$residual)
Welch Two Sample t-test
data: ok1$residual and rk1@validation$residual
t = -0.0204, df = 300.842, p-value = 0.9837
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.07084667 0.06939220
sample estimates:
mean of x mean of y
0.0004766718 0.0012039089
R> ## Again, no significant difference
R> sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
other attached packages:
[1] randomForest_4.6-7 nortest_1.0-2
[3] gstat_1.0-19 GSIF_0.4-2
[5] sp_1.0-15 gap_1.1-12