I am trying to extract the per-pixel median from a raster stack, and
am finding that the process takes a very long time relative to taking
the per-pixel mean. It is long enough that I have never actually seen
whether it successfully completes on my data, but this dummy example
shows the code structure I am using and the time differences between
the two functions:
stk <- stack(lapply(1:10, function(x) {
?r <- raster(nrow = 1000, ncol = 1000)
?r2 <- setValues(r, sample(1:100, size = ncell(r), replace = T, prob =
NULL))
}))
setOptions(todisk = TRUE) ?# I used this option to rule out memory issues
t1 <- Sys.time()
med <- calc(stk, Median, filename = "med.tif", datatype = "INT2S",
overwrite = T)
Sys.time() - t1 ?# 7.26 mins
t2 <- Sys.time()
med <- calc(stk, mean, filename = "mean.tif", datatype = "INT2S",
overwrite = T)
Sys.time() - t2 ?# 2.34 secs
setOptions(todisk = FALSE)
Am I doing something wrong with my code, or is the speed difference I
am seeing expected?
Lyndon,
Median (big M) was intended for use like Median(s) not for use in calc
(which did not exist yet); but I am surprised how slow it is in calc. Now we
have calc, you can use median (small m) and that is much quicker; and about
the same as Median(s); I think I will remove Median as it is no longer
needed.
Either way, median will be slower than mean, which is expected:
v = runif(1e+7)
system.time(mean(v))
? user ?system elapsed
? 0.09 ? ?0.00 ? ?0.09
system.time(median(v)) ?# 7 times slower
? user ?system elapsed
? 0.56 ? ?0.06 ? ?0.63
s <- stack(sapply(1:10, function(x) setValues(r, runif(ncell(r)))))
r <- raster()
system.time(mean(s))
? user ?system elapsed
? 4.75 ? ?0.02 ? ?4.79
system.time(calc(s, mean))
? user ?system elapsed
? 0.25 ? ?0.00 ? ?0.26
? user ?system elapsed
? 7.73 ? ?0.00 ? ?7.75
system.time(calc(s, median))
? user ?system elapsed
? 8.48 ? ?0.00 ? ?8.54
system.time(calc(s, Median)) # slow!
? user ?system elapsed
?28.15 ? ?0.03 ? 28.27