
raster Median function very slow on stack compared to mean: expected behavior?

3 messages · Lyndon Estes, Robert J. Hijmans

#
Dear List,

I am trying to extract the per-pixel median from a raster stack, and
am finding that the process takes a very long time relative to taking
the per-pixel mean. It is long enough that I have never actually seen
whether it successfully completes on my data, but this dummy example
shows the code structure I am using and the time differences between
the two functions:

library(raster)

stk <- stack(lapply(1:10, function(x) {
  r <- raster(nrow = 1000, ncol = 1000)
  # fill each layer with random integers between 1 and 100
  setValues(r, sample(1:100, size = ncell(r), replace = TRUE))
}))

setOptions(todisk = TRUE)  # I used this option to rule out memory issues
t1 <- Sys.time()
med <- calc(stk, Median, filename = "med.tif", datatype = "INT2S",
            overwrite = TRUE)
Sys.time() - t1  # 7.26 mins

t2 <- Sys.time()
med <- calc(stk, mean, filename = "mean.tif", datatype = "INT2S",
            overwrite = TRUE)
Sys.time() - t2  # 2.34 secs
setOptions(todisk = FALSE)

Am I doing something wrong with my code, or is the speed difference I
am seeing expected?

Thanks in advance for your advice.

Cheers, Lyndon
R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C/en_US.UTF-8/C/C/C/C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] doBy_4.4.0         MASS_7.3-13        snow_0.3-5         lme4_0.999375-39
 [5] Matrix_0.999375-50 lattice_0.19-30    multcomp_1.2-6     mvtnorm_0.9-999
 [9] R2HTML_2.2         survival_2.36-9    fields_6.3         spam_0.23-0
[13] rgdal_0.7-1        raster_1.8-39      sp_0.9-83

loaded via a namespace (and not attached):
[1] grid_2.13.1   nlme_3.1-101  stats4_2.13.1 tools_2.13.1
#
Lyndon, 

Median (capital M) was intended to be called directly, as in Median(s), not
inside calc (which did not exist yet), but I am surprised how slow it is in
calc. Now that we have calc, you can use median (lowercase m), which is much
quicker and about as fast as Median(s). I think I will remove Median, as it is
no longer needed.
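
As a minimal sketch of the suggestion above, the following passes base R's median to calc on a deliberately small stack and cross-checks the result against a direct per-cell computation. The names stk_small, med_small, and vals, the grid size, and the seed are illustrative assumptions, not taken from the original message.

```r
library(raster)  # assumes the raster package is installed

set.seed(1)  # hypothetical seed, only to make the sketch reproducible

# Small 3-layer stack (10 x 10 cells) so the example runs instantly
stk_small <- stack(lapply(1:3, function(i) {
  r <- raster(nrow = 10, ncol = 10)
  setValues(r, sample(1:100, size = ncell(r), replace = TRUE))
}))

# Per-pixel median: base R's median() (lowercase m) passed to calc()
med_small <- calc(stk_small, median)

# Cross-check against a direct per-cell computation on the value matrix
vals <- getValues(stk_small)  # matrix with one row per cell, one column per layer
stopifnot(isTRUE(all.equal(as.vector(getValues(med_small)),
                           as.vector(apply(vals, 1, median)))))
```

If the layers can contain NA values, calc also accepts an na.rm argument that it forwards to the function.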

Either way, median will be slower than mean, which is expected:
user  system elapsed 
   0.09    0.00    0.09
user  system elapsed 
   0.56    0.06    0.63
user  system elapsed 
   4.75    0.02    4.79
user  system elapsed 
   0.25    0.00    0.26
user  system elapsed 
   7.73    0.00    7.75
user  system elapsed 
   8.48    0.00    8.54
user  system elapsed 
  28.15    0.03   28.27
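
One plausible reason for the gap, sketched here in base R on a plain matrix rather than a raster: a per-cell mean can be computed in a single vectorised pass, whereas a per-cell median needs one R-level function call (with a partial sort) for every cell. This is an illustration under assumed sizes, not a description of what calc does internally; the names m, n_cells, and n_layers are hypothetical.

```r
set.seed(42)       # hypothetical seed for reproducibility
n_cells  <- 10000  # illustrative size, much smaller than a 1000 x 1000 grid
n_layers <- 10

# One row per cell, one column per layer
m <- matrix(sample(1:100, n_cells * n_layers, replace = TRUE),
            nrow = n_cells, ncol = n_layers)

# Mean: one vectorised pass over the whole matrix
t_mean <- system.time(mn <- rowMeans(m))

# Median: median() is called once per cell through apply()
t_med <- system.time(md <- apply(m, 1, median))

stopifnot(length(mn) == n_cells, length(md) == n_cells)

# Show user, system, and elapsed times side by side
print(rbind(mean = t_mean, median = t_med)[, 1:3])
```

Exact timings depend on the machine, so none are quoted here.
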
Robert




1 day later
#
Hi Robert,

Many thanks for the answer.  I tried this again using "median" rather
than "Median", and got the following results on my example:

setOptions(todisk = TRUE)  # I used this option to rule out memory issues
system.time(med <- calc(stk, median, filename = "med.tif",
                        datatype = "INT2S", overwrite = TRUE))
#    user  system elapsed
# 108.856   1.158 115.998

system.time(med <- calc(stk, mean, filename = "mean.tif",
                        datatype = "INT2S", overwrite = TRUE))
#   user  system elapsed
#  1.232   0.336   1.663
setOptions(todisk = FALSE)

So slower than mean, of course, but I can now get a result on my
larger rasters.

Thanks again,

Lyndon
On Fri, Jul 22, 2011 at 12:50 PM, Robert Hijmans <r.hijmans at gmail.com> wrote: