class : RasterBrick
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
source : /tmp/Rtmp9W8UNc/raster/r_tmp_2019-11-27_191205_2929_20324.grd
names : index_5, index_6, index_7, index_1, index_2, index_3,
index_4
min values : 444.6946, 440.2028, 429.6900, 442.7436, 440.0467, 444.9182,
437.1589
max values : 565.8671, 560.1375, 561.7972, 556.2471, 563.8341, 561.7687,
560.4509
class : RasterStack
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6,
layer.7
min values : 442.7436, 440.0467, 444.9182, 437.1589, 444.6946, 440.2028,
429.6900
max values : 556.2471, 563.8341, 561.7687, 560.4509, 565.8671, 560.1375,
561.7972
On 11/26/19 10:58 PM, Vijay Lulla wrote:
Hmm...it appears that stackApply is using different conditions which might
be causing this problem. Below is the snippet of the code which I think
might be the problem.
## For canProcessInMemory
if (rowcalc) {
v <- lapply(uin, function(i) fun(x[, ind == i, drop = FALSE], na.rm =
na.rm))
}
else {
v <- lapply(uin, function(i, ...) apply(x[, ind == i, drop = FALSE], 1,
fun, na.rm = na.rm))
}
## If canProcessInMemory is not TRUE
if (rowcalc) {
v <- lapply(uin, function(i) fun(a[, ind == uin[i], drop = FALSE], na.rm
= na.rm))
}
else {
v <- lapply(uin, function(i, ...) apply(a[, ind == uin[i], drop =
FALSE], 1, fun, na.rm = na.rm))
}
I think they should both be same but it appears that they aren't and
that's what you've discovered. Maybe you can try fix(stackApply) to see if
it really is the problem and can tell us what you find. Anyways, good
catch...and...sorry for wasting your time.
Cordially,
Vijay.
On Tue, Nov 26, 2019 at 2:53 PM Leonidas Liakos <leonidas_liakos at yahoo.gr>
wrote:
Thank you!
The problem is not with the resulting values but with the index mapping.
Values are correct in all three cases.
As I wrote in a previous post in the thread (
https://stat.ethz.ch/pipermail/r-sig-geo/2019-November/027821.html) ,
stackApply behaves inconsistently depending on whether the exported stack
will remain in memory or it will be stored, due to its large size, on the
hard disk.
In the first case the indices are identical to my function (ver_mean) and
the lubridate::wday indexing system (and are correct) while in the second
they are shuffled.
So, while Sunday has index 1 and while in the first case (when the result
is in memory) stackApply returns the correct index, in the second case
(when the result is stored on the hard disk) it returns index_4! So how can
one be sure if index_1 corresponds to Sunday or another day using
stackApply since it sometimes enumerates it with index_1 and sometimes
index_4?
Try to run this example (when the resulting stack remains in memory) to
see that the indexes are identical (stackApply = ver_median =
lubridate::wday)
https://gist.github.com/kokkytos/5d554b5a725bb48d2189e2d1fa0e2206
Thank you again
On 11/26/19 9:00 PM, Vijay Lulla wrote:
I'm sorry for the miscommunication. What I meant to say is that the
output from stackApply and zApply are the same (because zApply uses
stackApply internally) except the names. The behavior of stackApply makes
sense because AFAIUI R doesn't automatically reorder vectors/indices that
it receives. Your observation about inconsistent result with ver_mean is
very valid though! And, that's what I meant with my comment that using
sapply with the explicit ordering that you want is the best way to control
what raster package will output. In R the input order should be maintained
(this is the prime difference between SQL and R) but packages/tools do not
always adhere to this...so it's never clear how the output will be
ordered. Sorry for the confusion.
On Tue, Nov 26, 2019 at 12:22 PM Leonidas Liakos <
leonidas_liakos at yahoo.gr> wrote:
Why do they seem logical since they do not match?
Check for example index 1 (Sunday). The results are different for the
three processes
class : RasterBrick
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
source :
/tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191359_7710_20324.grd
names : index_5, index_6, index_7, index_1, index_2,
index_3, index_4
min values : 440.0467, 444.9182, 437.1589, 444.6946, 440.2028, 429.6900,
442.7436
max values : 563.8341, 561.7687, 560.4509, 565.8671, 560.1375, 561.7972,
556.2471
class : RasterStack
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
names : layer.1, layer.2, layer.3, layer.4, layer.5,
layer.6, layer.7
min values : 442.7436, 440.0467, 444.9182, 437.1589, 444.6946, 440.2028,
429.6900
max values : 556.2471, 563.8341, 561.7687, 560.4509, 565.8671, 560.1375,
561.7972
class : RasterBrick
dimensions : 300, 300, 90000, 7 (nrow, ncol, ncell, nlayers)
resolution : 500, 500 (x, y)
extent : 0, 150000, 0, 150000 (xmin, xmax, ymin, ymax)
crs : NA
source :
/tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191439_7710_04780.grd
names : X1, X2, X3, X4, X5,
X6, X7
min values : 440.0467, 444.9182, 437.1589, 444.6946, 440.2028, 429.6900,
442.7436
max values : 563.8341, 561.7687, 560.4509, 565.8671, 560.1375, 561.7972,
556.2471
: 1, 2, 3, 4, 5, 6, 7
On 11/26/19 7:03 PM, Vijay Lulla wrote:
If you read the code/help for `stackApply` and `zApply` you'll see that
the results that you obtain make sense (at least they seem
sensible/reasonable to me). IMO, if you want to control the ordering of
your layers then just use sapply, like how you've used for ver_mean. IMO,
this is the only reliable (safe?), and quite a readable, way to accomplish
what you're trying to do.
Just my 2 cents.
-- Vijay.
On Tue, Nov 26, 2019 at 11:19 AM Leonidas Liakos via R-sig-Geo <
r-sig-geo at r-project.org> wrote:
I added raster::zApply in my tests to validate the results. However, the
indices of the names of the results are different now. Recall that the
goal is to calculate from a raster stack time series the mean per day of
the week. And that problem I have is that stackApply, zApply and
calc/sapply return different indices in the result names. New code is
available here:
https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170
I'm really curious about missing something.
On 11/20/19 3:30 AM, Frederico Faleiro wrote:
Hi Leonidas,
both results are in the same order, but the name is different.
You can rename the first as in the second:
names(res) <- names(res2)
I provided an example to help you understand the logic.
library(raster)
beginCluster(2)
r <- raster()
values(r) <- 1
# simple sequential stack from 1 to 6 in all cells
s <- stack(r, r*2, r*3, r*4, r*5, r*6)
s
res <- clusterR(s, stackApply, args = list(indices=c(2,2,3,3,1,1), fun
= mean))
res
res2 <- stackApply(s, c(2,2,3,3,1,1), mean)
res2
dif <- res - res2
# exatly the same order because the difference is zero for all layers
dif
# rename
names(res) <- names(res2)
Best regards,
Frederico Faleiro
On Tue, Nov 19, 2019 at 4:15 PM Leonidas Liakos via R-sig-Geo
<r-sig-geo at r-project.org <mailto:r-sig-geo at r-project.org>> wrote:
I run the example with clusterR:
no_cores <- parallel::detectCores() -1
raster::beginCluster(no_cores)
?????? res <- raster::clusterR(inp, raster::stackApply, args =
list(indices=c(2,2,3,3,1,1),fun = mean))
raster::endCluster()
And the result is:
class?????????? : RasterBrick
dimensions : 180, 360, 64800, 3?? (nrow, ncol, ncell, nlayers)
resolution : 1, 1?? (x, y)
extent???????? : -180, 180, -90, 90?? (xmin, xmax, ymin, ymax)
crs?????????????? : +proj=longlat +datum=WGS84 +ellps=WGS84
+towgs84=0,0,0
source???????? : memory
names?????????? : layer.1, layer.2, layer.3
min values :???????? 1.5,???????? 3.5,???????? 5.5
max values :???????? 1.5,???????? 3.5,???????? 5.5??
layer.1, layer.2, layer.3 (?)
So what corrensponds to what?
If I run:
res2 <- stackApply(inp,c(2,2,3,3,1,1),mean)
The result is:
class : RasterBrick
dimensions : 180, 360, 64800, 3 (nrow, ncol, ncell, nlayers)
resolution : 1, 1 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +ellps=WGS84
source : memory
names : index_2, index_3, index_1
min values : 1.5, 3.5, 5.5
max values : 1.5, 3.5, 5.5
There is no consistency with the names of the output and obscure
correspondence with the indices in the case of clusterR
[[alternative HTML version deleted]]