parallel raster processing with calc and mc2d monte carlo simulation
The below works for me. You used function 'f' to clusterR, where it
should have been 'calc'
library(raster)
library(snow)
library(mc2d)
f <- function(x) {
dBC_BA <- mcdata(x[1], type="0")
dBC_BA_SE <- mcdata(x[2], type = "0")
SlopePer <-x[3]
stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE, mean = dBC_BA, sd =
dBC_BA_SE, linf = 0, lhs = FALSE)
BC_AGWBC <- 0.6419 + 0.9307*stBA + (-0.0176)*SlopePer
AGWBC <- (0.2626263 * BC_AGWBC + 1)^(1/0.2626263)-1
quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE)
}
b <- brick(system.file("external/rlogo.grd", package="raster"))
x <- calc(b, f)
# new function for cluster. You could also rewrite f to include calc
ff <- function(x) calc(x, f)
beginCluster()
y <- clusterR(b, fun=ff, export='f')
endCluster()
On Tue, Mar 17, 2015 at 9:21 AM, Sean Kearney
<sean.kearney at alumni.ubc.ca> wrote:
Hi Robert,
Thanks for the advice. So I tried exporting the objects defined in the function (lm.final and lambda_DV) with the following code:
library(raster)
library(rgdal)
library(mc2d)
brick <- brick(BC_BA, BC_BA_SE, SlopePer) ## Stack three rasters into one RasterBrick
testbrick <- crop(brick, extent(299700, 300100, 1553550, 1553650)) ##Crop Raster Brick to manageable size
####
#### Predict Biomass using final linear model with Monte Carlo Simulation
ndunc(101)
fun.CROP_AGWBC <- function(x) {
require(mc2d)
dBC_BA <- mcdata(x[[1]], type="0")
dBC_BA_SE <- mcdata(x[[2]], type = "0")
SlopePer <-x[[3]]
stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE,
mean = dBC_BA, sd = dBC_BA_SE, linf = 0, lhs = FALSE)
BC_AGWBC <- lm.final$coefficients[1] +
lm.final$coefficients[2]*stBA +
lm.final$coefficients[3]*SlopePer
AGWBC <- (lambda_DV * BC_AGWBC + 1)^(1/lambda_DV)-1
quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE)
}
####
#### Check function using calc
CROP_AGWBC <- calc(testbrick, fun.CROP_AGWBC)
plot(CROP_AGWBC)
CROP_AGWBC
####
#### Run function using parallel processing
library(snow)
beginCluster(8)
library(parallel)
cl <- getCluster()
clusterExport(cl, list("lm.final", "lambda_DV"))
clusterR(x = testbrick, fun = fun.CROP_AGWBC)
endCluster()
RESULT: Again, the calc process works fine to produce object CROP_AGWBC but when I try to run the last bit with parallel processing, I still get the same error:
clusterR(x = testbrick, fun = fun.CROP_AGWBC)
[1] "data should be numeric or logical"
attr(,"class")
[1] "snow-try-error" "try-error"
Error in clusterR(x = testbrick, fun = fun.CROP_AGWBC) : cluster error
I was suspicious that maybe the issue was still with the export because while ?lambda_DV" is a constant, lm.final is actually an lm model, so I replaced these objects in the function with constants (see code below) but still get the same error:
REVISED CODE WITH CONSTANTS:
library(raster)
library(rgdal)
library(mc2d)
brick <- brick(BC_BA, BC_BA_SE, SlopePer) ## Stack three rasters into one RasterBrick
testbrick <- crop(brick, extent(299700, 300100, 1553550, 1553650)) ##Crop Raster Brick to manageable size
####
#### Predict Biomass using final linear model with Monte Carlo Simulation
ndunc(101)
fun2.CROP_AGWBC <- function(x) {
require(mc2d)
dBC_BA <- mcdata(x[[1]], type="0")
dBC_BA_SE <- mcdata(x[[2]], type = "0")
SlopePer <-x[[3]]
stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE,
mean = dBC_BA, sd = dBC_BA_SE, linf = 0, lhs = FALSE)
BC_AGWBC <- 0.6419 +
0.9307*stBA +
(-0.0176)*SlopePer
AGWBC <- (0.2626263 * BC_AGWBC + 1)^(1/0.2626263)-1
quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE)
}
####
#### Check function using calc
CROP_AGWBC_2 <- calc(testbrick, fun2.CROP_AGWBC)
plot(CROP_AGWBC_2)
CROP_AGWBC_2
####
#### Run function using parallel processing
library(snow)
beginCluster(8)
clusterR(x = testbrick, fun = fun2.CROP_AGWBC)
endCluster()
RESULT: same error:
clusterR(x = testbrick, fun = fun2.CROP_AGWBC)
[1] "data should be numeric or logical" attr(,"class") [1] "snow-try-error" "try-error" Error in clusterR(x = testbrick, fun = fun2.CROP_AGWBC) : cluster error Any other thoughts? Many thanks, sean On Mar 17, 2015, at 8:14 AM, Robert J. Hijmans <r.hijmans at gmail.com> wrote:
Sean,
fun.CROP_AGWBC refers to objects that are not defined inside the
function ("lm.final" and "lambda_DV"). I assume that this is
intentional and that these represent constants; and that they are
available in your global environment. If so, you need to export these
objects to the cluster nodes. See the 'export' argument in clusterR.
You also need to load necessary packages before calling beginCluster
Robert
On Mon, Mar 16, 2015 at 1:46 PM, spkearney <sean.kearney at alumni.ubc.ca> wrote:
Hello all, and thanks in advance for any and all help you can give on this: I have set up a function to extract the 2.5%, 50% and 97.5% percentiles from a monte carlo simulation on three rasters that is to be called up using calc() in the raster package and it works great on a test-sized stack/brick, thanks to suggestions at this post here: http://grokbase.com/t/r/r-sig-geo/123cb3daaq/apply-monte-carlo-simulation-for-each-cell-in-a-matrix-originally-raster My problem, is that I want to run this function on a much larger Raster Brick that, as written, takes hours to process. I need to do this multiple times, so I am trying to speed up the processing using clusterR (or another option such as rasterEngine with multi-core processing). However, I can't get it to work! Here is the code that works on the test raster brick: brick <- brick(BC_BA, BC_BA_SE, SlopePer) ## Stack three rasters into one Raster Brick testbrick <- crop(brick, extent(299700, 300100, 1553550, 1553650)) ## Crop brick to manageable size ndunc(101) fun.CROP_AGWBC <- function(x) { dBC_BA <- mcdata(x[[1]], type="0") dBC_BA_SE <- mcdata(x[[2]], type = "0") SlopePer <-x[[3]] stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE, mean = dBC_BA, sd = dBC_BA_SE, linf = 0, lhs = FALSE) BC_AGWBC <- lm.final$coefficients[1] + lm.final$coefficients[2]*stBA + lm.final$coefficients[3]*SlopePer AGWBC <- (lambda_DV * BC_AGWBC + 1)^(1/lambda_DV)-1 quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE) } CROP_AGWBC <- calc(teststack, fun.CROP_AGWBC) ##Run the calc function CROP_AGWBC ##Check the result plot(CROP_AGWBC) ##Plot the three-raster brick result ##Extract the individual raster layers CROP_AGWBC_PRED <- CROP_AGWBC[[2]] CROP_AGWBC_LWR <- CROP_AGWBC[[1]] CROP_AGWBC_UPR <- CROP_AGWBC[[3]] As I mentioned, the code works great on a small sample. I tried to speed it up using clusterR as follows, first testing it on the 'testbrick' Raster Brick with hopes to use it on the whole Raster Brick: beginCluster(8) clusterR(x = testbrick, fun = fun.CROP_AGWBC) and I get the following error: [1] "data should be numeric or logical" attr(,"class") [1] "snow-try-error" "try-error" Error in clusterR(x = testbrick, fun = fun.CROP_AGWBC) : cluster error It is interesting because, if I try to run it again, I get this error instead: Error in as.vector((x[, 1] - 1) * ncol(object) + x[, 2]) : error in evaluating the argument 'x' in selecting a method for function 'as.vector': Error in x[, 2] : subscript out of bounds I have tried this many different ways, including along the lines of: f <- function (x) calc(x, fun.CROP_AGWBC) y <- clusterR(testbrick, f) which gives me the same error (more or less) of: Error in checkForRemoteErrors(lapply(cl, recvResult)) : 2 nodes produced errors; first error: data should be numeric or logical And I have tried using the rasterEngine() function (first without parallel processing) by changing up the code in two ways, the first being: ndunc(101) fun.CROP_AGWBC <- function(x) { dBC_BA <- mcdata(x[[1]], type="0") dBC_BA_SE <- mcdata(x[[2]], type = "0") SlopePer <-x[[3]] stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE, mean = dBC_BA, sd = dBC_BA_SE, linf = 0, lhs = FALSE) BC_AGWBC <- lm.final$coefficients[1] + lm.final$coefficients[2]*stBA + lm.final$coefficients[3]*SlopePer AGWBC <- (lambda_DV * BC_AGWBC + 1)^(1/lambda_DV)-1 output <- quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE) output_array <- array(output,dim=c(dim(x)[1],dim(x)[2],3)) return(output_array) } re <- rasterEngine(x = testbrick, fun = fun.CROP_AGWBC) which runs but gives me a 3-layer Raster Brick all with NA's or Inf. The second thing I tried used the same fun.CROP_AGWBC function as above, but with the following rasterEngine code to call up the calc formula: f <- function(x) {reout <- calc(x, fun.CROP_AGWBC) reout_array <- array(getValues(reout), dim=c(dim(x)[1],dim(x)[2],3)) return(reout_array) } re <- rasterEngine(x = testbrick, fun = f, chunk_format = "raster") which gives me the following error, even though I thought I converted the output to an array: chunk processing units require array vector outputs. Please check your function. Error in focal_hpc_test(x, fun, window_center, window_dims, args, layer_names, : So, my questions are as follows: *1) Does anyone know why the clusterR does not work for the calc() function in my first attempt? I imagine it has something to do with the conversion of rasters to mcnodes in the function, but can't figure it out! Any suggestions? 2) Any thoughts on why I can't get this to work with the rasterEngine() function? I am converting the outputs to arrays with the same dimensions as the input file, but still no luck.* Again, any help is much appreciated. Any suggestions for improving this question are welcome and I'll do my best to update it - this is my first post! Kind Regards, sean -- View this message in context: http://r-sig-geo.2731867.n2.nabble.com/parallel-raster-processing-with-calc-and-mc2d-monte-carlo-simulation-tp7587901.html Sent from the R-sig-geo mailing list archive at Nabble.com.
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo