rgrass7 and snow

8 messages · Rainer M Krug, Didier Leibovici, Roger Bivand

#
Hi,

We are trying to use GRASS from R in parallel with 'snow'.

The code runs multiple simulations of r.viewshed, resampling the points 
that generate the viewshed.
(I guess the other solution would be to have an equivalent of r.viewshed 
in an R library or script.)
I think the problems have to do with the gisDbase ...

Testing with a cluster of 2 nodes, we start with:
 >  clusterCall(cl,initGRASS,"/usr/lib/grass70",home=getwd(), 
gisDbase="GRASS_TEMP", override=TRUE )
[[1]]
gisdbase    GRASS_TEMP
location    file5edb6bf06d70
mapset      file5edb4b0cda81
rows        1
columns     1
north       1
south       0
west        0
east        1
nsres       1
ewres       1
projection  NA

[[2]]
gisdbase    GRASS_TEMP
location    file5edb6bf06d70
mapset      file5edb4b0cda81

Then we read a DEM:
clusterCall(cl, execGRASS, "r.in.gdal", flags="o",
            parameters=list(input=baseDemFilename, output="DEM"))
clusterCall(cl, execGRASS, "g.region", parameters=list(raster="DEM"))

and then loop over the sampled points, involving
  execGRASS("r.viewshed",
            parameters = list(input = "DEM", output = "cumulativeViewshed",
                              max_distance = maxDistance,
                              coordinates = as.integer(coords[i,])),
            flags = c("overwrite", "b", "quiet"))

and accumulating the viewshed (reading the output using 
readRAST("cumulativeViewshed")). All of this (the loop over the points) 
is within a function simul() called by
  parSapply(cl,1:nDsimul,simul)


Here is the error
Error in checkForRemoteErrors(val) :
   2 nodes produced errors; first error: no such file: 
GRASS_TEMP/file5edb6bf06d70/file5edb4b0cda81/.tmp/geoprocessing/cumulativeViewshed
Calls: parSapply ... clusterApply -> staticClusterApply -> 
checkForRemoteErrors

Any idea?

Thanks,

Didier
#
On Friday 22 April 2016, Dr Didier G. Leibovici <
didier.leibovici at nottingham.ac.uk> wrote:
The problem is likely that you are working in parallel in the same mapset.
I would suggest using a separate mapset for each parallel task to write
to, reading from a different mapset which no parallel task is writing to,
and finally, after all threads are finished, collecting the results from
each thread in one mapset and deleting the temporary mapsets.
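
The final collection step might look roughly like the sketch below. It assumes hypothetical task mapsets named sim_001, sim_002, ... in the same location, a raster called cumulativeViewshed in each of them, and a GRASS session that is already in the mapset gathering the results. g.copy and the map@mapset notation are standard GRASS; the helper names and the naming scheme are only illustrative.

```r
# Hypothetical names: task mapsets "sim_001", ... each holding a raster
# called "cumulativeViewshed".
task_mapsets <- sprintf("sim_%03d", 1:3)

# One (from, to) pair per task: read "map@mapset", write "map_mapset"
# into the current mapset.
collect_pairs <- function(map, mapsets) {
  lapply(mapsets, function(ms) c(paste0(map, "@", ms), paste0(map, "_", ms)))
}

# Copy every task result into the current mapset with g.copy.
# Not called here: it needs rgrass7 and a running GRASS session.
collect_results <- function(map, mapsets) {
  for (pair in collect_pairs(map, mapsets)) {
    rgrass7::execGRASS("g.copy", parameters = list(raster = pair))
  }
}

pairs <- collect_pairs("cumulativeViewshed", task_mapsets)
print(pairs[[1]])
```

Calling collect_results("cumulativeViewshed", task_mapsets) only after parSapply() has returned keeps the collecting mapset free of concurrent writers.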

Cheers,

Rainer

  
    
#
On 22/04/2016 16:21, Rainer M Krug wrote:
Yes, the whole problem is getting each cluster node to work in a separate 
GRASS_TEMP
(we have verified that it runs with 1 cl only!).
Can we randomly generate the location or mapset so that they do not 
collide?

Didier

  
    
#
"Dr Didier G. Leibovici" <didier.leibovici at nottingham.ac.uk> writes:
I used a setup for parallel processing (simulating spread under
different scenarios), where I had

1) one location in which everything happened, which contained:
2) one mapset which was read-only for each parallel task
3) one mapset for each task in which the results were written; this was
a folder, which was created for each parallel task
4) one mapset in which the individual simulation tasks were analysed.

So finally, I had a mapset for each simulation, and one in which the
analysis was done.
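
As a rough illustration of that layout, the sketch below writes the plan down as a table before any GRASS call is made. All names are assumptions for illustration: PERMANENT stands in for the read-only mapset, sim_001 ... for the per-task mapsets, and "analysis" for the mapset holding the final analysis.

```r
# Hypothetical layout: every task reads the same map and writes to its
# own mapset, so no two tasks ever write to the same place.
n_tasks <- 4
layout <- data.frame(
  task   = seq_len(n_tasks),
  reads  = "DEM@PERMANENT",                       # shared, read-only
  writes = sprintf("sim_%03d", seq_len(n_tasks)), # one mapset per task
  stringsAsFactors = FALSE
)
analysis_mapset <- "analysis"                     # used after all tasks finish

print(layout)
```

Keeping the plan in one table makes it easy to check that the write targets are unique before starting the cluster.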

So yes, why shouldn't you be able to create a mapset per task? You can
do it during the initialization by using GRASS.

Hope this helps,

Rainer

  
    
#
Can you paste the call to create n mapsets? Thanks.
On 22/04/2016 17:37, Rainer M Krug wrote:

  
    
#
"Dr Didier G. Leibovici" <didier.leibovici at nottingham.ac.uk> writes:
You can create a mapset with

g.mapset -c mapset=THE_NAME_OF_THE_MAPSET

You can do this in a loop in an R script, or bash script - whatever you
are more familiar with.
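
In R, such a loop could be sketched as follows. This assumes rgrass7 is installed and a GRASS session has already been started (for example by initGRASS); the mapset names and the create_mapsets() helper are hypothetical, and switching back to PERMANENT at the end is only one possible choice.

```r
# Hypothetical mapset names, one per parallel task.
mapsets <- sprintf("sim_%03d", 1:4)

# Create each mapset from an already initialised GRASS session.
# g.mapset with the c flag creates the mapset if needed and switches to it.
create_mapsets <- function(names, return_to = "PERMANENT") {
  for (ms in names) {
    rgrass7::execGRASS("g.mapset", flags = "c",
                       parameters = list(mapset = ms))
  }
  # Switch back so the session does not stay in the last task mapset.
  rgrass7::execGRASS("g.mapset", parameters = list(mapset = return_to))
  invisible(names)
}

# Only attempt this when rgrass7 is available and a GRASS session is active.
if (requireNamespace("rgrass7", quietly = TRUE) &&
    nzchar(Sys.getenv("GISRC"))) {
  create_mapsets(mapsets)
}
```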


Rainer

  
    
#
Yes, but then how does each cluster node use a different one, i.e. how 
did you write your initGRASS()?

Thanks
On 22/04/2016 18:27, Rainer M Krug wrote:

  
    
#
On Fri, 22 Apr 2016, Dr Didier G. Leibovici wrote:

            
I think that if you play around with execGRASS(), you'll find that your 
temporary location has a PERMANENT mapset, and a working mapset. Look at 
the directory to which you pointed home=; those two directories should be 
present. Maybe use list.files(). Then use execGRASS("g.mapset", 
mapset="<new0>", flags="c") to create one and switch to it. I guess you'd 
need as many new mapsets as cores, and a vector of mapset names. Remember 
that you can use g.mapsets to see or change your mapset search path, and 
you can use the @mapset notation for any vector or raster, so the nodes 
can each have their own spaces. Maybe use g.copy to copy data out to nodes 
to avoid race conditions if reading from the same GRASS database file by 
multiple nodes.
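
One way to hand each node its own mapset is sketched below, under stated assumptions: initGRASS has already been run on every node, each node has its own GRASS session state (if the nodes share one GISRC file, switching mapsets on one node would presumably affect the others), and the names node1, node2, ... and both helper functions are hypothetical. clusterApply() is used because it sends one element of the name vector to each node.

```r
library(parallel)  # clusterApply() has the same interface in snow

# One mapset name per node.
node_mapsets <- function(n, prefix = "node") {
  sprintf("%s%d", prefix, seq_len(n))
}

# Runs on a worker: create a private mapset, switch to it, and make sure
# the shared mapset is on the search path so its maps stay readable.
use_private_mapset <- function(ms, shared = "PERMANENT") {
  rgrass7::execGRASS("g.mapset", flags = "c",
                     parameters = list(mapset = ms))
  rgrass7::execGRASS("g.mapsets",
                     parameters = list(mapset = shared, operation = "add"))
  ms
}

# Runs on the master: node i switches to the i-th mapset name.
assign_mapsets <- function(cl, prefix = "node") {
  clusterApply(cl, node_mapsets(length(cl), prefix), use_private_mapset)
}

print(node_mapsets(2))
```

After assign_mapsets(cl), a shared raster can be addressed from any node as "DEM@PERMANENT", while outputs such as cumulativeViewshed land in the node's own mapset.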

It would be useful to have an example of what you need to do, simulation 
being fairly obvious, but I think r.in.gdal will be problematic, better to 
read once and copy out to the simulation mapsets by repeated g.copy. The 
more limited the use of initGRASS on each core, probably the better.
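
A minimal sketch of the read-once, copy-out idea follows. It assumes the DEM was imported a single time into a shared mapset (PERMANENT here, as an assumption) and that each node is already in its own mapset; copy_spec() and copy_dem_in() are hypothetical helpers. The from/to pair is passed as a two-element vector, in the same way the original post passes coordinates to r.viewshed.

```r
# Build the (from, to) pair for g.copy, e.g. c("DEM@PERMANENT", "DEM").
copy_spec <- function(map, source_mapset, target = map) {
  c(paste0(map, "@", source_mapset), target)
}

# Runs on a worker that is already in its own mapset: take a private
# copy of the shared raster so later reads do not touch the shared file.
copy_dem_in <- function(map = "DEM", source_mapset = "PERMANENT") {
  rgrass7::execGRASS("g.copy", flags = "overwrite",
                     parameters = list(raster = copy_spec(map, source_mapset)))
}

print(copy_spec("DEM", "PERMANENT"))
```

With a cluster cl set up as in the original post, clusterCall(cl, copy_dem_in) would then replace the per-node r.in.gdal call.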

Hope this helps,

Roger