readRAST6() in {spgrass6}

13 messages · Pierre Roudier, Sam Prentice, Dylan Beaudette +1 more

#
On Tue, 10 May 2011, Sam Prentice wrote:

            
Is the difficulty perhaps associated with the size of these rasters? The 
total size of the object will be over 2GB, are you using 64-bit Windows? 
How much memory do you have (over 16GB?)? Have you considered exporting 
the data manually (or using execGRASS()) from GRASS and using the raster 
package to use the data without necessarily reading it all into R, which 
will try to hold everything in memory unless you choose not to do so?

Could you make the same classification on subsets of your data? If you set 
your GRASS region to say 1000 by 1000, the problem may be resolved, 
because the object size will be much smaller, about 50MB. Older versions 
of r.out.gdal in GRASS, used here for loose coupling, did not respect 
region settings - is your GRASS version current, and the output region as 
chosen?
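
The sizes quoted above can be checked with a little arithmetic. This is only a rough sketch: it assumes every cell is held as an 8-byte double, and it ignores coordinates and any other overhead of the R object. The function name is made up for illustration.

```r
# Rough in-memory size of a multi-layer raster held as doubles.
# bytes_per_cell = 8 is an assumption; object overhead is ignored.
raster_bytes <- function(rows, cols, layers, bytes_per_cell = 8) {
  rows * cols * layers * bytes_per_cell
}

full  <- raster_bytes(6224, 6242, 7)  # the seven layers at full extent
small <- raster_bytes(1000, 1000, 7)  # the same layers on a 1000 x 1000 region

full / 2^30   # size in GiB, a little over 2
small / 2^20  # size in MiB, a little over 50
```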

Hope this helps,

Roger

  
    
#
Could that message have anything to do with the fact that this is a
layer from the PERMANENT mapset?

My two cents,

Pierre

2011/5/11 Sam Prentice <sep at umail.ucsb.edu>:

  
    
5 days later
#
Thank you for your suggestions. This error is not affected by choice of
mapset, and is not limited by my hardware.

Further info: I'm running 64-bit Windows, 16GB memory, w/multiple cores.
GRASS 6.4.1, R 2.12.2. I'm running standalone R using initGRASS() to set
GRASS env parameters. The GRASS files I'm trying to read-in are DEM
derivatives, each ~170MB. The intermediate .tmp file in the error message is
300MB.

It's been suggested that I am "likely trying to import too much data into
R". Assuming non-limiting computer power, what is the upper bound on amount
of data that can be read into R from GRASS? If not a fixed amount, what are
the main conditions that cause it to vary? I did not see this addressed in
the {spgrass6} documentation, so I'm assuming it's a base R limitation, but
maybe not? Quantifying this limitation will help determine how to move
forward (e.g., resampling at smaller scale versus subsetting my DEM
derivatives).

Thanks,
Sam

|-----Original Message-----
|From: Pierre Roudier [mailto:pierre.roudier at gmail.com]
|Sent: Tuesday, May 10, 2011 3:40 PM
|To: Sam Prentice
|Cc: r-sig-geo at r-project.org
|Subject: Re: [R-sig-Geo] readRAST6() in {spgrass6}
|
|Could that message have anything to do with the fact that this is a layer
|from the PERMANENT mapset?
|
|My two cents,
|
|Pierre
|
|2011/5/11 Sam Prentice <sep at umail.ucsb.edu>:
|> Hi,
|>
|>
|>
|> I'm running R 2.12.2 via Tinn-R on Windows Server 2008. I'm using R
|> for cluster analysis for terrain classification and I'm getting the
|> following error when parsing GRASS data into R:
|>
|>
|>
|> > x = readRAST6(c("param_elev3","param_crosC_1m","param_longC_1m",
|>     "param_slope5","param_profC","param_miniC_1m","param_maxiC_1m"))
|> D:/GRASSdata/Sedgwick2/PERMANENT/.tmp/param_elev3 has GDAL driver GTiff
|> and has 6224 rows and 6242 columns
|>
|> Error in deleteDataset(DS) :
|>     GDAL Error 1: Deleting
|> D:/GRASSdata/Sedgwick2/PERMANENT/.tmp/param_elev3 failed:
|> Permission denied
|>
|>
|>
|> On the surface this looked like an issue with user privileges, since I
|> do not have admin-level user privileges on this machine. However, this
|> has been corrected - I now have permissions on the .tmp directory
|> listed in the error, and I can create, append, and delete any file in
|> that location, but the error is still occurring.
|>
|>
|>
|> Thoughts?
|>
|>
|>
|> Thanks,
|>
|> Sam
|>
|>
|>        [[alternative HTML version deleted]]
|>
|> _______________________________________________
|> R-sig-Geo mailing list
|> R-sig-Geo at r-project.org
|> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
|>
|
|
|
|--
|Scientist
|Landcare Research, New Zealand
#
It is hard to know what the upper limit is on Windows, as the memory
is usually fragmented by the operating system.

Since you are using GRASS, I would first try a coarsened version of
your terrain-shape rasters via the built-in re-sampling functionality.
While this is not ideal, as it is NN-based, it will give a quick way
to try out methods.

After you have initialized your GRASS connection in R, execute something like:

system('g.region res=10 -ap')

This will align the region along a 10x10 map unit grid. All further
interaction with raster data will be automatically re-sampled to this
grid system.

Dylan
On Mon, May 16, 2011 at 2:25 PM, Sam Prentice <sep at umail.ucsb.edu> wrote:
#
On Mon, 16 May 2011, Dylan Beaudette wrote:

            
Correct, the upper limit on other OS is higher in practice.
Right, for some res=, you will get a systematic sample that will let you 
decimate your data. Randomly shifting the window while resampling will 
help you see whether your classification model boundaries change.
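
To illustrate the idea of a systematic sample with a shifted window, here is a plain-R analogue on a synthetic matrix. It is only a sketch: it is not what GRASS does internally when the region resolution changes, and the function name and offsets are made up for illustration.

```r
# Keep every `step`-th row and column of a matrix, starting from an
# optional offset, so that the sampling window can be shifted.
decimate <- function(m, step, row_off = 0, col_off = 0) {
  m[seq(1 + row_off, nrow(m), by = step),
    seq(1 + col_off, ncol(m), by = step), drop = FALSE]
}

# Synthetic stand-in for one raster layer (not real data).
full <- matrix(seq_len(500 * 402), nrow = 500, ncol = 402)

s0 <- decimate(full, 10)                  # window anchored at the first cell
off <- sample(0:9, 2)                     # random shift in rows and columns
s1 <- decimate(full, 10, off[1], off[2])  # same spacing, shifted window

dim(s0)
dim(s1)
```

Fitting the classification on `s0` and on several shifted samples like `s1`, and comparing the resulting class boundaries, is one way to see how stable they are.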

As both Dylan and I have said, there is no good reason for calibrating the 
classification model with more data than necessary, even if it were 
possible. The only reason to use very much data would be if discrimination 
between very rare classes is your target, but even then you could stratify 
your sample to get better representation of critical areas.

You seem to be looking for classification signatures, from which you are 
going to predict. Sampling will give you distributions of these 
signatures, which could be used for simulation by tile. If you are very 
concerned about the statistical quality of your classification signatures, 
you could go fuzzy, but your main goal is to get to the distributions of 
these signatures, IMO. Once you have them, you can predict. You do not 
need to have GBs of data in memory to do this adequately.
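
One possible reading of this advice, sketched in plain R on synthetic data: calibrate the clusters on a sample of cells, then assign the remaining cells to the nearest cluster centre one block at a time. Everything here is an assumption made for illustration: the data are random numbers rather than terrain derivatives, `kmeans()` stands in for whatever clustering method is actually used, and the number of classes, sample size and block size are arbitrary.

```r
set.seed(1)

# Synthetic stand-in for a raster stack: one row per cell, one column per layer.
n_cells <- 20000
cells <- cbind(slope = runif(n_cells, 0, 45),
               profC = rnorm(n_cells),
               crosC = rnorm(n_cells))

# 1. Calibrate on a random sample of cells, not on every cell.
idx    <- sample(n_cells, 2000)
centre <- colMeans(cells[idx, ])
spread <- apply(cells[idx, ], 2, sd)
fit    <- kmeans(scale(cells[idx, ], centre, spread),
                 centers = 4, nstart = 5, iter.max = 50)

# 2. Predict: assign each cell in a block to its nearest cluster centre.
predict_block <- function(block) {
  z <- scale(block, centre, spread)
  d <- sapply(seq_len(nrow(fit$centers)), function(k)
    rowSums(sweep(z, 2, fit$centers[k, ])^2))
  max.col(-d, ties.method = "first")
}

# 3. Work through all cells one block ("tile") of 5000 cells at a time.
tiles <- split(seq_len(n_cells), ceiling(seq_len(n_cells) / 5000))
classes <- unlist(lapply(tiles, function(i)
  predict_block(cells[i, , drop = FALSE])), use.names = FALSE)

table(classes)
```

Only the sample and one block need to be in memory at any time; with real data each block would be read from a small GRASS region.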

So one could agree that software (OS, R, whatever) is your limitation, but 
it is only a limitation if you are not able to consider more statistical 
approaches to your apparent problem, which is finding out how to classify 
your input, and then to predict to output.

Roger

  
    
#
Thanks for the help. I understand there are more elegant and efficient
approaches to classification, and they are becoming more clear in time, out
of experience and necessity of my research. Such is learning. Nonetheless,
I'm also fairly confident about what I'm trying to achieve in the moment (a
classified landform element map), why (to guide representative soil sampling
of said landform elements), and what my constraints are (impending soil dry
down). I assumed incorrectly that a published approach could be emulated not
with ease, but without a several month time commitment. Please excuse my
naiveté in regards to the tools, and thanks again for your feedback.

Sam 

#
Hello,

After resampling down and subsetting my DEM derivatives using g.region, I am
getting the same error. The first input file ("clip1_er1m") is now 550KB,
and the .tmp file in the error is 800KB. Code and output copied below. Am I
correct now in concluding this is not an issue with data size/structure? Any
further thoughts about how to proceed?

Sam

# PREPEND
require(spgrass6)
require(cluster)

# READ IN SPATIAL DATA FROM GRASS IN R SESSION
initGRASS("C:/Program Files (x86)/GRASS 6.4.1", home = tempdir(),
    gisDbase = "D:/GRASSdata", location = "Sedgwick2", mapset = "PERMANENT",
    override = TRUE)
gmeta6 = gmeta6()
gisdbase    D:/GRASSdata 
location    Sedgwick2 
mapset      PERMANENT 
rows        500 
columns     402 
north       -3569501 
south       -3570001 
west        -19670403 
east        -19670001 
nsres       1 
ewres       1 
projection  +proj=utm +no_defs +zone=10 +a=6378137 +rf=298.257223563
+towgs84=0.000,0.000,0.000 +to_meter=1

#read in raster files
x = readRAST6(c("clip1_er1m","clip1_crosC_1m","clip1_longC_1m","clip1_slope5",
    "clip1_profC","clip1_miniC_1m","clip1_maxiC_1m"))
D:/GRASSdata/Sedgwick2/PERMANENT/.tmp/clip1_er1m has GDAL driver GTiff 
and has 500 rows and 402 columns

Permission denied
Error in deleteDataset(DS) : 
	GDAL Error 1: Deleting
D:/GRASSdata/Sedgwick2/PERMANENT/.tmp/clip1_er1m failed:
Permission denied
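
With data size ruled out, one quick diagnostic is to check whether the R process itself can create and delete a file in the directory named in the error. The sketch below is an assumption about a useful check, not a known fix. The function name is made up, and it does not test whether some other handle still has the GeoTIFF open, which on Windows can also block deletion.

```r
# Try to create and then delete a scratch file in `dir` from this R process.
# Returns a named logical vector showing which steps succeeded.
can_create_and_delete <- function(dir) {
  f <- tempfile(pattern = "rw_check_", tmpdir = dir)
  created <- file.create(f)
  deleted <- created && file.remove(f)
  c(created = created, deleted = deleted)
}

can_create_and_delete(tempdir())
# For the case in this thread one would pass the .tmp directory instead:
# can_create_and_delete("D:/GRASSdata/Sedgwick2/PERMANENT/.tmp")
```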



#
On Tuesday, May 17, 2011, Sam Prentice wrote:
Try something like this and report back:
x <- readRAST6(vname=c("clip1_er1m","clip1_crosC_1m","clip1_longC_1m",
    "clip1_slope5","clip1_profC","clip1_miniC_1m","clip1_maxiC_1m"),
    plugin=TRUE, useGDAL=FALSE)

Dylan
#
On Tue, 17 May 2011, Sam Prentice wrote:

            
Fortunately, you have now given some useful information, suggesting that 
you have not understood how to use the interface.

I do not have your location, so cannot check. It appears that your 
workflow is highly non-standard - which you have not made clear 
previously. If you are using spgrass6 to interface R with an existing 
GRASS location - which seems to be the case, you should not use 
initGRASS(), and should not use the PERMANENT mapset. The initGRASS() 
function is for creating one-time, throwaway GRASS locations to use GRASS 
modules from R when no GISDBASE or LOCATION is required. Under Linux, I 
cannot reproduce this problem; in addition, you seem to be using a Windows 
server OS, to which I have no access.

Start GRASS under Windows from the MSYS/GRASS icon, selecting the location 
and mapset (not PERMANENT) in the usual way for interactive use. Then 
start R from the MSYS console prompt - perhaps other possibilities exist 
for starting R within GRASS for the Windows native build of 6.4.1. R will 
then be "inside" the GRASS session defined by the GRASS environmental 
variables and the GISRC file. I do not know if you can run non-interactively 
on Windows; it isn't a common situation. If you can start GRASS from a 
Windows console/terminal - interactively, next start R from the GRASS 
prompt - you will need to be very careful to set PATH variables. Windows 
by definition is supposed to be run interactively, as its name suggests.
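
A small check, run from the R prompt, can confirm whether R sees the environment variables that a GRASS session exports (GISBASE and GISRC). This is a minimal sketch with a made-up function name. It only shows that the variables are set, not that they point at the intended location and mapset.

```r
# TRUE when both variables exported by a running GRASS session are non-empty.
in_grass_session <- function() {
  all(nzchar(Sys.getenv(c("GISBASE", "GISRC"))))
}

if (!in_grass_session()) {
  message("GISBASE/GISRC not set: R was probably not started from a GRASS prompt")
}
```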

Finally, this is not the correct list for detailed R/GRASS questions - you 
are asking about things which are very specific. Use this list instead:

http://lists.osgeo.org/mailman/listinfo/grass-stats

Roger

  
    
#
On Tue, 17 May 2011, Dylan Beaudette wrote:

            
Dylan:

Sam will not have a GRASS/GDAL plugin on Windows - it isn't in the 
standard rgdal binary, and Sam will not want to compile from source, I 
think. I believe that the problem is that initGRASS() is not appropriate 
here - see my reply to Sam.

Roger

  
    
#
On Tuesday, May 17, 2011, Roger Bivand wrote:
Thanks for the clarification, Roger. It seems that I may have given some bad 
advice to Sam, based on my lack of experience with tying GRASS-R together on 
Windows...

Dylan

  
    
#
On Tuesday, May 17, 2011, Roger Bivand wrote:
This is good to know. I may have given some bad advice based on _not_ 
understanding this concept.

I'll try and cook-up a related example with the raster library that may 
generalize better across platforms.

Dylan