
Large Data Sets

3 messages · Caroline Keef, Michael Sumner, Roger Bivand

#
I am having trouble manipulating a large data set to change a spatial
polygons data frame.

I have read in an ESRI polygon shapefile using rgdal and readOGR that
contains 18804 polygons.  I am trying to add two variables into the
spatial data frame but it's not working.  I'm doing this using

cbind(SpatialPolygonsDataFrame,new variable,rank(new variable)).

Before I killed R, Task Manager said it was using more than 1,050,000K
virtual memory, which, given that my whole computer ground to a halt, I'm
guessing is the limit!

I regularly handle larger data frames than this using R without any
problem so I don't think the problem is R in general.  

Does anyone have any suggestions as to either the size of the data set
that it is possible to handle as a spatial data frame or a better way of
combining spatial data frames?

Thank you

Caroline
 
JBA Consulting - Engineers and Scientists
South Barn, Broughton Hall, Skipton, North Yorkshire, BD23 3AE, UK
t: +44 (0)1756 799919   f: +44 (0)1756 799449
 
#
Caroline Keef wrote:
Try

spdf$newvariable <- newvariable
spdf$ranknewvariable <- rank(newvariable)

or similar, depending on your actual variable and column names, and the 
workflow used to generate your vectors.

This would be identical, exploiting the features of data frames via SPDF 
methods (again, depending on your actual names):

spdf[["newvariable"]] <- newvariable
spdf[["ranknewvariable"]] <- rank(newvariable)


Using cbind like that looks like it will coerce to matrix, and it's not 
going to do what you want. You could access the underlying spdf@data 
more directly, but using the method syntax for data frames is advisable.
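
A minimal sketch of the assignment approach, assuming only the sp package is installed. The three unit squares, the helper sq() and the values in newvariable are made up for illustration and are not from the original data:

```r
library(sp)

# Hypothetical toy data: three unit squares with polygon IDs "a", "b", "c".
sq <- function(x0, id) {
  ring <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(ring)), ID = id)
}
ids <- c("a", "b", "c")
polys <- SpatialPolygons(list(sq(0, "a"), sq(1, "b"), sq(2, "c")))
spdf <- SpatialPolygonsDataFrame(polys,
                                 data.frame(area = rep(1, 3), row.names = ids))

# Hypothetical new variable, one value per polygon, in polygon order.
newvariable <- c(3.2, 1.5, 2.7)

# Data-frame style assignment through the SPDF methods.
spdf$newvariable <- newvariable
spdf[["ranknewvariable"]] <- rank(newvariable)

class(spdf)   # the object keeps its spatial class
names(spdf)   # column names of the attribute data
spdf@data     # the underlying data frame, shown here only for inspection
```

The new values must be in the same order as the polygons, since the assignment is positional.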

Cheers, Mike.
#
On Tue, 4 Sep 2007, Caroline Keef wrote:

            
If we call your SpatialPolygonsDataFrame "x", then why not do:

x$nv <- new_variable
x$rnv <- rank(new_variable)

cbind is an S3 generic function, but no methods are defined for the 
SpatialPolygonsDataFrame class.

You can use the spCbind() method in the maptools package, but first you 
need to put your new variables into a data frame, and make sure that the 
row names of the data frame in the data slot of the 
SpatialPolygonsDataFrame object agree with those of the new data frame. 
You can also use spCbind on a single vector:

x1 <- spCbind(x, new_variable)

but

x$nv <- new_variable

seems easier.
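
The row-name check mentioned above can be sketched with sp alone. This is only an illustration under assumptions: the toy polygons, the helper sq() and the values are hypothetical, and the columns are assigned one by one rather than through spCbind(), so that the sketch does not depend on maptools being installed:

```r
library(sp)

# Hypothetical toy object: three unit squares with polygon IDs "a", "b", "c".
sq <- function(x0, id) {
  ring <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(ring)), ID = id)
}
ids <- c("a", "b", "c")
x <- SpatialPolygonsDataFrame(
  SpatialPolygons(lapply(seq_along(ids), function(i) sq(i - 1, ids[i]))),
  data.frame(area = rep(1, 3), row.names = ids)
)

# Hypothetical new variables that arrive in a different row order.
new_variable <- c(c = 2.7, a = 3.2, b = 1.5)
new_df <- data.frame(nv = new_variable, rnv = rank(new_variable),
                     row.names = names(new_variable))

# Polygon IDs of the spatial object, read from the polygons slot.
poly_ids <- sapply(slot(x, "polygons"), slot, "ID")

# Reorder the new rows to the polygon IDs, then confirm the names agree.
new_df <- new_df[poly_ids, , drop = FALSE]
stopifnot(identical(row.names(new_df), poly_ids))

# With the rows aligned, add the columns one at a time.
for (nm in names(new_df)) x[[nm]] <- new_df[[nm]]

x@data
```

Reordering by ID before assigning guards against silently attaching values to the wrong polygons.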

cbind methods do exist for SpatialGridDataFrame objects, but they are for 
cbind'ing two SpatialGridDataFrame objects together, rather than for 
arguments of arbitrary classes; they stop you if the GridTopologies are 
not identical.
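
A small sketch of the grid case, assuming the sp package and made-up 2 x 2 grids (the object names g1, g2 and g3 are hypothetical):

```r
library(sp)

# Two hypothetical grids sharing one 2 x 2 topology.
gt <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                   cellsize = c(1, 1), cells.dim = c(2, 2))
g1 <- SpatialGridDataFrame(gt, data.frame(a = 1:4))
g2 <- SpatialGridDataFrame(gt, data.frame(b = 5:8))

# Same topology: the attribute columns are combined.
g12 <- cbind(g1, g2)
names(g12)

# A grid with a different cell size should be refused.
gt_other <- GridTopology(cellcentre.offset = c(1, 1),
                         cellsize = c(2, 2), cells.dim = c(2, 2))
g3 <- SpatialGridDataFrame(gt_other, data.frame(d = 9:12))
res <- try(cbind(g1, g3), silent = TRUE)
inherits(res, "try-error")
```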

Hope this helps,

Roger
As for the memory use, the default cbind method had got completely lost in 
the input object.