Dear Emmanuel,
I have the same problem: I either cannot run processing on a large data set in R, or I cannot even
load such data into R at all. And any geostatistics then takes forever. R (gstat/geoR) is simply
not as efficient with large spatial data as, e.g., GIS software.
What you can definitely try is to subset your point data randomly, e.g.:
points.sub <- points[runif(length(points@data[[1]]))<0.1] # 10% sample!
This will allow you to fit variograms etc.
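With sp and gstat, the subsampling and variogram fitting could look roughly like this (a sketch only: it assumes `points` is a SpatialPointsDataFrame and that the attribute column is called `depth` -- adjust the names to your data):

```r
library(sp)
library(gstat)

# keep a random ~10% of the points:
sel <- runif(nrow(points@data)) < 0.1
points.sub <- points[sel, ]

# fit a variogram on the subsample only (initial psill/range/nugget
# values below are placeholders -- eyeball your sample variogram first):
v <- variogram(depth ~ 1, points.sub)
vgm.fit <- fit.variogram(v, model = vgm(psill = 1, "Sph",
                                        range = 1000, nugget = 0.1))
plot(v, vgm.fit)
```

Fitting on a 10-15k subsample is usually enough to get a stable variogram; the full 157k points are only needed at the prediction step.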
Then, if you really want to interpolate all of your 157k points, you might consider using SAGA. For
example, you can also randomly subset points from a shapefile:
# too many points; subset to 5% ("Split Shapes Layer Randomly" in SAGA),
# then krige the result; NB: check the kriging module index for your
# SAGA version with rsaga.get.modules("geostatistics_kriging"):
rsaga.geoprocessor(lib="geostatistics_kriging", module=0,
   param=list(SHAPES="var.shp", BVARIANCE=F, BLOCK=F, FIELD=1, BLOG=F,
   MODEL=1, TARGET=0, NPOINTS_MIN=10, NPOINTS_MAX=60,
   NUGGET=rvgm.Pb$psill[1], SILL=1.65, RANGE=1238, MAXRADIUS=50000,
   USER_CELL_SIZE=cell.size,
   USER_X_EXTENT_MIN=gridmaps@bbox[1,1]+cell.size/2,
   USER_X_EXTENT_MAX=gridmaps@bbox[1,2]-cell.size/2,
   USER_Y_EXTENT_MIN=gridmaps@bbox[2,1]+cell.size/2,
   USER_Y_EXTENT_MAX=gridmaps@bbox[2,2]-cell.size/2))
# in the same way you can run regression-kriging/universal kriging;
You will soon notice a big difference in efficiency between SAGA and R: SAGA will interpolate your
157k points within a few minutes or less. On the other hand, SAGA has very limited geostatistical
functionality (for example, it cannot fit variograms), so what you really need is a combination of
SAGA and R!
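To combine the two, you can fit the variogram in R on a subsample, write the full point data set out of R, and then drive SAGA from R via the RSAGA package. A rough sketch (module names and indices differ between SAGA versions, so always check them first):

```r
library(RSAGA)
library(rgdal)

# export the full point data set so SAGA can read it
# (assumes 'points' is a SpatialPointsDataFrame):
writeOGR(points, dsn=".", layer="var", driver="ESRI Shapefile")

# see which kriging modules your SAGA installation offers:
rsaga.get.modules("geostatistics_kriging")

# inspect the parameters of a module before calling it:
rsaga.get.usage("geostatistics_kriging", 0)
```

The fitted nugget/sill/range from R (e.g. `vgm.fit$psill`, `vgm.fit$range`) then go straight into the NUGGET/SILL/RANGE parameters of the SAGA kriging call shown above.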
Here are more examples:
http://geomorphometry.org/view_scripts.asp?id=24
HTH,
T. Hengl
http://spatial-analyst.net
-----Original Message-----
From: r-sig-geo-bounces@stat.math.ethz.ch [mailto:r-sig-geo-bounces@stat.math.ethz.ch] On Behalf
Of Poizot Emmanuel
Sent: Thursday, July 16, 2009 9:55 AM
To: r-sig-geo@stat.math.ethz.ch
Subject: [R-sig-Geo] Problem with size of dataset
Dear all,
I would like to perform a geostatistical analysis using R.
To do so, I classically use the geoR or gstat packages.
In the present case, I have a dataset of around 157000 locations where I
have a value (depth).
I have not been able to create a valid geodata or gstat R object,
apparently because of the size of the dataset.
Does anybody have an idea of how to conduct such a study with a
dataset of this size?
Regards
--
Cordialement
------------------------------------------------
Emmanuel Poizot
Cnam/Intechmer
B.P. 324
50103 Cherbourg Cedex
Phone (Direct) : (00 33)(0)233887342
Fax : (00 33)(0)233887339
------------------------------------------------