Creating a very large spatial weight matrix
On Thu, 18 Nov 2010, Aleksandr Andreev wrote:
Yes, sorry, I'm running R 2.12.0 on Ubuntu 64-bit (kernel 2.6.32-25-generic)
The actual answer is to use the function intended for this operation:

library(spdep)
coords <- cbind(Lon, Lat)
dnb <- dnearneigh(coords, 0, dmax, longlat=TRUE)

where dmax is a small distance in km. Of course, if you really need all the distances, all bets are off, but that would be an unusually specified picture of the underlying spatial process.

I suggest not worrying about ensuring that all observations have at least one neighbour - for such a global measure as Moran's I with N=120,000, dropping a few cannot matter much. Go with a tight dmax, and it should just work. If dmax is loose and the average number of neighbours creeps up, the nb object (and the following listw object) will get denser, possibly with some observations having thousands of neighbours, thus oversmoothing the process. If this is continental rather than whole-world, consider projecting to the plane and using graph-based neighbours (?graph2nb).

Hope this helps,

Roger
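A minimal end-to-end sketch of that workflow, assuming Lon and Lat are numeric vectors in decimal degrees, x is the variable to be tested (length nrow(coords)), and dmax = 50 km is only a placeholder to be tuned:

library(spdep)

coords <- cbind(Lon, Lat)                        # Lon/Lat in decimal degrees
dmax <- 50                                       # placeholder threshold in km
dnb <- dnearneigh(coords, 0, dmax, longlat=TRUE) # distance-based neighbours

## allow isolated observations rather than loosening dmax
lw <- nb2listw(dnb, style="W", zero.policy=TRUE)

## Moran's I test on x
moran.test(x, lw, zero.policy=TRUE)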
Thanks for pointing out ff.

------------------------
Aleksandr Andreev
Graduate Student - Department of Economics
University of North Carolina at Chapel Hill
Mobile: +1 303 507 93 88
Skype: typiconman

2010/11/18 Michael Sumner <mdsumner at gmail.com>:
And, please report your OS and version of R (64-bit presumably?).

On Fri, Nov 19, 2010 at 10:39 AM, Michael Sumner <mdsumner at gmail.com> wrote:
In general you need at least twice the required memory, and it has to be contiguous. Try with a fresh instance of R, and try to create a single vector of that size; that might show whether you *could* do it. Otherwise, check out the ff package, and see other options in the High Performance Computing Task View on CRAN.

There may be other techniques you can use to solve the problem, but those two things are my direct answers to your questions (a short sketch of both follows below the quoted message).

Cheers, Mike.

On Fri, Nov 19, 2010 at 10:28 AM, Aleksandr Andreev <aleksandr.andreev at gmail.com> wrote:
Hello list,
I have 120,000 geocoded observations, for which I'm trying to create a
distance-based spatial weighting matrix so that I can perform a Moran
test.
Each observation has Lat and Lon.
Unfortunately, when I run
dists <- as.matrix(dist(cbind(Lon, Lat)))
I get the message:
Error in vector("double", length) : vector size specified is too large
Now I realize that 120,000^2 / 2 is about 7.2 billion elements, which as doubles is on the order of 60 GB. However, I
seem to be running into software limitations on the vector size before
I hit RAM limitations. Also, in principle, it should be possible
(though slow) to use hard disk space to store this matrix. Does anyone
have any ideas on how to do this in R?
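As a rough, purely illustrative check of those sizes:

n <- 120000
n * (n - 1) / 2              # lower triangle stored by dist(): ~7.2e9 elements
n * (n - 1) / 2 * 8 / 2^30   # ~54 GiB as 8-byte doubles; as.matrix() roughly doubles this
.Machine$integer.max         # 2147483647 -- R 2.12.0 cannot create vectors longer than this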
Thanks,
------------------------
Aleksandr Andreev
Graduate Student - Department of Economics
University of North Carolina at Chapel Hill
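A minimal sketch of the two suggestions above, with placeholder object and file names; whether ff can actually hold an object of this size depends on available disk space and ff's indexing limits:

## 1. In a fresh R session, test whether the required vector can be allocated at all
n <- 120000
v <- try(numeric(n * (n - 1) / 2))  # fails in R 2.12.0 (vector length limit), or later if RAM is short

## 2. Disk-backed storage with the ff package
library(ff)
dists_ff <- ff(vmode = "double", dim = c(n, n), filename = "dists.ff")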
--
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsumner at gmail.com
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no