merging tables by columns AND row names (coordinates)
On Fri, 8 Sep 2006, Mikkel Grum wrote:
merge(Table1, Table2,
      by = c("XCOORD", "YCOORD"),  # match rows on both coordinate columns
      all = TRUE)                  # keep coordinate pairs missing from either table
It might not handle the amount of data you have, but,
if your tables are normal dataframes, it would do the
job with a smaller dataset. It doesn't work with
Spatial*DataFrames (yet?).
I would be wary of this with coords as floating point, because they ought to be snapped together. I believe that the original data were from a regular grid with missing cells. If that is the case, and the coordinates can be mapped to integer row and column IDs, then certainly your route will work. You are right that there is as yet no cbind/rbind/merge facility for Spatial*DataFrames.

Roger
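A minimal sketch of the snapping idea above, assuming the coordinates really do sit on a regular grid whose origin and cell size are known. The helper name snap_to_grid, the origin (x0, y0) and the cell sizes are hypothetical: dx = 0.012875 is only inferred from the spacing of the sample X values in the original post, and dy is a placeholder, since the sample shows a single Y value.

```r
# Hypothetical helper: map floating-point coordinates to integer column/row IDs
# on a regular grid with origin (x0, y0) and cell size (dx, dy).
snap_to_grid <- function(df, x0, y0, dx, dy) {
  df$COL <- as.integer(round((df$XCOORD - x0) / dx))
  df$ROW <- as.integer(round((df$YCOORD - y0) / dy))
  df
}

# Two toy tables; the second has slightly perturbed X values and one missing cell.
Table1 <- data.frame(XCOORD = c(27.47500, 27.48788, 27.50075),
                     YCOORD = 42.52641,
                     OBSERVATION = c(177, 177, 179))
Table2 <- data.frame(XCOORD = c(27.475001, 27.500749),
                     YCOORD = 42.52641,
                     OBSERVATION = c(233, 345))

x0 <- 27.475;   y0 <- 42.52641   # assumed grid origin
dx <- 0.012875; dy <- 0.012875   # assumed cell sizes (dy is illustrative only)

t1 <- snap_to_grid(Table1, x0, y0, dx, dy)
t2 <- snap_to_grid(Table2, x0, y0, dx, dy)

# Merge on the integer IDs rather than on the floating-point coordinates.
merged <- merge(t1[, c("COL", "ROW", "XCOORD", "YCOORD", "OBSERVATION")],
                t2[, c("COL", "ROW", "OBSERVATION")],
                by = c("COL", "ROW"), all = TRUE,
                suffixes = c("1", "2"))
merged <- merged[order(merged$ROW, merged$COL), ]
```

Merging on XCOORD and YCOORD directly would treat 27.47500 and 27.475001 as different cells; merging on COL and ROW puts them in the same row, and the cell absent from Table2 gets NA in OBSERVATION2.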
Mikkel

--- Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Fri, 8 Sep 2006, Michael Sumner wrote:
Hello, I can think of a couple of simple-minded approaches that would take some time - either relying on direct string-matching for the unique coordinates, or by some contrived overlay. However, there are probably far better approaches - a couple of questions:

Can you predefine the set of all unique coordinates without reading all the tables from file? - if so you might simplify the identification of each individual coordinate, for matching the records.

Are the coordinates (intended to be) on a regular grid? (This seems unlikely, although it is nearly true given your X coordinates).

The key question is what the data are. To me they look like a global regular grid with some slippage in the print() - the underlying diff() of the unique x's and y's is almost certainly regular. I'm not sure why they are in text files either (model output?). But some bits of the grid may be missing, the question being whether this is regular.

If as an earlier response indicated different data sets have different grid cells missing, then we need the overall grid to start with, then grab the row and column indices (and/or grid index), and attach these to the data rows.

If the solution needs to be robust, and have a longer term utility, I would go for using MySQL, Terralib, and aRT. The data representation is that of the Terralib Cell object, so the question would be how to upload to the database from the text files. aRT is at:

http://www.est.ufpr.br/aRT/

By the way, 1M by 100 by 8 bytes is pushing 32-bit R - but handing off a lot of the data storage to a database relieves this greatly.

Roger
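The diff() check and the grid-index idea described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the X values are the printed (rounded) ones from the original post, the tolerance, the grid dimensions, dy, and the helper name cell_index are all hypothetical, and the second toy table is made up.

```r
# Sample X values from the original post (as printed, rounded to 5 decimals).
x <- c(27.47500, 27.48788, 27.50075, 27.51362, 27.52650, 27.53937)

# Check regularity: the spacing should be constant up to print rounding.
steps <- diff(sort(unique(x)))
dx <- mean(steps)
regular <- all(abs(steps - dx) < 1e-4)   # tolerance is an assumption

# Assumed overall grid: ncol_grid columns by nrow_grid rows starting at (x0, y0).
x0 <- min(x); y0 <- 42.52641
dy <- dx                                  # dy = dx is purely illustrative
ncol_grid <- 6L; nrow_grid <- 2L

# Hypothetical helper: one grid index (1-based, row-major) per record.
cell_index <- function(xc, yc) {
  col <- as.integer(round((xc - x0) / dx))
  row <- as.integer(round((yc - y0) / dy))
  row * ncol_grid + col + 1L
}

# Toy tables with different cells missing.
tables <- list(
  data.frame(XCOORD = x[1:3], YCOORD = y0, OBSERVATION = c(177, 177, 179)),
  data.frame(XCOORD = x[c(1, 3, 6)], YCOORD = y0, OBSERVATION = c(233, 345, 123))
)

# One row per grid cell, one column per table; absent cells stay NA.
obs <- matrix(NA_real_, nrow = ncol_grid * nrow_grid, ncol = length(tables))
for (k in seq_along(tables)) {
  idx <- cell_index(tables[[k]]$XCOORD, tables[[k]]$YCOORD)
  obs[idx, k] <- tables[[k]]$OBSERVATION
}
```

Because each table is written into a preallocated matrix by index, no pairwise merge is needed. At the full size discussed in the thread (about 1,000,000 cells by 100 tables of 8-byte doubles) the matrix would take roughly 800 MB, which is the memory pressure mentioned above.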
Cheers, Mike.

isidora k wrote:
Hi everyone!

I have 100 tables of the form:

XCOORD,YCOORD,OBSERVATION
27.47500,42.52641,177
27.48788,42.52641,177
27.50075,42.52641,179
27.51362,42.52641,178
27.52650,42.52641,180
27.53937,42.52641,178
27.55225,42.52641,181
27.56512,42.52641,177
27.57800,42.52641,181
27.59087,42.52641,181
27.60375,42.52641,180
27.61662,42.52641,181
..., ..., ...

with approximately 1000000 observations for each. All these tables have the same xcoord and ycoord and I would like to get a table of the form

XCOORD,YCOORD,OBSERVATION1,OBSERVATION2,...
27.47500,42.52641,177,233,...
27.48788,42.52641,177,345,...
27.50075,42.52641,179,233,...
27.51362,42.52641,178,123,...
27.52650,42.52641,180,178,...
27.53937,42.52641,178,...,...
27.55225,42.52641,181,...
27.56512,42.52641,177,...
27.57800,42.52641,181,...
27.59087,42.52641,181,...
27.60375,42.52641,180,...
27.61662,42.52641,181,...

In other words I would like to merge all the tables taking into account the common row names of their xcoords AND ycoords. Not all tables have the same number of observations which means that not all pairs of x and y coords match. Is there a way to do this in R? I would be grateful for any advice.

Many thanks
Isidora
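For small inputs, the wide layout asked for here can be sketched in base R by renaming each OBSERVATION column and folding merge() over a list of tables. This is an illustration, not a tested solution for 100 tables of a million rows: the toy data below stand in for the real files, and the join relies on the coordinates matching exactly, which the replies in this thread warn is fragile for floating-point values.

```r
# Toy stand-ins for the 100 tables; in practice each would be read from file.
tables <- list(
  data.frame(XCOORD = c(27.47500, 27.48788, 27.50075),
             YCOORD = 42.52641, OBSERVATION = c(177, 177, 179)),
  data.frame(XCOORD = c(27.47500, 27.50075),
             YCOORD = 42.52641, OBSERVATION = c(233, 345)),
  data.frame(XCOORD = c(27.48788, 27.50075),
             YCOORD = 42.52641, OBSERVATION = c(123, 178))
)

# Rename OBSERVATION to OBSERVATION1, OBSERVATION2, ... so the columns stay distinct.
for (k in seq_along(tables)) {
  is_obs <- names(tables[[k]]) == "OBSERVATION"
  names(tables[[k]])[is_obs] <- paste0("OBSERVATION", k)
}

# Full outer join across all tables on both coordinate columns.
wide <- Reduce(function(a, b) merge(a, b, by = c("XCOORD", "YCOORD"), all = TRUE),
               tables)
wide <- wide[order(wide$YCOORD, wide$XCOORD), ]
```

Coordinate pairs that are absent from a given table show up as NA in that table's column.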
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
-- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no