Currently if I print a spatial polygon data frame I get the list
representation, which almost always scrolls way off the screen as giant
lists of lists of coordinates whizz past. It's nearly always useless
and luckily ESS lets me C-c C-o and zap the output. For
SpatialPointsDF you get:
     coordinates letters LETTERS
1 (1, 0.0486677)       a       A
2  (2, 0.520911)       b       B
3  (3, 0.207873)       c       C
4  (4, 0.466571)       d       D
- for spatial polys and lines would it be better to have such a
compact representation as the default print? I'd rather use the word
'geometry' and have it print as a (truncated) pseudo-WKT, something
like:
geometry letters LETTERS
1 POINT(1 0.0486677) a A
2 POINT(2 0.520911) b B
for points, and:
geometry letters LETTERS
1 LINESTRING(...) a A
for lines, and:
geometry letters LETTERS
1 POLYGON(...) a A
for polygons. Or MULTIPOLYGON, whichever is appropriate. I think it
should literally print dot-dot-dot, since for anything other than
points it's going to be voluminous.
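A minimal sketch of this pseudo-WKT layout for points, in base R only. The helper name `wkt_points` and the toy data are hypothetical and not part of sp; a real print method would take the coordinates and attribute table from the Spatial object itself.

```r
# Hypothetical helper: build a printable data frame whose first column is a
# pseudo-WKT "geometry" string, followed by the attribute columns.
wkt_points <- function(coords, data, digits = 10) {
  stopifnot(nrow(coords) == nrow(data))
  # formatC() with format = "g" formats each value on its own and drops
  # trailing zeros, so 0.520911 is not padded to match its neighbours
  x <- formatC(coords[, 1], digits = digits, format = "g")
  y <- formatC(coords[, 2], digits = digits, format = "g")
  geometry <- paste0("POINT(", x, " ", y, ")")
  cbind(data.frame(geometry = geometry, stringsAsFactors = FALSE), data)
}

# Toy data matching the first two rows of the example above
coords <- cbind(x = c(1, 2), y = c(0.0486677, 0.520911))
attrs <- data.frame(letters = c("a", "b"), LETTERS = c("A", "B"),
                    stringsAsFactors = FALSE)
out <- wkt_points(coords, attrs)
print(out)
```

For lines and polygons the same helper could emit a fixed string such as "POLYGON(...)" instead of formatting coordinates.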
Today I am a random idea factory...
Barry
Better print method for Spatial*DataFrames?
11 messages · Barry Rowlingson, Edzer Pebesma, Etienne Bellemare Racine +2 more
8 days later
Nice suggestion! I did this for points (committed to cvs), as an option in print, and get:
options(width=60)
print(meuse[1:3,], sWKT=T)
              geometry cadmium copper lead zinc  elev
1 POINT(333611 181072)    11.7     85  299 1022 7.909
2 POINT(333558 181025)     8.6     81  277 1141 6.983
3 POINT(333537 181165)     6.5     68  199  640 7.800
        dist   om ffreq soil lime landuse dist.m
1 0.00135803 13.6     1    1    1      Ah     50
2 0.01222430 14.0     1    1    1      Ah     30
3 0.10302900 13.0     1    1    1      Ah    150
For (multi)lines / polygons, would it be useful to print the first
coordinate followed by ..., so that some kind of identification is possible?
On 05/18/2010 05:04 PM, Barry Rowlingson wrote:
[...]
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Edzer Pebesma Institute for Geoinformatics (ifgi), University of Münster Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de http://www.52north.org/geostatistics e.pebesma at wwu.de
1 day later
I thought I could add my two cents.
Nice suggestion!
I agree !
options(width=60)
print(meuse[1:3,], sWKT=T)
I don't know what sWKT is, but the following output is the kind of printing I would like by default. Sometimes I make the mistake of printing a spatial polygon data frame and it can take literally 5 minutes to output. So if it could just be the default, I'd be happy.
geometry cadmium copper lead zinc elev
1 POINT(333611 181072) 11.7 85 299 1022 7.909
2 POINT(333558 181025) 8.6 81 277 1141 6.983
3 POINT(333537 181165) 6.5 68 199 640 7.800
dist om ffreq soil lime landuse dist.m
1 0.00135803 13.6 1 1 1 Ah 50
2 0.01222430 14.0 1 1 1 Ah 30
3 0.10302900 13.0 1 1 1 Ah 150
For (multi)lines / polygons, would it be useful to print the first
coordinate followed by ..., so that some kind of identification is possible?
I think it's a good idea, but long output is always a pain to read, so I suggest something compact. Maybe there could be a kind of offset before the display. So if you had something like
POINT(349600.8 5387597)
POINT(349597.0 5387597)
POINT(349590.4 5387595)
POINT(349569.9 5387591)
POINT(349557.1 5387586)
POINT(349548.5 5387581)
POINT(349542.9 5387575)
...
maybe it could print the coordinates as
349000+ 5387500+
POINT(600.8 97)
POINT(597.0 97)
POINT(590.4 95)
POINT(569.9 91)
POINT(557.1 86)
POINT(548.5 81)
POINT(542.9 75)
...
Maybe the coordinate to display should be the "labpt" slot? I think for identification something compact is much more useful.
Talking about compactness, as I don't know of any way to put several geometry types in one Spatial*DataFrame class, is it necessary to repeat POINT, or (MULTI)LINE, or POLYGON? Would it be possible to display only (random thought here) P, M, L, Y? Or S for surface? I don't know. I like compactness!
Also, is it possible to add the same identifier (coordinate) to View()?
Etienne
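The offset idea can be sketched in a few lines of base R. The function name `compact_points` and its arguments are hypothetical; the offsets are passed in explicitly here rather than derived from the data, which is an assumption of this sketch.

```r
# Hypothetical helper: print coordinates relative to a shared offset so that
# each POINT(...) string stays short.
compact_points <- function(x, y, x0, y0, digits = c(1, 0)) {
  # Header line states the offsets once, e.g. "349000+ 5387500+"
  header <- paste0(formatC(x0, format = "f", digits = 0), "+ ",
                   formatC(y0, format = "f", digits = 0), "+")
  geometry <- paste0(
    "POINT(",
    formatC(x - x0, format = "f", digits = digits[1]), " ",
    formatC(y - y0, format = "f", digits = digits[2]), ")")
  list(header = header, geometry = geometry)
}

# First three points from the message above
x <- c(349600.8, 349597.0, 349590.4)
y <- c(5387597, 5387597, 5387595)
res <- compact_points(x, y, x0 = 349000, y0 = 5387500)
cat(res$header, res$geometry, sep = "\n")
```

Choosing the offsets automatically (for example by rounding the bounding box minimum down) is left open, since the thread does not settle on a rule.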
On Fri, 28 May 2010, Etienne Bellemare Racine wrote:
I thought I could add my two cents.
Nice suggestion!
I agree !
No. Only for SpatialPointsDataFrame objects, which is what it does already.
Please, understand that str() is a *much* better choice in effectively all
cases where summary() isn't used. For the Spatial* objects, set a
max.level=2 or similar, and you can *see* what is in it. The proposed
print() method for a big multiband raster will also run away with you. Do
str(), not print()!!!
library(maptools)
xx <- readShapeSpatial(system.file("shapes/sids.shp",
package="maptools")[1], IDvar="FIPSNO",
proj4string=CRS("+proj=longlat +ellps=clrk66"))
summary(xx)
str(xx, max.level=2)
To avoid having to remember to write max.level=2, could someone contribute
a generic str() for S4 Spatial*?
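As a stopgap for remembering max.level=2, a plain wrapper can fix the default. This is only a sketch: `str2` is a hypothetical name, it is an ordinary function rather than a str() method for the S4 Spatial* classes, and a nested list stands in for a Spatial object.

```r
# Hypothetical wrapper: str() with a shallower default nesting depth.
str2 <- function(object, max.level = 2, ...) {
  utils::str(object, max.level = max.level, ...)
}

# A deeply nested list stands in for a Spatial* object here
nested <- list(a = list(b = list(c = list(d = 1))))

full <- utils::capture.output(utils::str(nested))
short <- utils::capture.output(str2(nested))

# The wrapper stops descending after two levels, so its output is shorter
cat(short, sep = "\n")
```

Any other str() argument can still be passed through `...`, and max.level can be overridden per call.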
Roger
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no
On Fri, May 28, 2010 at 5:18 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
No. Only for SpatialPointsDataFrame objects, which is what it does already. Please, understand that str() is a *much* better choice in effectively all cases where summary() isn't used. For the Spatial* objects, set a max.level=2 or similar, and you can *see* what is in it. The proposed print() method for a big multiband raster will also run away with you. Do str(), not print()!!!
I'm not sure what you're saying 'No' to here, Roger. Neither str(xx) nor summary(xx) presents the object as a data frame. Conceptually it's a data frame where one of the columns is a geometry, and seeing it print as such is a good thing (imho). I'd like to never have to use xx@data again!
I'm not sure trying to truncate the coordinates for nice formatting is a good idea though, but some indication when printing a Spatial*DataFrame that it's a data frame with geometries seems a good idea.
Barry
On Fri, 28 May 2010, Barry Rowlingson wrote:
I'm not sure what you're saying 'No' to here, Roger. Neither str(xx) nor summary(xx) presents the object as a data frame. Conceptually it's a data frame where one of the columns is a geometry, and seeing it print as such is a good thing (imho). I'd like to never have to use xx@data again!
Just pragmatics, since things which have rushed off the top of my screen really are not much help, I find. I use as(xx, "data.frame") when needed, but most often subset both observations and variables by "[".
I'm not sure where displaying all the data gets you for more than a trivial number of observations and variables, though? The output will still swamp the console/terminal buffer. I'm thinking of a multi-band raster, but even standard show(meuse.grid) as a data.frame only leaves rows 2605-3103 on screen for a standard gnome-terminal. The data editor I see doesn't have a scroll bar, so to scroll, one would need an external viewer, I think.
In other software systems (octave, Stata, ...), one can turn on and off a more/less screen-by-screen displayer (not scrolling upwards, just chunking), but I'm not aware of an equivalent in R/S. I'm not sure how head() and tail() work in R, and personally use str() by default. If I need to access the coordinates of a particular line or polygon, I print() just that list element (Line or Polygon object).
I can see what you mean, but feel that users will benefit much more by using str(), which is a real gem!
Roger
On Sat, May 29, 2010 at 10:06 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
In other software systems (octave, Stata, ...), one can turn on and off a more/less screen-by-screen displayer (not scrolling upwards, just chunking), but I'm not aware of an equivalent in R/S. I'm not sure how head() and tail() work in R,
They don't seem to work very well at all for Spatial*DataFrames. If I add coordinates to meuse to get a SpatialPointsDataFrame and head(that) I get all the 'rows' but with only the cadmium measurements. It's slicing it the wrong way. Odd.
and personally use str() by default. If I need to access the coordinates of a particular line or polygon, I print() just that list element (Line or Polygon object). I can see what you mean, but feel that users will benefit much more by using str(), which is a real gem!
str is great if you need to know the str-ucture of an R object. But it doesn't even align the values so you can see across rows of your data, which is what I'd like print to do (by analogy with print.data.frame). Currently if I print a SpatialPolygonsDataFrame I get the structure. Print methods should do better than that - you're almost suggesting not having, for example, a print method for data frames and that we'd be better off having what print.default(anyDataFrame) gives us.
So my proposal is that print of a SpatialPolygonsDataFrame class should print like a data frame but with some indicator of the geometry at the start of the row, such as POLYGON(...) - literally with dots, there's no need to spell it out. Similarly for Lines.
Another suggestion is for head() and tail() methods on Spatial*DataFrame objects - I think just subscripting [1:n,] from the object and returning would do it. I think currently head and tail treat these objects as lists and the results are not pretty.
Barry
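A minimal sketch of the subscripting idea for head() and tail(). It uses a toy S4 class (`ToyPointsDF`, hypothetical) rather than the real sp classes, on the assumption that the real objects already support row subsetting with "[" in the same way. Negative n is not handled.

```r
library(methods)

# Toy stand-in for a Spatial*DataFrame: coordinates plus an attribute table
setClass("ToyPointsDF",
         representation(coords = "matrix", data = "data.frame"))

# Row subsetting keeps geometry and attributes in step; j is ignored here
setMethod("[", "ToyPointsDF", function(x, i, j, ..., drop = TRUE) {
  new("ToyPointsDF",
      coords = x@coords[i, , drop = FALSE],
      data = x@data[i, , drop = FALSE])
})

# head() and tail() simply subscript the first or last n rows
setMethod("head", "ToyPointsDF", function(x, n = 6L, ...) {
  nr <- nrow(x@data)
  x[seq_len(min(n, nr)), ]
})

setMethod("tail", "ToyPointsDF", function(x, n = 6L, ...) {
  nr <- nrow(x@data)
  x[seq.int(to = nr, length.out = min(n, nr)), ]
})

pts <- new("ToyPointsDF",
           coords = cbind(x = 1:5, y = c(0.05, 0.52, 0.21, 0.47, 0.90)),
           data = data.frame(id = 1:5, letters = letters[1:5]))

first2 <- head(pts, 2)
last2 <- tail(pts, 2)
print(first2@data)
print(last2@coords)
```

Because both methods return an object of the same class, whatever print method the class has applies to the result unchanged.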
On Sat, 29 May 2010, Barry Rowlingson wrote:
str is great if you need to know the str-ucture of an R object. But it doesn't even align the values so you can see across rows of your data, which is what I'd like print to do (by analogy with print.data.frame). Currently if I print a SpatialPolygonsDataFrame I get the structure. Print methods should do better than that - you're almost suggesting not having, for example, a print method for data frames and that we'd be better off having what print.default(anyDataFrame) gives us. So my proposal is that print of a SpatialPolygonsDataFrame class should print like a data frame but with some indicator of the geometry at the start of the row, such as POLYGON(...) - literally with dots, there's no need to spell it out. Similarly for Lines. Another suggestion is for head() and tail methods on Spatial*DataFrame objects - I think just subscripting [1:n,] from the object and returning would do it. I think currently head and tail treat these objects as lists and the results are not pretty.
Right, because they see S4 objects as lists with no components, only with attributes. str() does have support for S4 objects. They would need to be wrapped around an S4 show/print method, with the output captured, as in capture.output(). Would it make sense to have the default print/show for Spatial* be str() with max.level= set, and for Spatial*DataFrame be the print method for the data slot prepended with some text (perhaps POINT, MULTILINESTRING, MULTIPOLYGON, PIXEL, CELL, or better an abbreviation)? One would do this by cbind()ing the text in front of the as(, "data.frame"), I think, as a "geometry" variable. Roger
Hello,
I am attempting to use the sample code in "Applied Spatial Data Analysis
with R" but cannot get this to work and get this error:
> nc = readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
+ IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
Error in read.dbf(filen) : unable to open DBF file
Any ideas?
Thanks,
Pete
On Sat, 29 May 2010, Peter Larson wrote:
Hello, I am attempting to use the sample code in "Applied Spatial Data Analysis with R" but cannot get this to work and get this error:
nc = readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
+ IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
Error in read.dbf(filen) : unable to open DBF file
Any ideas?
Please update your installed packages - this looks like a mismatch between foreign and maptools. Roger
On 05/29/2010 11:47 AM, Roger Bivand wrote:
Right, because they see S4 objects as lists with no components, only with attributes. str() does have support for S4 objects. They would need to be wrapped around an S4 show/print method, with the output captured, as in capture.output(). Would it make sense to have the default print/show for Spatial* be str() with max.level= set, and for Spatial*DataFrame be the print method for the data slot prepended with some text (perhaps POINT, MULTILINESTRING, MULTIPOLYGON, PIXEL, CELL, or better an abbreviation)?
In the following example:
require(maptools)
nc = readShapePoly(system.file("shapes/sids.shp", package = "maptools")[1],
    IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
str(as(nc, "SpatialPolygons"))
as(nc, "SpatialPolygons")
I personally find the output of the (current) print method
much more readable than that of str. Partly because I've grown
accustomed to it, but also partly because I have never liked the output
of str. I tend to use the current default show method used for
SpatialLines* and SpatialPolygons* (the generic show for S4 objects) to
figure out what the structure of the data is, not how to use it. So I
guess for those who want to use the data without bothering about the
deeper structure, these print methods (both: current show.S4 and str)
are not so useful. If you disagree with this: please respond!
As for Barry's proposal, I find it a bit repetitive (and space
consuming) to have a POINT(1 1) instead of the current (1,1) (which,
credit where credit is due, is from a package Barry wrote that preceded
sp). I can very well understand that many people will not know how to
read WKT [1], as it again is something that programmers tend to find
useful, not users; to be right we need the awfully long words
MULTILINESTRING and MULTIPOLYGON to represent the sp classes, and then
can't write the whole string but need to abbreviate. I agree with Barry
that a representation as much as possible like a data.frame is most useful.
I suggest the following: for points:
geometry attr1 attr2 attr3
PT(234 45) 333 xxy 22.5
PT(455 68) 221 xxx 13.2
for polygons: use PN(3;2335) to express that this MULTIPOLYGON consists
of 3 POLYGONS, and has 2335 coordinates (in total)
geometry attr1 attr2 attr3
PN(3;2335) 333 xxy 22.5
PN(45;345) 221 xxx 13.2
for lines:
geometry attr1 attr2 attr3
LI(3;2335) 333 xxy 22.5
LI (5;345) 221 xxx 13.2
for pixels: use points, replace PT with PX
for grids: don't print all the values, but a very short summary.
To really educate users that we "glue" data.frame attribute tables to
geometries, they need to see this, and therefore I want to print a
SpatialPoints object as:
geometry
PT(234 45)
PT(455 68)
and do the same for SpatialLines and SpatialPolygons:
geometry
PN(3;2335)
PN(45;345)
geometry
LI(3;2335)
LI (5;345)
what head and tail should do is then obvious.
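The PN(parts;coordinates) labels proposed above can be sketched in base R. The helper `geom_label` is hypothetical, and each geometry is represented here as a plain list of two-column coordinate matrices rather than as sp Polygons or Lines objects.

```r
# Hypothetical helper: label one multi-part geometry as TAG(parts;coords),
# where `parts` is a list of two-column coordinate matrices.
geom_label <- function(parts, tag = "PN") {
  n_coords <- sum(vapply(parts, nrow, integer(1)))
  paste0(tag, "(", length(parts), ";", n_coords, ")")
}

# Two toy rings: a closed square (5 vertices) and a closed triangle (4)
square <- cbind(x = c(0, 1, 1, 0, 0), y = c(0, 0, 1, 1, 0))
triangle <- cbind(x = c(2, 3, 2, 2), y = c(0, 0, 1, 0))

# First feature has two parts, second feature has one
polys <- list(list(square, triangle), list(square))

attrs <- data.frame(attr1 = c(333, 221), attr2 = c("xxy", "xxx"))
out <- cbind(geometry = vapply(polys, geom_label, character(1)), attrs)
print(out)
```

Passing `tag = "LI"` gives the corresponding labels for lines.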
Next thing is that developers/programmers need to find out how to print
all the gory details -- they will need to use str(nc) or show(unclass(nc)).
For those from Europe: thank you for all the points in the song contest.
We also like Lena a lot, here at home.
[1] http://en.wikipedia.org/wiki/Well-known_text
Edzer Pebesma Institute for Geoinformatics (ifgi), University of Münster Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de http://www.52north.org/geostatistics e.pebesma at wwu.de