
Polygon hole=TRUE not being preserved?

5 messages · Don MacQueen, Jochen Albrecht, Roger Bivand

#
I'm puzzled by an apparent change of @hole from TRUE to FALSE when a 
Polygon object becomes part of a Polygons object.

Here's an example, adapted from ?overlay

r2 <- cbind(c(180042, 180545, 180553, 180314, 179955, 179142, 179437,
   179524, 179979, 180042),
   c(332373, 332026, 331426, 330889, 330683,
     331133, 331623, 332152, 332357, 332373))

r3 <- cbind( c( 179662.7, 179759.0, 180188.4, 180179.6, 179662.7),
   c(331630.3, 331270.5, 331428.4, 331674.1, 331630.3)
   )

sr2 <- Polygons(list(Polygon(r2)), "r2")

p3 <- Polygon(r3, hole = TRUE)
sr3 <- Polygons(list(p3), "r3")


Then check with str():
Formal class 'Polygon' [package "sp"] with 5 slots
   ..@ labpt  : num [1:2] 179930 331501
   ..@ area   : num 148546
   ..@ hole   : logi TRUE
   ..@ ringDir: int -1
   ..@ coords : num [1:5, 1:2] 179663 179759 180188 180180 179663 ...
Note that @hole is TRUE. But after wrapping in a Polygons object, str() shows:

Formal class 'Polygons' [package "sp"] with 5 slots
   ..@ Polygons :List of 1
   .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
   .. .. .. ..@ labpt  : num [1:2] 179930 331501
   .. .. .. ..@ area   : num 148546
   .. .. .. ..@ hole   : logi FALSE
   .. .. .. ..@ ringDir: int 1
   .. .. .. ..@ coords : num [1:5, 1:2] 179663 180180 180188 179759 179663 ...
   ..@ plotOrder: int 1
   ..@ labpt    : num [1:2] 179930 331501
   ..@ ID       : chr "r3"
   ..@ area     : num 148546

@hole is now FALSE
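For what it's worth, sp appears to infer ring direction from the signed (shoelace) area, with counter-clockwise rings (ringDir -1) as hole candidates and clockwise rings (ringDir 1) as exterior rings; note that the Polygons output above shows the coordinates reversed into clockwise order. A minimal base-R check of that convention (ring_dir is my own helper under that assumption, not sp code):

```r
# Shoelace test of ring direction, mirroring sp's convention as I
# understand it: counter-clockwise -> ringDir -1 (hole candidate),
# clockwise -> ringDir 1 (exterior ring).
ring_dir <- function(xy) {
  x <- xy[, 1]; y <- xy[, 2]
  a2 <- sum(x * c(y[-1], y[1]) - c(x[-1], x[1]) * y)  # twice the signed area
  if (a2 > 0) -1L else 1L
}

ccw <- cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0))  # counter-clockwise square
ring_dir(ccw)          # -1, the hole direction
ring_dir(ccw[5:1, ])   #  1, the exterior direction
```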


I noticed this when continuing with the adapted example from ?overlay:

sr <- SpatialPolygons(list(sr2, sr3))
srdf <- SpatialPolygonsDataFrame(sr, data.frame(i = cbind(1:2, 4:3),
    row.names = c("r2", "r3")))

When I plot with

plot(srdf, col = 'red')

the red fill covers the entire outer polygon. I was expecting a hole in
the fill -- based on remembering this from the last time I did this,
which was *not* recent.

Am I doing something wrong?

Thanks
-Don

Session information:


Sampling[327]% R --no-save --vanilla

R version 2.10.1 (2009-12-14)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

<- cut ->
R version 2.10.1 (2009-12-14)
i386-apple-darwin8.11.1

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] maptools_0.7-29 lattice_0.17-26 sp_0.9-57       foreign_0.8-39

loaded via a namespace (and not attached):
[1] grid_2.10.1
Warning message:
'DESCRIPTION' file has 'Encoding' field and re-encoding is not possible


Note, finally, that although CRAN shows sp at 0.9-60, I have 
installed from three different mirrors and in all cases get 0.9-57. I 
don't know what the issue is there. I have not (yet) tried a manual 
download and install of 0.9-60.

I doubt that the warning message from sessionInfo() is relevant.
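Following up on the hole question above, my understanding is that a hole ring only takes effect when it sits in the same Polygons object as the exterior ring that encloses it, not in a Polygons object of its own. A sketch under that assumption (using r2 and r3 as defined above; not verified against every sp version):

```r
library(sp)

# r2 and r3 as defined earlier in the thread; the hole ring is placed
# in the same Polygons object as its enclosing exterior ring.
sr23 <- Polygons(list(Polygon(r2), Polygon(r3, hole = TRUE)), "r2")
sp23 <- SpatialPolygons(list(sr23))
plot(sp23, col = "red")  # the fill should now show a hole where r3 lies
```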
#
I am trying to do a natural breaks (Jenks) classification on a data set 
with some 300,000 observations. I started with
     salescat <- classIntervals(sales, n = 100, style = "jenks")
and cut this process off after it ran for 12 hours. Then I tried it with 
just ten classes but this made no difference.
With a subset of just 300 observations, it runs for 36 seconds.
For 1,000 records, it runs for seven minutes and then throws the
following error:
     Error in if (mat2[l, j] >= (v + mat2[i4, j - 1])) { :
       missing value where TRUE/FALSE needed
     In addition: Warning message:
     In val * val : NAs produced by integer overflow
I checked the data interactively; there are no missing or non-integer 
values.
ArcMap classifies it instantaneously without hiccups and takes about ten 
seconds for all 300,000 records (though limiting itself to a maximum of 
32 classes).
Do you have any suggestions as to what I am doing wrong or what could be 
done to resolve my problem?
Cheers,
     Jochen
#
On Fri, 26 Feb 2010, Jochen Albrecht wrote:

The "jenks" method is programmed in R, the "fisher" method in Fortran. On 
my laptop, 10000 values into 100 classes with "fisher" ran in under a 
minute. The fact that Arc claims to do things (whose implementation is
not documented) doesn't mean that what you see is what is actually
happening. I believe that in an earlier thread it was suggested that Arc
samples the input data and adds the range so that the extreme values are
included. You could do the same if you like - "fisher" and "jenks" are
very similar.
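A sketch of that sampling idea with classInt (assuming `sales` is the numeric vector from the original post; the sample size of 5000 is an arbitrary choice):

```r
library(classInt)

# Classify a sample of the data, padded with the range so the extreme
# values are certain to be included, using the faster "fisher" style.
set.seed(1)
samp <- c(sample(sales, 5000), range(sales))
salescat <- classIntervals(samp, n = 100, style = "fisher")
```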

Hope this helps,

Roger
PS: Integer overflow is caused by the implementation, not by the data
PPS: It gives one classification among the many possible "natural breaks" 
classifications, for a version of Jenks. Whether this is misleading or 
not is a different question, as is the legibility of a graphic with 100 
classes and 300,000 discrete entities (unless raster, where entities do 
not differ in shape).
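The integer-overflow point in the PS is easy to reproduce in base R, and suggests a general workaround of converting the input to double before classifying (an observation about R's integer arithmetic, not a tested classInt fix):

```r
# R's 32-bit integer arithmetic overflows to NA with a warning;
# the same product in double precision is exact here.
x <- 46341L                     # 46341^2 just exceeds .Machine$integer.max
x * x                           # NA, with "NAs produced by integer overflow"
as.numeric(x) * as.numeric(x)   # 2147488281
```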