On 6 Jul 2020, at 10:47, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Mon, 6 Jul 2020, Graham Leask wrote:
Thank you. That's helpful and confirms my thoughts that this does not
follow the standard structure that can be read by the available existing
functions within R.
For your subset of the input data, this appears to work:
# o is your data subset
geoms <- o[,2]
library(sf)
l_out <- lapply(geoms, function(geom) {
  # replace the comma between lon and lat with a space
  o1 <- gsub("([0-9])(,)([0-9])", "\\1 \\3", geom)
  # all brackets to parentheses
  o2 <- gsub("\\]", ")", gsub("\\[", "(", o1))
  # drop the parentheses between coordinate pairs
  o3 <- gsub("([0-9])\\),\\(([0-9])", "\\1,\\2", o2)
  # three ((())) to one ()
  o4 <- gsub("\\(\\(\\(", "(", gsub("\\)\\)\\)", ")", o3))
  st_as_sfc(paste0("POLYGON", o4))
})
out <- do.call("c", l_out)
plot(out, col=1:6)
st_is_valid(out)
# [1] FALSE FALSE TRUE TRUE TRUE TRUE
out1 <- st_make_valid(out)
plot(out1, col=1:6)
but having only seen the first 6 rows, there may be further problems. Refreshing regular expression knowledge is, I hope, an effective mental exercise ...
Roger
Kind regards
Graham
On Mon, 6 Jul 2020 at 10:23, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Sun, 5 Jul 2020, Graham Leask wrote:
Hi Roger
Here is the file imported from the original .csv file. I suspect it may
have originated as a .qvd file converted to .csv.
The geometry string column contains [] separated geographical coordinates
of somewhat varying formatting (the string versions of first and last
ring coordinates are not always equal, I think), in an undocumented
format.
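The remark about first and last ring coordinates can be checked before any conversion is attempted. A minimal sketch in base R, assuming the coordinates are plain decimals (no exponent notation) and that a ring is a run of comma-separated [x,y] pairs; rings_closed is a hypothetical helper, not an sf function, and the toy string is made up rather than taken from the file:

```r
# Hypothetical helper: for each ring in one geometry string, report whether
# the first and last coordinate pairs are textually identical.
rings_closed <- function(geom) {
  num <- "-?[0-9.]+"                                  # assumes plain decimals
  point <- paste0("\\[", num, ",", num, "\\]")        # one [x,y] pair
  ring_pattern <- paste0(point, "(,", point, ")*")    # a run of pairs
  rings <- regmatches(geom, gregexpr(ring_pattern, geom))[[1]]
  vapply(rings, function(ring) {
    pts <- strsplit(ring, "\\],\\[")[[1]]             # split into pairs
    pts <- gsub("\\[|\\]", "", pts)                   # drop leftover brackets
    pts[1] == pts[length(pts)]
  }, logical(1), USE.NAMES = FALSE)
}

# Toy input: the second ring ends in "2,2.0", which differs as text from "2,2"
geom <- "[[[[0,0],[1,0],[1,1],[0,0]]],[[[2,2],[3,2],[3,3],[2,2.0]]]]"
closed <- rings_closed(geom)
print(closed)
```

A ring flagged FALSE here may still be numerically closed; the check only shows where the strings differ, which is the situation described above.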
There is certainly no existing function to read this. Use string handling
to convert to a format that can be read, most likely Well-Known Text. This
will involve quite advanced regular expression handling. The sample data
look like POLYGON objects (an exterior ring and interior rings), but they
might also be MULTIPOLYGON objects. From there, sf::st_as_sfc(). WKT does
not group coordinates as ((1,1),(2,2)), rather as (1 1, 2 2).
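To illustrate the difference in grouping, here is a small sketch of the MULTIPOLYGON reading in base R. It is an alternative to treating each row as a POLYGON, not a confirmed interpretation of the file. The four-level nesting resembles the coordinates array of a GeoJSON MultiPolygon, but nothing in this thread confirms that origin. to_multipolygon_wkt is a hypothetical helper, the toy string is made up, and the pattern assumes plain decimal numbers:

```r
# Hypothetical helper: rewrite "[[[[x,y],[x,y],...]]],...]" as MULTIPOLYGON WKT.
to_multipolygon_wkt <- function(geom) {
  # Step 1: turn every innermost [x,y] pair into "x y" (drops one bracket level)
  flat <- gsub("\\[(-?[0-9.]+),(-?[0-9.]+)\\]", "\\1 \\2", geom)
  # Step 2: the three remaining bracket levels become WKT parentheses
  paste0("MULTIPOLYGON", chartr("[]", "()", flat))
}

geom <- "[[[[0,0],[1,0],[1,1],[0,0]]],[[[2,2],[3,2],[3,3],[2,2]]]]"
wkt <- to_multipolygon_wkt(geom)
print(wkt)
# [1] "MULTIPOLYGON(((0 0,1 0,1 1,0 0)),((2 2,3 2,3 3,2 2)))"
```

The resulting string could then be passed to sf::st_as_sfc(). Whether the parts are really separate polygons or an exterior ring with holes still has to be judged from the data.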
It would still be very useful to know the provenance of the file in some
detail. It is not likely that anyone will help you write the regular
expression code to handle this data unless it can be generalised to a
common use case (you suggested QVD (QlikView)).
Roger
dput(head(Geog1))
structure(list(BrickCode = c("101;", "102;", "103;", "104;",
"105;", "106;"), X.Key_Brick_Geometry =
c("[[[[15.066294,54.986481],[15.08849170010846,54.98916685060565],[15.109724384490239,55.009579959623146],[15.120340726681128,55.014414643337815],[15.120340726681128,55.02945588156123],[15.112619750542299,55.03805087483176],[15.123236092733189,55.04503430686406],[15.125166336767895,55.05040617765814],[15.138678045010845,55.05792679676985],[15.140608289045552,55.06759616419919],[15.158945607375271,55.081025841184385],[15.158945607375271,55.089620834454905],[15.151224631236442,55.10036457604306],[15.151224631236442,55.11110831763122],[15.147364143167028,55.11540581426648],[15.158945607375271,55.12829830417227],[15.154119997288504,55.13420736204576],[15.138678045010845,55.144951103633915],[15.117445360629068,55.146025477792726],[15.078840479934925,55.156769219380884],[15.046026331344903,55.17288483176312],[15.017072670824295,55.18040545087483],[14.990049254338395,55.19759543741588],[14.984258522234274,55.20726480484522],[14.968816569956616,55.215859798115744],[14.939862909436009,55.212636675639295],[14.926351201193059,55.220157294751004],[14.889676564533623,55.22767791386271],[14.872304368221258,55.24164477792732],[14.835629731561822,55.25131414535666],[14.820187779284165,55.25990913862718],[14.810536559110629,55.276024751009416],[14.781582898590022,55.29214036339165],[14.779652654555314,55.29966098250336],[14.766140946312365,55.29966098250336],[14.750698994034707,55.291065989232834],[14.749733872017353,55.2679669448183],[14.733326797722343,55.24916539703903],[14.715954601409978,55.23519853297442],[14.703408015184381,55.212099488559886],[14.703408015184381,55.203504495289366],[14.699547527114968,55.1992069986541],[14.699547527114968,55.17557076716016],[14.703408015184381,55.17127327052489],[14.703408015184381,55.12829830417227],[14.699547527114968,55.12400080753701],[14.699547527114968,55.11325706594885],[14.6947219170282,55.10734800807536],[14.685070696854664,55.10412488559892],[14.685070696854664,55.09660426648721],[14.70823362527115,55.081563028263794],[14.721745333
5141,55.078339905787345],[14.746838505965293,55.06222429340511],[14.77000143438178,55.05362930013459],[14.79992021691974,55.05094336473755],[14.824048267353579,55.04288555854643],[14.880025344360087,55.03321619111709],[14.89739754067245,55.023546823687745],[14.959165349783081,55.00420808882907],[15.008386572668112,54.99507590847914],[15.043130965292843,54.9929271601615],[15.066294,54.986481]]],[[[15.182108999999999,55.323834],[15.19369,55.321686],[15.19369,55.319537],[15.185969,55.317387999999994],[15.182108999999999,55.319537],[15.182108999999999,55.323834]]],[[[15.182108999999999,55.323834],[15.174387999999999,55.323834],[15.174387999999999,55.325983],[15.182108999999999,55.325983],[15.182108999999999,55.323834]]]]",
"[[[[12.529952999999999,55.631105],[12.525127622017353,55.62519635262449],[12.507755425704989,55.61552698519515],[12.506790303687636,55.60102293405114],[12.52705786605206,55.57577514131897],[12.552151038503252,55.57255201884253],[12.565662746746202,55.55858515477792],[12.598476895336225,55.55643640646029],[12.631291043926247,55.57470076716016],[12.667965680585683,55.58222138627187],[12.679547144793926,55.58866763122476],[12.680512266811279,55.59457668909825],[12.673756412689805,55.602634495289365],[12.684372754880693,55.61606417227456],[12.684372754880693,55.62465916554508],[12.679547144793926,55.6327169717362],[12.66217494848156,55.63594009421265],[12.654453972342733,55.64238633916554],[12.64866324023861,55.65850195154778],[12.638046898047723,55.66763413189771],[12.649628362255964,55.676229125168234],[12.639012020065074,55.683212557200534],[12.637081776030367,55.69073317631224],[12.619709579718004,55.68858442799461],[12.610058359544468,55.678915060565274],[12.59461640726681,55.670320067294746],[12.577244210954445,55.669245693135935],[12.564698,55.661187999999996],[12.563732502711495,55.65527882907133],[12.552151038503252,55.64453508748317],[12.529952999999999,55.631105]]],[[[12.734558999999999,55.609618],[12.749035930043384,55.606931991924625],[12.777024468546637,55.59027919246299],[12.774129102494577,55.58866763122476],[12.75096617407809,55.595113876177656],[12.739384709869848,55.60156012113055],[12.734558999999999,55.609618]]],[[[12.792466,55.607468999999995],[12.762547638286334,55.60585761776581],[12.749035930043384,55.61767573351278],[12.739384709869848,55.618750107671595],[12.730698611713665,55.63755165545087],[12.743245197939261,55.6670969448183],[12.769303492407808,55.671931628532974],[12.779919834598697,55.66494819650067],[12.784745444685466,55.65689039030955],[12.784745444685466,55.62895666218034],[12.792466420824294,55.613915423956925],[12.792466,55.607468999999995]]]]",
"[[[[12.545395,55.684824],[12.564698,55.684824],[12.568558,55.689122],[12.552151038503252,55.70792316285329],[12.545395,55.708459999999995],[12.541535,55.706312],[12.541535,55.702014],[12.537673999999999,55.702014],[12.537673999999999,55.697717],[12.529952999999999,55.697717],[12.529952999999999,55.695567999999994],[12.537673999999999,55.695567999999994],[12.537673999999999,55.691269999999996],[12.541535,55.691269999999996],[12.541535,55.686972999999995],[12.545395,55.684824]]]]",
"[[[[12.510651,55.635403],[12.526093,55.635403],[12.529952999999999,55.631105],[12.552151038503252,55.64453508748317],[12.563732502711495,55.65527882907133],[12.564698,55.661187999999996],[12.556976648590021,55.66548538358008],[12.529952999999999,55.665485],[12.525127622017353,55.6531300807537],[12.511615913774403,55.65205570659488],[12.50293,55.641849],[12.506789999999999,55.6397],[12.506789999999999,55.635403],[12.510651,55.635403]]]]",
"[[[[12.50293,55.641849],[12.511615913774403,55.65205570659488],[12.525127622017353,55.6531300807537],[12.529952999999999,55.665485],[12.538639330260303,55.682138183041715],[12.545395,55.684824],[12.541535,55.686972999999995],[12.541535,55.691269999999996],[12.537673999999999,55.691269999999996],[12.537673999999999,55.695567999999994],[12.529952999999999,55.695567999999994],[12.503894937635573,55.68858442799461],[12.479766999999999,55.67408],[12.480732009219087,55.65742757738896],[12.490383229392624,55.65205570659488],[12.49231347342733,55.6466838358008],[12.50293,55.641849]]]]",
"[[[[12.510651,55.635403],[12.506789999999999,55.635403],[12.506789999999999,55.6397],[12.50293,55.641849],[12.49231347342733,55.6466838358008],[12.490383229392624,55.65205570659488],[12.480732009219087,55.65742757738896],[12.479766999999999,55.67408],[12.469150545010844,55.683212557200534],[12.464324934924077,55.69341911170928],[12.464324934924077,55.70201410497981],[12.453708592733188,55.70684878869448],[12.452743,55.714907],[12.43826664045553,55.71436940780619],[12.425720054229934,55.7084603499327],[12.391940783622559,55.70577441453566],[12.387115,55.699864999999996],[12.393871027657266,55.68965880215343],[12.372638343275487,55.68428693135935],[12.371673221258133,55.661187886944816],[12.36781273318872,55.65474164199192],[12.382289563449023,55.64346071332436],[12.406417999999999,55.613915],[12.432475908351408,55.61337823687752],[12.460464446854663,55.60102293405114],[12.494243717462037,55.60156012113055],[12.50292981561822,55.611766675639295],[12.499069327548806,55.62251041722745],[12.510651,55.635403]]]]"
)), row.names = c(NA, 6L), class = "data.frame")
On 5 Jul 2020, at 15:00, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Sat, 4 Jul 2020, Graham Leask wrote:
Dear List,
I have a postcode file containing geographical coordinates but this is
not in the format of a standard shape file. I list some information below;
Is the smoking gun: 'format.stata = "%9s"'? What generated the data -
was it for example read into R using foreign, haven, or some other function
or package for reading stata objects? What function in Stata generated that
file (if made in Stata)? Could you please provide the full context
including the lines of the Stata do file used? Some seem to use Stata for
mapping:
so we need to know what made the object, and whether it could have made
structure(list(Postcode = structure(c("101", "102", "103", "104",
"105", "106"), label = "Brick code", format.stata = "%9s"),
Postcode_geometry =
structure(c("[[[[15.066294,54.986481],[15.08849170010846,54.98916685060565],[15.109724384490239,55.009579959623146],[15.120340726681128,55.014414643337815],[15.120340726681128,55.02945588156123],[15.112619750542299,55.03805087483176],[15.123236092733189,55.04503430686406],[15.125166336767895,55.05040617765814],[15.138678045010845,55.05792679676985],[15.140608289045552,55.06759616419919],[15.158945607375271,55.081025841184385],[15.158945607375271,55.089620834454905],[15.151224631236442,55.10036457604306],[15.151224631236442,55.11110831763122],[15.147364143167028,55.11540581426648],[15.158945607375271,55.12829830417227],[15.154119997288504,55.13420736204576],[15.138678045010845,55.144951103633915],[15.117445360629068,55.146025477792726],[15.078840479934925,55.156769219380884],[15.046026331344903,55.17288483176312],[15.017072670824295,55.18040545087483],[14.990049254338395,55.19759543741588],[14.984258522234274,55.20726480484522],[14.968816569956616,55.215859798115744],[14.939862909436009,55.212636675639295],[14.926351201193059,55.220157294751004],[14.889676564533623,55.22767791386271],[14.872304368221258,55.24164477792732],[14.835629731561822,55.25131414535666],[14.820187779284165,55.25990913862718],[14.810536559110629,55.276024751009416],[14.781582898590022,55.29214036339165],[14.779652654555314,55.29966098250336],[14.766140946312365,55.29966098250336],[14.750698994034707,55.291065989232834],[14.749733872017353,55.2679669448183],[14.733326797722343,55.24916539703903],[14.715954601409978,55.23519853297442],[14.703408015184381,55.212099488559886],[14.703408015184381,55.203504495289366],[14.699547527114968,55.1992069986541],[14.699547527114968,55.17557076716016],[14.703408015184381,55.17127327052489],[14.703408015184381,55.12829830417227],[14.699547527114968,55.12400080753701],[14.699547527114968,55.11325706594885],[14.6947219170282,55.10734800807536],[14.685070696854664,55.10412488559892],[14.685070696854664,55.09660426648721],[14.70823362527115,55.081563028263794],[14
.7217453335141,55.078339905787345],[14.746838505965293,55.06222429340511],[14.77000143438178,55.05362930013459],[14.79992021691974,55.05094336473755],[14.824048267353579,55.04288555854643],[14.880025344360087,55.03321619111709],[14.89739754067245,55.023546823687745],[14.959165349783081,55.00420808882907],[15.008386572668112,54.99507590847914],[15.043130965292843,54.9929271601615],[15.066294,54.986481]]],[[[15.182108999999999,55.323834],[15.19369,55.321686],[15.19369,55.319537],[15.185969,55.317387999999994],[15.182108999999999,55.319537],[15.182108999999999,55.323834]]],[[[15.182108999999999,55.323834],[15.174387999999999,55.323834],[15.174387999999999,55.325983],[15.182108999999999,55.325983],[15.182108999999999,55.323834]]]]",
"[[[[12.529952999999999,55.631105],[12.525127622017353,55.62519635262449],[12.507755425704989,55.61552698519515],[12.506790303687636,55.60102293405114],[12.52705786605206,55.57577514131897],[12.552151038503252,55.57255201884253],[12.565662746746202,55.55858515477792],[12.598476895336225,55.55643640646029],[12.631291043926247,55.57470076716016],[12.667965680585683,55.58222138627187],[12.679547144793926,55.58866763122476],[12.680512266811279,55.59457668909825],[12.673756412689805,55.602634495289365],[12.684372754880693,55.61606417227456],[12.684372754880693,55.62465916554508],[12.679547144793926,55.6327169717362],[12.66217494848156,55.63594009421265],[12.654453972342733,55.64238633916554],[12.64866324023861,55.65850195154778],[12.638046898047723,55.66763413189771],[12.649628362255964,55.676229125168234],[12.639012020065074,55.683212557200534],[12.637081776030367,55.69073317631224],[12.619709579718004,55.68858442799461],[12.610058359544468,55.678915060565274],[12.59461640726681,55.670320067294746],[12.577244210954445,55.669245693135935],[12.564698,55.661187999999996],[12.563732502711495,55.65527882907133],[12.552151038503252,55.64453508748317],[12.529952999999999,55.631105]]],[[[12.734558999999999,55.609618],[12.749035930043384,55.606931991924625],[12.777024468546637,55.59027919246299],[12.774129102494577,55.58866763122476],[12.75096617407809,55.595113876177656],[12.739384709869848,55.60156012113055],[12.734558999999999,55.609618]]],[[[12.792466,55.607468999999995],[12.762547638286334,55.60585761776581],[12.749035930043384,55.61767573351278],[12.739384709869848,55.618750107671595],[12.730698611713665,55.63755165545087],[12.743245197939261,55.6670969448183],[12.769303492407808,55.671931628532974],[12.779919834598697,55.66494819650067],[12.784745444685466,55.65689039030955],[12.784745444685466,55.62895666218034],[12.792466420824294,55.613915423956925],[12.792466,55.607468999999995]]]]",
"[[[[12.545395,55.684824],[12.564698,55.684824],[12.568558,55.689122],[12.552151038503252,55.70792316285329],[12.545395,55.708459999999995],[12.541535,55.706312],[12.541535,55.702014],[12.537673999999999,55.702014],[12.537673999999999,55.697717],[12.529952999999999,55.697717],[12.529952999999999,55.695567999999994],[12.537673999999999,55.695567999999994],[12.537673999999999,55.691269999999996],[12.541535,55.691269999999996],[12.541535,55.686972999999995],[12.545395,55.684824]]]]",
"[[[[12.510651,55.635403],[12.526093,55.635403],[12.529952999999999,55.631105],[12.552151038503252,55.64453508748317],[12.563732502711495,55.65527882907133],[12.564698,55.661187999999996],[12.556976648590021,55.66548538358008],[12.529952999999999,55.665485],[12.525127622017353,55.6531300807537],[12.511615913774403,55.65205570659488],[12.50293,55.641849],[12.506789999999999,55.6397],[12.506789999999999,55.635403],[12.510651,55.635403]]]]",
"[[[[12.50293,55.641849],[12.511615913774403,55.65205570659488],[12.525127622017353,55.6531300807537],[12.529952999999999,55.665485],[12.538639330260303,55.682138183041715],[12.545395,55.684824],[12.541535,55.686972999999995],[12.541535,55.691269999999996],[12.537673999999999,55.691269999999996],[12.537673999999999,55.695567999999994],[12.529952999999999,55.695567999999994],[12.503894937635573,55.68858442799461],[12.479766999999999,55.67408],[12.480732009219087,55.65742757738896],[12.490383229392624,55.65205570659488],[12.49231347342733,55.6466838358008],[12.50293,55.641849]]]]",
"[[[[12.510651,55.635403],[12.506789999999999,55.635403],[12.506789999999999,55.6397],[12.50293,55.641849],[12.49231347342733,55.6466838358008],[12.490383229392624,55.65205570659488],[12.480732009219087,55.65742757738896],[12.479766999999999,55.67408],[12.469150545010844,55.683212557200534],[12.464324934924077,55.69341911170928],[12.464324934924077,55.70201410497981],[12.453708592733188,55.70684878869448],[12.452743,55.714907],[12.43826664045553,55.71436940780619],[12.425720054229934,55.7084603499327],[12.391940783622559,55.70577441453566],[12.387115,55.699864999999996],[12.393871027657266,55.68965880215343],[12.372638343275487,55.68428693135935],[12.371673221258133,55.661187886944816],[12.36781273318872,55.65474164199192],[12.382289563449023,55.64346071332436],[12.406417999999999,55.613915],[12.432475908351408,55.61337823687752],[12.460464446854663,55.60102293405114],[12.494243717462037,55.60156012113055],[12.50292981561822,55.611766675639295],[12.499069327548806,55.62251041722745],[12.510651,55.635403]]]]"
), label = "%Key_Brick_Geometry", format.stata = "%9s")), row.names =
-6L), class = c("tbl_df", "tbl", "data.frame"))
How can I map this file using R? I've tried using the sf package with
st_multipolygon and st_multilinestring without success.
Any help as to which package and appropriate commands to successfully
map this data using R will be appreciated.
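For context on why st_multipolygon does not accept the string directly: it expects a list of polygons, each a list of numeric coordinate matrices, not text. A rough sketch of building that structure in base R follows. parse_geometry is a hypothetical helper, the toy string is made up, and the code assumes (loudly) that every ring is the exterior ring of its own polygon and that coordinates are plain decimals:

```r
# Hypothetical helper: one geometry string -> list of polygons, where each
# polygon is a list holding a single two-column (x, y) matrix.
# Assumption: each ring is its own polygon, so holes are not recognised.
parse_geometry <- function(geom) {
  num <- "-?[0-9.]+"                                  # assumes plain decimals
  point <- paste0("\\[", num, ",", num, "\\]")
  ring_pattern <- paste0(point, "(,", point, ")*")
  rings <- regmatches(geom, gregexpr(ring_pattern, geom))[[1]]
  lapply(rings, function(ring) {
    values <- as.numeric(strsplit(gsub("\\[|\\]", "", ring), ",")[[1]])
    list(matrix(values, ncol = 2, byrow = TRUE))      # one row per point
  })
}

geom <- "[[[[0,0],[1,0],[1,1],[0,0]]],[[[2,2],[3,2],[3,3],[2,2]]]]"
polys <- parse_geometry(geom)
str(polys)

# Only runs if sf is installed; st_multipolygon takes the nested list as is.
if (requireNamespace("sf", quietly = TRUE)) {
  mp <- sf::st_multipolygon(polys)
  print(mp)
}
```

This is the matrix route rather than the Well-Known Text route; either way the strings have to be parsed first.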