Count occurrences less memory expensive than superimpose function in several spatial objects
Hi Alexandre,
As far as I can tell (mostly from reading the docs; I have no prior
experience of using multiplicity or superimpose myself), they are just
counting how many points share each x,y coordinate pair. So you can do this
with the group-by semantics of either tidyverse or SQL to generate the
res.xy data.frame. Below is an example of generating res.xy with data.table
instead (I'm not as familiar with tidyverse):
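library(data.table)
# Stack the coordinates of all ppp objects into one table
target_sub1 <- rbindlist(lapply(target, as.data.table))
# Count the points at each (x, y) location
res1 <- target_sub1[, .(res=.N), by=.(x,y)]
# Join the counts back so that every original point carries its count
res.xy1 <- res1[target_sub1, on=c("x","y")]
all.equal(res.xy, res.xy1, check.attributes=FALSE) # should return TRUE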
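If only the final thresholded coordinates are needed, the join back onto the raw points can probably be skipped, since the grouped table already has one row per distinct location. A minimal sketch on toy data (the `pts` table and the `threshold` value are made up for illustration; grouping matches coordinates that are exactly equal):

```r
library(data.table)

# Toy stand-in for the stacked coordinates of all samples (hypothetical data).
pts <- data.table(
  x = c(1, 1, 1, 2, 2, 3),
  y = c(5, 5, 5, 6, 6, 7)
)

# One row per distinct (x, y) pair, with the number of points at that location.
counts <- pts[, .(res = .N), by = .(x, y)]

# Keep only locations that occur more than `threshold` times.
threshold <- 2L
final_xy <- counts[res > threshold]
print(final_xy)
```

Because `counts` is already unique per location, no separate duplicate-removal step should be needed before writing the result out.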
If you're using SQL, you just join the raw table with the grouped table to
get the coordinates together with their occurrence counts. Considering the
number of coordinates you have, I recommend either data.table or SQL to
generate the final output.
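If even the stacked table is too large to hold at once, one possible variation is to aggregate file by file and keep only a running tally. The sketch below is an assumption on my part, not something tested on the real data: `count_xy` is a hypothetical helper, and the toy `files` list stands in for coordinate tables that would be read from each shapefile in turn.

```r
library(data.table)

# Hypothetical helper: count points per (x, y) in one table of coordinates.
count_xy <- function(dt) dt[, .(res = .N), by = .(x, y)]

# Toy stand-ins for the per-file coordinate tables (hypothetical data).
files <- list(
  data.table(x = c(1, 2, 2), y = c(5, 6, 6)),
  data.table(x = c(1, 3),    y = c(5, 7)),
  data.table(x = c(1, 2),    y = c(5, 6))
)

# Running tally: merge each file's counts into the total, then re-aggregate.
total <- NULL
for (dt in files) {
  part  <- count_xy(dt)
  total <- rbindlist(list(total, part))[, .(res = sum(res)), by = .(x, y)]
}
print(total)
```

With this layout, memory use should be driven by the number of distinct coordinates plus one file at a time, rather than by the total number of points across all files.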
HTH,
Vijay.
On Wed, Aug 19, 2020 at 4:22 PM ASANTOS via R-sig-Geo <
r-sig-geo at r-project.org> wrote:
Dear r-sig-geo Members,
I'd like to read several shapefiles, count the occurrences at each
coordinate, and create a final shapefile containing the coordinates whose
number of occurrences exceeds a threshold. I tried converting the shapefiles
to ppp objects (because part of my data set is in shapefiles and part in ppp
objects) and applying the superimpose function, without success. In my
synthetic example:
#Packages
library(spatstat)
library(dplyr)
library(sp)
library(rgdal)
library(raster)
#Point process example
data(ants)
ants.df<-as.data.frame(ants) #Convert to data frame
# Sample 75% of the original dataset, repeat this 9 times and create a
# shapefile in each loop
for(i in 1:9){
s.ants.df<-sample_frac(ants.df, 0.75)
s.ants<-ppp(x=s.ants.df[,1],y=s.ants.df[,2],window=ants$window) #Create new ppp object
sample.pts<-cbind(s.ants$x,s.ants$y)
pts.sampling = SpatialPoints(sample.pts)
UTMcoor.df <- SpatialPointsDataFrame(pts.sampling,
data.frame(id=1:length(pts.sampling)))
writeOGR(UTMcoor.df, ".",paste0('sample.shape',i), driver="ESRI Shapefile",overwrite=TRUE)
}
#Read all the 9 shapefiles created
all_shape <- list.files(pattern="\\.shp$", full.names=TRUE)
all_shape_list <- lapply(all_shape, shapefile)
#Convert shapefiles to spatstat ppp objects
target <- vector("list", length(all_shape_list))
for(i in 1:length(all_shape_list)){
target[[i]] <- ppp(x=all_shape_list[[i]]@coords[,1],
y=all_shape_list[[i]]@coords[,2],window=ants$window)}
#Join all ppp objects using multiplicity
target_sub<-do.call(superimpose,target)
res<-multiplicity(target_sub)
#Occurrences in the same coordinate > 5
res.xy<-data.frame(x=target_sub$x,y=target_sub$y,res=res)
res_F<-res.xy[res.xy$res>5,]
#Final shapefile
final.pts<-cbind(res_F[,1],res_F[,2])
pts.final = SpatialPoints(final.pts)
UTMcoor.df <- SpatialPointsDataFrame(pts.final,
data.frame(id=1:length(pts.final)))
UTMcoor.df2 <-remove.duplicates(UTMcoor.df)
writeOGR(UTMcoor.df2, ".", paste0('final.ants'), driver="ESRI Shapefile",overwrite=TRUE)
This approach works very well in this synthetic example!!! But in my
real data set I have 99 shapefiles with 10^7 coordinates, and when I
try to use the do.call(superimpose,target) function it exhausts my 32 GB
of RAM and crashes.
Please, any ideas on how I can create a new shapefile using the occurrence
criterion described above, but in a way that is less memory expensive than
superimposing all the objects created?
Thanks in advance,
Alexandre
--
Alexandre dos Santos
Geotechnologies and Spatial Statistics applied to Forest Entomology
Instituto Federal de Mato Grosso (IFMT) - Campus Caceres
Caixa Postal 244 (PO Box)
Avenida dos Ramires, s/n - Vila Real
Caceres - MT - CEP 78201-380 (ZIP code)
Phone: (+55) 65 99686-6970 / (+55) 65 3221-2674
Lattes CV: http://lattes.cnpq.br/1360403201088680
OrcID: orcid.org/0000-0001-8232-6722
ResearchGate: www.researchgate.net/profile/Alexandre_Santos10
Publons: https://publons.com/researcher/3085587/alexandre-dos-santos/
--
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Vijay Lulla, PhD ORCID <https://orcid.org/0000-0002-0823-2522> | Homepage <http://vlulla.github.io> | Google Scholar <https://scholar.google.com/citations?user=VjhJWOgAAAAJ&hl=en> | Github <https://github.com/vlulla>