
Aggregating points based on distance

Ha! That is a great take. Thanks Barry.

On 3/13/19, 11:34 AM, "R-sig-Geo on behalf of Barry Rowlingson" <r-sig-geo-bounces at r-project.org on behalf of b.rowlingson at gmail.com> wrote:
On Wed, Mar 13, 2019 at 6:14 PM Andy Bunn <bunna at wwu.edu> wrote:
> I would like to create averages of all the variables in a
    > SpatialPointsDataFrame when points are within a specified distance of each
    > other. I have a method for doing this but it seems like a silly way to
    > approach the problem. Any ideas for doing this using modern syntax
    > (especially of the tidy variety) would be appreciated.
    >
    >
    > To start, I have a SpatialPointsDataFrame with several variables measured
    > for each point. I'd like to get an average value for each variable for
    > points within a specified distance. E.g., getting average cadmium values
    > from the meuse data for points within 100 m of each other:
    >
    >     library(sf)
    >     library(sp)
    >     data(meuse)
    >     pts <- st_as_sf(meuse, coords = c("x", "y"), remove=FALSE)
    >     pts100 <- st_is_within_distance(pts, dist = 100)
    >     # can use sapply to get mean of a variable. E.g., cadmium
    >     sapply(pts100, function(x){ mean(pts$cadmium[x]) })
    >
    >
    If this is the method you call "silly" then I don't see anything silly at
    all here, only efficient well-written use of base R constructs. The problem
    with "modern" syntax is that it's subject to rapid change and often slower
    than using base R, which has had years to stabilise and optimise.
    
    If you want to iterate this over variables then nest your sapplys:
    
    items = c("cadmium", "copper","lead")
    sapply(items, function(item){
     sapply(pts100, function(x){ mean(pts[[item]][x]) })
    })
    
    gets you:
    
             cadmium    copper      lead
      [1,] 10.150000  83.00000 288.00000
      [2,] 10.150000  83.00000 288.00000
      [3,]  6.500000  68.00000 199.00000
      [4,]  2.600000  81.00000 116.00000
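
The nested sapply can also be collapsed into a single pass with colMeans, which averages every listed column for one neighbour set at a time. What follows is only a sketch under stated assumptions: the data frame is a small made-up stand-in for the meuse points, and the neighbour lists are built with base R's dist() instead of st_is_within_distance() so that it runs without sf or sp. The names nb and items are illustrative.

```r
# Toy stand-in for the meuse points (hypothetical values, not the real data).
pts <- data.frame(
  x       = c(0, 50, 500, 540, 2000),
  y       = c(0,  0,   0,   0,    0),
  cadmium = c(10, 12,   2,   4,    7),
  copper  = c(80, 90,  20,  30,   50)
)

# Neighbour lists: for each point, the indices of all points within 100 units.
# Each point counts as its own neighbour (distance 0), which matches the
# behaviour seen in the st_is_within_distance() output above.
d  <- as.matrix(dist(pts[, c("x", "y")]))
nb <- lapply(seq_len(nrow(d)), function(i) which(d[i, ] <= 100))

# One pass over the neighbour lists; colMeans averages all columns at once.
items <- c("x", "y", "cadmium", "copper")
res <- t(sapply(nb, function(i) colMeans(pts[i, items, drop = FALSE])))
res
```

With real data, pts100 from st_is_within_distance() should be usable in place of nb, provided the columns are taken from a plain data frame (for example via st_drop_geometry(pts)) so the geometry column is not passed to colMeans.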
    
    
    Barry
    
    
    > Above, I've figured out how to use sapply to do this variable by variable.
    > So I could, if I wanted, calculate the mean for each variable, generate a
    > centroid for each point and then a SpatialPointsDataFrame of the unique
    > values. E.g., for the first few variables:
    >
    >     res <- data.frame(id=1:length(pts100),
    >                       x=NA, y=NA,
    >                       cadmium=NA, copper=NA, lead=NA)
    >     res$x <- sapply(pts100, function(p){ mean(pts$x[p]) })
    >     res$y <- sapply(pts100, function(p){ mean(pts$y[p]) })
    >     res$cadmium <- sapply(pts100, function(p){ mean(pts$cadmium[p]) })
    >     res$copper <- sapply(pts100, function(p){ mean(pts$copper[p]) })
    >     res$lead <- sapply(pts100, function(p){ mean(pts$lead[p]) })
    >     res2 <- res[!duplicated(res$cadmium),]
    >     coordinates(res2) <- c("x","y")
    >     bubble(res2,"cadmium")
    >
    >
    > This works but seems cumbersome and like there must be a more efficient
    > way.
    >
    >
    > Thanks for any help, Andy
    >
    >
    >
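
If the goal is one averaged point per cluster rather than one row per original point, a different technique from the one quoted above is to assign each point to a disjoint group first and then aggregate once. The sketch below uses single-linkage clustering cut at 100 units, which groups points that are connected by a chain of hops of at most 100. This is an assumption about what "within a specified distance of each other" should mean, and it is not the same as the per-point neighbourhoods from st_is_within_distance(). The data are made-up toy values and the name grp is illustrative.

```r
# Toy stand-in for the meuse points (hypothetical values, not the real data).
pts <- data.frame(
  x       = c(0, 50, 500, 540, 2000),
  y       = c(0,  0,   0,   0,    0),
  cadmium = c(10, 12,   2,   4,    7),
  copper  = c(80, 90,  20,  30,   50)
)

# Single-linkage clusters cut at h = 100: two points share a group when they
# are linked by a chain of points with each hop no longer than 100 units.
grp <- cutree(hclust(dist(pts[, c("x", "y")]), method = "single"), h = 100)

# Average every column (coordinates included) within each group.
res <- aggregate(pts, by = list(group = grp), FUN = mean)
res
```

Because every point lands in exactly one group, there is no need to drop duplicated rows afterwards, and the averaged x and y serve as the group centroid.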
    > _______________________________________________
    > R-sig-Geo mailing list
    > R-sig-Geo at r-project.org
    > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
    >
    
    