Good morning,

I'm afraid I don't even know *exactly* what I'm looking for - apart from some guidance please!

I have about 1.5 million (x, y, value) triples - building location and sum insured - which for the most part are independent of each other. I'm sure there are *lots* of clusters but I've no idea how many, and I'm really only interested in looking at the clusters of highest value. I've already programmed a simple tagging of total value within 500 metres of every location. Not every building is accurately located, though: some are only geocoded to UK postcode, so all buildings in a postcode have the same coordinates.

I'm looking to highlight "clusters" (definition unclear!) where there are a number of points "close together" (definition unclear!) and the sum of all the values in the "cluster" is "high". I'm happy to ignore all "low" valued clusters, and points which are of low value and all on their own. There could be a maximum threshold distance (say 5 km), or gap between points, beyond which a point is definitely not part of a cluster. The algorithm doesn't have to perfectly identify all clusters - I'm quite happy to start by looking at a small set (say the top 10) of the highest valued "clusters".

I've looked at a variety of sources on the web, but it is my understanding that 1 million+ points is considered *very* big for most clustering algorithms. I've also only come across clustering by distance alone, rather than by sum of value and distance - I'm probably missing something or misinterpreting what I'm seeing! I think I'm looking for a modified form of density clustering. Clearly I can't create a full-size distance matrix, and perfection isn't expected! :-) A modified DBSCAN looks like it might be what I'm looking for?

Clearly an alternative to clustering is some sort of density algorithm that allows for value, but I can't quite get my head around how this might work.

Could someone point me in the right direction? What other keywords should I be looking out for, and what R packages are worth a look?

Thanks in advance,
Sean O'Riordain
Dublin, Ireland
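
To make the "total value within 500 metres" tagging concrete, here is a minimal sketch of one way it could be done without a full distance matrix, using a simple grid index. It is written in Python purely for illustration and is not necessarily how my own tagging works. The function names `value_within_radius` and `top_locations` are hypothetical, and it assumes the coordinates are already projected and in metres (e.g. eastings/northings), not latitude/longitude.

```python
from collections import defaultdict
from math import floor


def value_within_radius(points, radius=500.0):
    """For each (x, y, value) point, sum the values of all points within `radius`.

    Assumption: x and y are projected coordinates in metres.
    The point's own value is included in its total.
    """
    # Bucket point indices into square cells whose side equals the radius,
    # so any neighbour within `radius` lies in the 3x3 block of cells
    # around the point's own cell.
    grid = defaultdict(list)
    for i, (x, y, _) in enumerate(points):
        grid[(floor(x / radius), floor(y / radius))].append(i)

    r2 = radius * radius
    totals = []
    for x, y, _ in points:
        cx, cy = floor(x / radius), floor(y / radius)
        total = 0.0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    px, py, pv = points[j]
                    if (px - x) ** 2 + (py - y) ** 2 <= r2:
                        total += pv
        totals.append(total)
    return totals


def top_locations(points, totals, n=10):
    """Return the n locations with the highest neighbourhood totals."""
    order = sorted(range(len(points)), key=lambda i: totals[i], reverse=True)
    return [(points[i][0], points[i][1], totals[i]) for i in order[:n]]
```

Buildings that share a postcode centroid simply land in the same cell at distance zero, so their values are added together. Note that the top few locations found this way will usually all sit in the same neighbourhood, which is part of why I think I need a proper clustering step rather than just this tagging.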
spatial clustering taking account of "value"