Near function?

5 messages · Bart Joosen, Dieter Menne, jim holtman +1 more

Bart Joosen <bartjoosen <at> hotmail.com> writes:
another column of the dataframe.
others by their value. For example:
....
Why not? It's extremely efficient for large series, and it is the only method
that would work with large arrays. The idea: keep the indexes of the sort order,
mark the "near others" (for example, by making their index NA), and restore the
original order. No for-loop needed.
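
A minimal sketch of that idea, under some assumptions: the helper name
drop_near and the argument thresh are made up for illustration, and "near" is
taken to mean "within thresh of the next-smaller value in sorted order". With
that definition a long chain of closely spaced values collapses to its
smallest member, which may or may not be what is wanted.

```r
# Hypothetical helper (not from the thread): drop values that are "near"
# their predecessor in sorted order, without an explicit for-loop.
drop_near <- function(x, thresh) {
  ord <- order(x)                           # keep the indexes of the sort order
  near <- c(FALSE, diff(x[ord]) <= thresh)  # TRUE if close to the sorted predecessor
  ord[near] <- NA                           # mark the "near others" via their index
  keep <- sort(ord[!is.na(ord)])            # surviving indexes, back in original order
  x[keep]
}

dat <- c(1, 20, 2, 21)
kept <- drop_near(dat, thresh = 2)
print(kept)
```

For a data frame, the same keep indexes could be used to subset rows instead
of a plain vector.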

Dieter
1 day later
Dear Bart,

"hclust" might be useful for this as well:

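   dat = c(1,20,2,21)

   # hierarchical clustering (default: complete linkage) on pairwise distances
   hc = hclust(dist(dat))

   # cut the tree at height thresh: values within a cluster differ by at most thresh
   thresh = 2
   ct = cutree(hc, h=thresh)

   clusteredNumbers = split(dat, ct)   # values grouped by cluster
   firstOne = dat[!duplicated(ct)]     # first value encountered in each cluster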

   > clusteredNumbers
   $`1`
   [1] 1 2

   $`2`
   [1] 20 21

   > firstOne
   [1]  1 20


  Best wishes
   Wolfgang