Near function?

Bart Joosen <bartjoosen <at> hotmail.com> writes:
> Hi,
>
> I have an integer which is extracted from a dataframe, which is sorted by
> another column of the dataframe.
> Now I would like to remove some elements of the integer, which are near to
> others by their value. For example:
> integer: c(1,20,2,21) should be c(1,20).
> ....
> Sorting the integer is not an option, the order is important.

Why not? Sorting is extremely efficient for long series, and it is the only
method that would work with large arrays. The idea: keep the indexes of the
sort order, mark the "near others" (for example by setting their index to NA),
and restore the original order. No for loop is needed.
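
A minimal sketch of this idea, under stated assumptions: the function name
drop_near_sorted is hypothetical, and "near" is taken to mean that the gap to
the next smaller value is below th. Chains of close values therefore collapse
into a single group, which can differ from a strictly pairwise rule. From each
group, the element that comes first in the original order is kept.

```r
# Hypothetical helper (not from the thread): sort, mark, restore order.
drop_near_sorted <- function(x, th) {
  if (length(x) < 2) return(x)
  ord <- order(x)                              # indexes that would sort x
  # A new group starts wherever the gap to the previous sorted value is >= th.
  grp <- cumsum(c(TRUE, diff(x[ord]) >= th))
  # Mark the "near others": every index except the earliest one in its group.
  ord[ord != ave(ord, grp, FUN = min)] <- NA
  # sort() drops NA by default, so this restores the original order.
  x[sort(ord)]
}

drop_near_sorted(c(1, 20, 2, 21), 2)   # 1 20
```

With x <- 1:20 and th = 10 this returns only 1, because every value is within
th of its sorted neighbour and the whole series forms one chain.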

Dieter
Dear Bart,

"hclust" might be useful for this as well:

   dat = c(1,20,2,21)

   hc = hclust(dist(dat))

   thresh = 2
   ct = cutree(hc, h=thresh)

   clusteredNumbers = split(dat, ct)
   firstOne = dat[!duplicated(ct)]

   > clusteredNumbers
   $`1`
   [1] 1 2

   $`2`
   [1] 20 21

   > firstOne
   [1]  1 20

  Best wishes
   Wolfgang
I have an integer vector extracted from a data frame that is sorted by another column.
Now I would like to remove the elements of the vector that are close in value to others. For example, c(1,20,2,21) should become c(1,20).

I tried to write a function, but for some reason it doesn't work:

x <- 1:20
near <- function(x, th) {
    nr <- NROW(x)
    for (i in 1:(nr-1)) {
        for (j in (i+1):nr) {
            if (j > nr) break
            t = 0
            if (abs(x[i] - x[j]) < th) t = 1
            if (t == 1) x <- x[-j]
            if (t == 1) nr <- nr-1
            if (t == 1) j <- (j-1)
            cat(" i", i, " j", j, "\n")
        }
    }
    x
}
near(x, 10)

This gives 1  3  7 13 17, while I was expecting 1, 20 as the outcome.
If you look at the intermediate output of the cat call, you can see that after an element is removed, the next one is skipped.

Sorting the vector is not an option; the order is important.
I used 1:20 as an example. x <- sample((1:20),20) is maybe a bit more representative of our data, but then the output of the function isn't reproducible.

Is there already an R function that does this, or what is wrong with my code?

Thanks a lot for your time

Bart
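
On the skipped elements: in R, a for loop iterates over a sequence that is
evaluated once when the loop starts, so assigning to j inside the body has no
effect on the next iteration. After x[-j] removes an element, the following
value slides into position j, but the loop still moves on to j + 1. A minimal
sketch of one way around this, using while loops so that the index only
advances when nothing was removed (the name near_fixed is hypothetical, and
the rule "drop every later element within th of a kept one" is an assumption
read off the code above):

```r
# Hypothetical corrected version of near(); not from the original thread.
near_fixed <- function(x, th) {
  i <- 1
  while (i < length(x)) {
    j <- i + 1
    while (j <= length(x)) {
      if (abs(x[i] - x[j]) < th) {
        x <- x[-j]    # the next element slides into position j: do not advance
      } else {
        j <- j + 1    # nothing removed: move to the next candidate
      }
    }
    i <- i + 1
  }
  x
}

near_fixed(1:20, 10)             # 1 11
near_fixed(c(1, 20, 2, 21), 2)   # 1 20
```

With the strict comparison < th, 1:20 and th = 10 give 1 and 11 rather than
1 and 20, because 11 differs from 1 by exactly th and is therefore kept.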

______________________________________________
R-help a stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber