find closest value in a vector based on another vector values

9 messages · Andras Farkas, arun, Jorge I Velez +2 more

#
Dear All,

would you please provide your thoughts on the following:
let us say I have:

a <-c(1,5,8,15,32,69)
b <-c(8.5,33)

and I would like to extract from "a" the two values that are closest to the values in "b", where the lengths of these vectors may change but "b" will always be shorter than "a". So at the end, based on this example, I should have the result "f" as

f <-c(8,32)

appreciate the help,

Andras
#
Hi,
Perhaps this works:

a[sapply(b,function(x) which.min(abs(x-a)))]
#[1]  8 32
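
To make the one-liner above easier to follow, here is a small sketch that splits it into steps on the same data. The name idx is only an illustrative intermediate variable, not part of the original reply.

```r
a <- c(1, 5, 8, 15, 32, 69)
b <- c(8.5, 33)

# For one target x, abs(x - a) is the distance from x to every element of a;
# which.min() returns the position of the smallest distance
# (the first such position if there is a tie).
idx <- sapply(b, function(x) which.min(abs(x - a)))
idx      # positions in a: 3 5
a[idx]   # values: 8 32
```

Note that this compares every element of b with every element of a, so the work grows with length(a) * length(b).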

A.K.

----- Original Message -----
From: Andras Farkas <motyocska at yahoo.com>
To: r-help at r-project.org
Cc: 
Sent: Tuesday, June 18, 2013 8:34 AM
Subject: [R] find closest value in a vector based on another vector values

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Jorge: No.
[1]  8 32  ##should be  8   33.5

I believe it has to be done explicitly by finding all the differences
and choosing those n with minimum values, depending on what n you
want.

Note that the problem is incompletely specified. What if the same
value of a is closest to several values of b? -- do you want all the
values you choose to be different or not, in which case they may not
be minimum?

a <- c(1, 8, 9)
b <- c(2,3)

Then what are the 2 closest values of a to b?
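
One way to look at this example is to compute all the differences explicitly, as suggested above. The sketch below does that with outer() and then shows one possible reading of "all values different": a greedy rule that matches the smallest remaining distance first. The helper nearest_distinct is hypothetical, it is only one of several reasonable rules, and it assumes length(b) <= length(a).

```r
a <- c(1, 8, 9)
b <- c(2, 3)

# All pairwise distances: rows correspond to b, columns to a.
d <- abs(outer(b, a, "-"))
d
#      [,1] [,2] [,3]
# [1,]    1    6    7
# [2,]    2    5    6

# Nearest element of a for each b, repeats allowed: both pick a[1].
nearest <- a[apply(d, 1, which.min)]   # 1 1

# Hypothetical helper: one distinct element of a per element of b,
# always matching the smallest remaining distance first (greedy).
nearest_distinct <- function(a, b) {
  d <- abs(outer(b, a, "-"))
  out <- rep(NA_real_, length(b))
  for (k in seq_along(b)) {
    pos <- which(d == min(d), arr.ind = TRUE)[1, ]  # smallest remaining distance
    out[pos[["row"]]] <- a[pos[["col"]]]
    d[pos[["row"]], ] <- Inf   # this element of b is now matched
    d[, pos[["col"]]] <- Inf   # this element of a is now used up
  }
  out
}
nearest_distinct(a, b)   # 1 8
```

With repeats allowed the answer is 1 for both targets; with the greedy distinct rule the second target falls back to 8, which is not its nearest value. Which of the two is wanted is exactly the missing part of the specification.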

-- Bert
On Tue, Jun 18, 2013 at 5:43 AM, Jorge I Velez <jorgeivanvelez at gmail.com> wrote:

  
    
#
Bert,

Thanks. The values should not repeat themselves if the same a is closest to all b, so probably arun's example extended with a unique() call works best?

unique(a[sapply(b,function(x) which.min(abs(x-a)))])

thanks,

Andras
--- On Tue, 6/18/13, Bert Gunter <gunter.berton at gene.com> wrote:

            
#
Andras:

No.
Using the a = c(1,8,9) and b = 2:3 that **I posted before**, you get
the single unique value of 1.

Please stop guessing, think carefully about what you want to do, and
**test** your code.
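
A quick check along these lines, using the proposed unique() version on both examples from this thread (the variable names res, a0 and b0 are just for illustration):

```r
a <- c(1, 8, 9)
b <- 2:3

res <- unique(a[sapply(b, function(x) which.min(abs(x - a)))])
res           # 1
length(res)   # 1, although length(b) is 2

# The original example happens to give distinct matches, so it hides the issue.
a0 <- c(1, 5, 8, 15, 32, 69)
b0 <- c(8.5, 33)
unique(a0[sapply(b0, function(x) which.min(abs(x - a0)))])   # 8 32
```

So the result can be shorter than b whenever two targets share the same nearest value.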

-- Bert
On Tue, Jun 18, 2013 at 7:41 AM, Andras Farkas <motyocska at yahoo.com> wrote:

  
    
#
Bert,

I guess I could have been a little more authoritative: the code 
unique(a[sapply(b,function(x) which.min(abs(x-a)))]) is exactly what I need. Thanks for the input, your comments helped us make the code better,

Andras
--- On Tue, 6/18/13, Bert Gunter <gunter.berton at gene.com> wrote:

            
#
That method could be written as the following function
f0 <- function (a, b, unique = TRUE) 
{
    ret <- a[sapply(b, function(x) which.min(abs(x - a)))]
    if (unique) { 
        ret <- unique(ret)
    }
    ret
}

If 'a' is in sorted order then I think the following, based on findInterval,
does the same thing in less time, especially when 'b' is longish.

f1 <- function (a, b, unique = TRUE) 
{
    leftI <- findInterval(b, a)
    rightI <- leftI + 1
    leftI[leftI == 0] <- 1
    rightI[rightI > length(a)] <- length(a)
    # '<=' (not '<') so that an exact tie goes to the left neighbour, as which.min() does in f0
    ret <- ifelse(abs(b - a[leftI]) <= abs(b - a[rightI]), a[leftI], a[rightI])
    if (unique) { 
        ret <- unique(ret)
    }
    ret
}

E.g.,

R> a <- sort(rnorm(1e6))
R> b <- sort(rnorm(1000))
R> system.time(r0 <- f0(a, b))
   user  system elapsed 
   4.88    3.48    8.36 
R> system.time(r1 <- f1(a, b))
   user  system elapsed 
      0       0       0 
R> identical(r0, r1)
[1] TRUE
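
To see why the findInterval approach only needs to look at two candidates per target, here is a short walk-through on the original example from this thread. It is a sketch for illustration; leftI and rightI mirror the names used in f1.

```r
a <- c(1, 5, 8, 15, 32, 69)   # must be sorted in increasing order
b <- c(8.5, 33)

# findInterval(b, a) gives, for each b, the largest i with a[i] <= b.
leftI  <- findInterval(b, a)  # 3 5: a[3] <= 8.5 < a[4] and a[5] <= 33 < a[6]
rightI <- leftI + 1           # 4 6
a[leftI]                      # 8 32   (closest value at or below each b)
a[rightI]                     # 15 69  (closest value above each b)

# Targets outside the range of a give 0 (below a[1]) or length(a)
# (at or above the last element), which is why f1 clamps both indices.
findInterval(c(0, 100), a)    # 0 6
```

The nearest value must be one of those two neighbours, so each target costs a binary search in 'a' plus one comparison instead of a full scan of 'a'.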

If 'a' might be unsorted then add
    if (is.unsorted(a))  a <- sort(a)
at the beginning.  If the output must be in the same order as the original
'a' then use order(a) and subscript 'a' and 'ret' with its output.
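
As a rough sketch of the unsorted case, the function below sorts 'a' first when needed and then applies the same two-neighbour comparison. The name nearest_any_order is hypothetical, it leaves out the unique() step, and it assumes 'a' has no NA values.

```r
nearest_any_order <- function(a, b) {
  if (is.unsorted(a)) a <- sort(a)      # findInterval needs increasing a
  fi    <- findInterval(b, a)
  left  <- pmax(fi, 1)                  # clamp 0 up to the first element
  right <- pmin(fi + 1, length(a))      # clamp past-the-end to the last element
  ifelse(abs(b - a[left]) <= abs(b - a[right]), a[left], a[right])
}

a <- c(32, 1, 69, 8, 15, 5)             # same values as the original a, shuffled
b <- c(8.5, 33)
nearest_any_order(a, b)                 # 8 32

# Cross-check against the brute-force version on the unsorted input.
brute <- a[sapply(b, function(x) which.min(abs(x - a)))]
identical(nearest_any_order(a, b), brute)   # TRUE
```

When a target is exactly halfway between two elements of 'a', the two approaches can disagree on an unsorted input: which.min() takes whichever tied element comes first in the original order, while the sorted version takes the smaller one.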

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com