find closest value in a vector based on another vector values

> I guess I could have been a little more authoritative: the code
> unique(a[sapply(b,function(x) which.min(abs(x-a)))]) is exactly what I need.

That method could be written as the following function
f0 <- function (a, b, unique = TRUE) 
{
    ret <- a[sapply(b, function(x) which.min(abs(x - a)))]
    if (unique) { 
        ret <- unique(ret)
    }
    ret
}
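
For instance, here is a small sketch running that function on the two examples posted earlier in this thread. f0 is repeated only so the snippet runs on its own; the results in the comments follow from the distances involved.

```r
f0 <- function (a, b, unique = TRUE) 
{
    ret <- a[sapply(b, function(x) which.min(abs(x - a)))]
    if (unique) ret <- unique(ret)
    ret
}

# Andras's original data: the nearest 'a' for each element of 'b'
r_orig <- f0(c(1, 5, 8, 15, 32, 69), c(8.5, 33))
r_orig
# [1]  8 32

# Bert's example: 1 is the nearest value to both 2 and 3,
# so unique = TRUE collapses the repeat to a single value
r_uniq <- f0(c(1, 8, 9), 2:3)
r_uniq
# [1] 1
r_all <- f0(c(1, 8, 9), 2:3, unique = FALSE)
r_all
# [1] 1 1
```

Whether a single value is the right answer for the second case depends on the point Bert raised: the chosen values need not be distinct.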

If 'a' is in sorted order then I think the following, based on findInterval,
does the same thing in less time, especially when 'b' is longish.

f1 <- function (a, b, unique = TRUE) 
{
    leftI <- findInterval(b, a)
    rightI <- leftI + 1
    leftI[leftI == 0] <- 1
    rightI[rightI > length(a)] <- length(a)
    # '<=' so that ties go to the left neighbour, matching which.min() in f0
    ret <- ifelse(abs(b - a[leftI]) <= abs(b - a[rightI]), a[leftI],  a[rightI])
    if (unique) { 
        ret <- unique(ret)
    }
    ret
}

E.g.,

R> a <- sort(rnorm(1e6))
R> b <- sort(rnorm(1000))
R> system.time(r0 <- f0(a, b))
   user  system elapsed 
   4.88    3.48    8.36 
R> system.time(r1 <- f1(a, b))
   user  system elapsed 
      0       0       0 
R> identical(r0, r1)
[1] TRUE
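
To see why both neighbours have to be compared, here is a sketch that traces the intermediate values on Bert's seven-element example from this thread. The variable names follow f1; pmin/pmax are used for the clamping only to keep it short, and useLeft is a name introduced here for illustration.

```r
a <- c(1, 5, 8, 15, 32, 33.5, 69)
b <- c(8.5, 33)

leftI  <- findInterval(b, a)              # index of the largest a[i] <= b: 3 5
rightI <- pmin(leftI + 1L, length(a))     # next index up, clamped at the end: 4 6
leftI  <- pmax(leftI, 1L)                 # clamp for b below a[1]

a[leftI]     # left neighbours:  8 and 32
a[rightI]    # right neighbours: 15 and 33.5

# keep whichever neighbour is nearer; ties go to the left one
useLeft <- abs(b - a[leftI]) <= abs(b - a[rightI])
nearest <- ifelse(useLeft, a[leftI], a[rightI])
nearest      # 8 and 33.5, whereas a[findInterval(b, a)] alone gives 8 and 32
```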

If 'a' might be unsorted then add
    if (is.unsorted(a))  a <- sort(a)
at the beginning.  If the output must be in the same order as the original
'a' then use order(a) and subscript 'a' and 'ret' with its output.
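
One possible reading of the order(a) remark, as a rough sketch only: work on the sorted copy, then map the positions back to the original 'a'. The helper name nearestPos and the local name 'sa' are made up for this illustration and are not from the thread.

```r
# Hypothetical helper: positions in the original 'a' of the value nearest each 'b'
nearestPos <- function (a, b) 
{
    o  <- order(a)                          # a[o] is sorted
    sa <- a[o]
    leftI  <- findInterval(b, sa)
    rightI <- pmin(leftI + 1L, length(sa))  # clamp at the upper end
    leftI  <- pmax(leftI, 1L)               # clamp at the lower end
    useLeft <- abs(b - sa[leftI]) <= abs(b - sa[rightI])
    o[ifelse(useLeft, leftI, rightI)]       # sorted positions -> positions in 'a'
}

a <- c(32, 1, 69, 8, 15, 5)                 # unsorted version of the original data
b <- c(8.5, 33)
i <- nearestPos(a, b)
a[i]
# [1]  8 32
a[sort(unique(i))]                          # same values, in the order they occur in 'a'
# [1] 32  8
```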

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Andras Farkas
Sent: Tuesday, June 18, 2013 10:24 AM
To: Bert Gunter
Cc: R mailing list
Subject: Re: [R] find closest value in a vector based on another vector values

Bert,

I guess I could have been a little more authoritative: the code
unique(a[sapply(b,function(x) which.min(abs(x-a)))]) is exactly what I need. Thanks for the
input, your comments helped us make the code better,

Andras

--- On Tue, 6/18/13, Bert Gunter <gunter.berton at gene.com> wrote:

From: Bert Gunter <gunter.berton at gene.com>
Subject: Re: [R] find closest value in a vector based on another vector values
To: "Andras Farkas" <motyocska at yahoo.com>
Cc: "Jorge I Velez" <jorgeivanvelez at gmail.com>, "R mailing list" <r-help at r-project.org>
Date: Tuesday, June 18, 2013, 10:55 AM
Andras:

No.
Using the a = c(1,8,9) and b = 2:3 that **I posted before**, you get
the single unique value of 1.

Please stop guessing, think carefully about what you want to do, and
**test** your code.

-- Bert

On Tue, Jun 18, 2013 at 7:41 AM, Andras Farkas <motyocska at yahoo.com> wrote:
Bert,

thanks... The values should not repeat themselves if the same a is
closest to all b, so probably arun's example extended with a unique
command works best?
unique(a[sapply(b,function(x) which.min(abs(x-a)))])

thanks,

Andras

--- On Tue, 6/18/13, Bert Gunter <gunter.berton at gene.com> wrote:

From: Bert Gunter <gunter.berton at gene.com>
Subject: Re: [R] find closest value in a vector based on another vector values
To: "Jorge I Velez" <jorgeivanvelez at gmail.com>
Cc: "Andras Farkas" <motyocska at yahoo.com>, "R mailing list" <r-help at r-project.org>
Date: Tuesday, June 18, 2013, 10:07 AM
Jorge: No.

a <-c(1,5,8,15,32,33.5,69)
b <-c(8.5,33)
a[findInterval(b, a)]
[1]  8 32  ## should be 8 33.5

I believe it has to be done explicitly by finding all the differences
and choosing those n with minimum values, depending on what n you
want.

Note that the problem is incompletely specified. What if the same
value of a is closest to several values of b? -- do you want all the
values you choose to be different or not, in which case they may not
be minimum?

a <- c(1, 8, 9)
b <- c(2,3)

Then what are the 2 closest values of a to b?

-- Bert

On Tue, Jun 18, 2013 at 5:43 AM, Jorge I Velez <jorgeivanvelez at gmail.com> wrote:
Dear Andras,

Try

a[findInterval(b, a)]
[1]  8 32

HTH,
Jorge.-

On Tue, Jun 18, 2013 at 10:34 PM, Andras Farkas <motyocska at yahoo.com> wrote:

Dear All,

would you please provide your thoughts on the following:
let us say I have:

a <-c(1,5,8,15,32,69)
b <-c(8.5,33)

and I would like to extract from "a" the two values that are closest to
the values in "b", where the length of these vectors may change but "b" will
always be shorter than "a". So at the end, based on this example, I should
have the result "f" as

f <-c(8,32)

appreciate the help,

Andras

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
