find closest value in a vector based on another vector values
> I guess I could have been a little more authoritative: the code unique(a[sapply(b,function(x) which.min(abs(x-a)))]) is exactly what I need.
That method could be written as the following function:

    f0 <- function (a, b, unique = TRUE)
    {
        ret <- a[sapply(b, function(x) which.min(abs(x - a)))]
        if (unique) {
            ret <- unique(ret)
        }
        ret
    }
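As a quick check, here is f0 applied to the two small examples that come up in the quoted thread further down. The vectors are copied from those messages; the names ex1, ex2_unique and ex2_all are only for this illustration.

```r
# Same definition as f0 above, repeated so this snippet runs on its own.
f0 <- function(a, b, unique = TRUE) {
    ret <- a[sapply(b, function(x) which.min(abs(x - a)))]
    if (unique) {
        ret <- unique(ret)
    }
    ret
}

# Bert's first example: 33 is closer to 33.5 than to 32.
ex1 <- f0(c(1, 5, 8, 15, 32, 33.5, 69), c(8.5, 33))   # 8.0 33.5

# Bert's second example: both elements of b are closest to a[1].
ex2_unique <- f0(c(1, 8, 9), 2:3)                     # 1
ex2_all    <- f0(c(1, 8, 9), 2:3, unique = FALSE)     # 1 1
```

With unique = TRUE the result can be shorter than 'b', which is the point Bert raises about the problem being incompletely specified.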
If 'a' is in sorted order then I think the following, based on findInterval,
does the same thing in less time, especially when 'b' is longish.
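    f1 <- function (a, b, unique = TRUE)
    {
        # 'a' must be sorted in increasing order for findInterval
        leftI <- findInterval(b, a)      # largest i with a[i] <= b, or 0 if none
        rightI <- leftI + 1
        leftI[leftI == 0] <- 1                    # b is below a[1]
        rightI[rightI > length(a)] <- length(a)   # b is at or above the last a
        # '<=' sends an exact tie to the left neighbour, matching which.min in f0
        ret <- ifelse(abs(b - a[leftI]) <= abs(b - a[rightI]), a[leftI], a[rightI])
        if (unique) {
            ret <- unique(ret)
        }
        ret
    }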
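To see why the two index adjustments are there, the following sketch traces the intermediate values for a small sorted 'a' and a 'b' with elements below, inside and above the range of 'a'. The 'b' values are made up for illustration, and raw, leftI, rightI and nearest are just local names.

```r
a <- c(1, 5, 8, 15, 32, 33.5, 69)   # must already be sorted
b <- c(-3, 8.5, 33, 100)            # illustrative values only

raw <- findInterval(b, a)           # 0 3 5 7: index of largest a[i] <= b, 0 if none
leftI <- raw
rightI <- raw + 1                   # 1 4 6 8: neighbour on the other side

leftI[leftI == 0] <- 1                    # -3 has no left neighbour, use a[1]
rightI[rightI > length(a)] <- length(a)   # 100 has no right neighbour, use a[7]

# Pick whichever neighbour is closer; on an exact tie take the left one.
nearest <- ifelse(abs(b - a[leftI]) <= abs(b - a[rightI]), a[leftI], a[rightI])
# nearest is 1, 8, 33.5, 69
```

Only two candidates per element of 'b' are ever compared, which is where the saving over the full abs(x - a) scan comes from.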
E.g.,

    R> a <- sort(rnorm(1e6))
    R> b <- sort(rnorm(1000))
    R> system.time(r0 <- f0(a, b))
       user  system elapsed
       4.88    3.48    8.36
    R> system.time(r1 <- f1(a, b))
       user  system elapsed
          0       0       0
    R> identical(r0, r1)
    [1] TRUE
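Random normals essentially never produce exact ties or values that coincide with an element of 'a', so a second check on small integer data may be worth having. This is only a sketch: f1_ties is a hypothetical name for the findInterval approach with '<=' in the comparison, so that an exact tie resolves to the left neighbour the way which.min does, and the 'unique' step is left out to keep it short.

```r
# Brute-force reference: scan all of 'a' for every element of 'b'.
f0 <- function(a, b) a[sapply(b, function(x) which.min(abs(x - a)))]

# findInterval version; 'a' must be sorted. (Hypothetical name.)
f1_ties <- function(a, b) {
    leftI <- findInterval(b, a)
    rightI <- leftI + 1
    leftI[leftI == 0] <- 1
    rightI[rightI > length(a)] <- length(a)
    ifelse(abs(b - a[leftI]) <= abs(b - a[rightI]), a[leftI], a[rightI])
}

set.seed(1)
a <- sort(sample(seq(0, 100, by = 2), 20))   # 20 distinct even numbers, sorted
b <- sample(-5:105, 50, replace = TRUE)      # includes ties and out-of-range values

same <- identical(f0(a, b), f1_ties(a, b))
```

Here 'b' contains values below and above the range of 'a' as well as odd numbers that can sit exactly halfway between two neighbours, so the clamping and the tie rule both get exercised.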
If 'a' might be unsorted then add
    if (is.unsorted(a)) a <- sort(a)
at the beginning. If the output must be in the same order as the original
'a' then use order(a) and subscript 'a' and 'ret' with its output.
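One possible reading of the order(a) suggestion, sketched under my own assumptions: sort a copy of 'a' via order(a), do the findInterval lookup on the sorted copy, and use the same permutation to map the hits back to positions in the original 'a'. The function name nearest_unsorted and the list it returns are hypothetical, not something from the thread.

```r
nearest_unsorted <- function(a, b) {
    o <- order(a)      # o[i] is the position in 'a' of the i-th smallest value
    s <- a[o]          # sorted copy used for the lookup
    leftI <- findInterval(b, s)
    rightI <- leftI + 1
    leftI[leftI == 0] <- 1
    rightI[rightI > length(s)] <- length(s)
    # Index into the sorted copy of the closer neighbour (left one on a tie).
    pick <- ifelse(abs(b - s[leftI]) <= abs(b - s[rightI]), leftI, rightI)
    idx <- o[pick]     # translate back to positions in the original 'a'
    list(value = a[idx], index = idx)
}

a <- c(32, 1, 69, 8, 15, 5, 33.5)   # the thread's values, shuffled
b <- c(8.5, 33)
res <- nearest_unsorted(a, b)
# res$value is 8 and 33.5; res$index is 4 and 7
```

Returning the positions as well as the values leaves it to the caller to decide whether to apply unique() or to reorder the result.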
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Andras Farkas
Sent: Tuesday, June 18, 2013 10:24 AM
To: Bert Gunter
Cc: R mailing list
Subject: Re: [R] find closest value in a vector based on another vector values

Bert,

I guess I could have been a little more authoritative: the code unique(a[sapply(b,function(x) which.min(abs(x-a)))]) is exactly what I need. Thanks for the input, your comments helped us make the code better,

Andras

--- On Tue, 6/18/13, Bert Gunter <gunter.berton at gene.com> wrote:
From: Bert Gunter <gunter.berton at gene.com>
Subject: Re: [R] find closest value in a vector based on another vector values
To: "Andras Farkas" <motyocska at yahoo.com>
Cc: "Jorge I Velez" <jorgeivanvelez at gmail.com>, "R mailing list" <r-help at r-project.org>
Date: Tuesday, June 18, 2013, 10:55 AM

Andras:

No. Using the a = c(1,8,9) and b = 2:3 that **I posted before**, you get the single unique value of 1. Please stop guessing, think carefully about what you want to do, and **test** your code.

-- Bert

On Tue, Jun 18, 2013 at 7:41 AM, Andras Farkas <motyocska at yahoo.com> wrote:
Bert, thanks... The values should not repeat themselves if the same a is closest to all b, so probably Arun's example extended with a unique command works best?

unique(a[sapply(b,function(x) which.min(abs(x-a)))])

thanks,

Andras

--- On Tue, 6/18/13, Bert Gunter <gunter.berton at gene.com> wrote:
From: Bert Gunter <gunter.berton at gene.com>
Subject: Re: [R] find closest value in a vector based on another vector values
To: "Jorge I Velez" <jorgeivanvelez at gmail.com>
Cc: "Andras Farkas" <motyocska at yahoo.com>, "R mailing list" <r-help at r-project.org>
Date: Tuesday, June 18, 2013, 10:07 AM

Jorge: No.

a <- c(1,5,8,15,32,33.5,69)
b <- c(8.5,33)
a[findInterval(b, a)]
[1]  8 32  ## should be 8 33.5

I believe it has to be done explicitly by finding all the differences and choosing those n with minimum values, depending on what n you want. Note that the problem is incompletely specified. What if the same value of a is closest to several values of b? -- do you want all the values you choose to be different or not, in which case they may not be minimum?

a <- c(1, 8, 9)
b <- c(2,3)

Then what are the 2 closest values of a to b?

-- Bert

On Tue, Jun 18, 2013 at 5:43 AM, Jorge I Velez <jorgeivanvelez at gmail.com> wrote:
Dear Andras,

Try

a[findInterval(b, a)]
[1]  8 32

HTH,
Jorge.-

On Tue, Jun 18, 2013 at 10:34 PM, Andras Farkas <motyocska at yahoo.com> wrote:
Dear All,

would you please provide your thoughts on the following:

let us say I have:

a <- c(1,5,8,15,32,69)
b <- c(8.5,33)

and I would like to extract from "a" the two values that are closest to the values in "b", where the length of these vectors may change but b will always be shorter than "a". So at the end based on this example I should have the result "f" as

f <- c(8,32)

appreciate the help,

Andras
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm