Find number of elements less than some number: Elegant/fast solution needed
This might be a bit quicker with larger vectors:
f <- function(x, y) sum(x > y)
vf <- Vectorize(f, "x")
vf(x, y)
On Thu, Apr 14, 2011 at 5:37 PM, Marc Schwartz <marc_schwartz at me.com> wrote:
On Apr 14, 2011, at 2:34 PM, Kevin Ummel wrote:
Take vector x and a subset y:
x=1:10
y=c(4,5,7,9)
For each value in 'x', I want to know how many elements in 'y' are less than 'x'.
An example would be:
sapply(x,FUN=function(i) {length(which(y<i))})
[1] 0 0 0 0 1 2 2 3 3 4
But this solution is far too slow when x and y have lengths in the millions.
I'm certain an elegant (and computationally efficient) solution exists, but I'm in the weeds at this point.
Any help is much appreciated.
Kevin
University of Manchester
I started working on a solution to your problem above and then noted the one below. Here is one approach to the above:
colSums(outer(y, x, "<"))
 [1] 0 0 0 0 1 2 2 3 3 4
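Note that outer() builds a length(y) by length(x) logical matrix, so it is unlikely to fit in memory when both vectors have millions of elements. A minimal sketch of an alternative, assuming R >= 3.3.0 (where findInterval() has the left.open argument): sort y once, then let findInterval() locate each x by binary search. The names ys and counts are illustrative only and do not come from the thread.

```r
# Sketch: for each element of x, count the elements of y strictly below it.
x <- 1:10
y <- c(4, 5, 7, 9)

# findInterval() requires its second argument in non-decreasing order.
ys <- sort(y)

# With left.open = TRUE the intervals are (ys[i], ys[i + 1]], so the result
# is the number of elements of ys that are strictly less than each x.
counts <- findInterval(x, ys, left.open = TRUE)
counts
# [1] 0 0 0 0 1 2 2 3 3 4
```

This avoids the intermediate matrix altogether; the cost is one sort of y plus a search per element of x, which should scale far better than the sapply() or outer() versions, though timings on real data are worth checking.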
Take two vectors x and y, where y is a subset of x:
x=1:10
y=c(2,5,6,9)
If y is removed from x, the original x values now have a new placement (index) in the resulting vector (new):
new=x[-y]
index=1:length(new)
The challenge is: How can I *quickly* and *efficiently* deduce the new 'index' value directly from the original 'x' value -- using only 'y' as an input?
In practice, I have very large matrices containing the 'x' values, and I need to convert them to the corresponding 'index' if the 'y' values are removed.
Something like the following might work, if I correctly understand the problem:
match(x, x[-y])
 [1]  1 NA  2  3 NA NA  4  5 NA  6
HTH,
Marc Schwartz
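The two questions in this thread are related: when the x values are themselves the original positions 1..n (as in the example), the new index of a surviving value is its old value minus the number of removed values below it. A hedged sketch of that idea, again assuming R >= 3.3.0 for left.open; ys and idx are hypothetical names, and the approach relies on the assumption that x holds positions rather than arbitrary values.

```r
# Sketch: derive the post-removal index from x using only y as input.
# Assumes the values in x are the original positions 1..n.
x <- 1:10
y <- c(2, 5, 6, 9)

ys <- sort(y)

# New index = old position minus the count of removed positions below it.
idx <- x - findInterval(x, ys, left.open = TRUE)

# Values that were themselves removed have no new index.
idx[x %in% y] <- NA
idx
# [1]  1 NA  2  3 NA NA  4  5 NA  6
```

On this example the result agrees with match(x, x[-y]), without having to build the reduced vector first.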