Find all numbers in a certain interval
8 messages · Antje, Duncan Murdoch, David Winsemius +2 more

Hi all,

I'd like to know if I can solve this with a shorter command:

a <- rnorm(100)
which(a > -0.5 & a < 0.5)
# would give me all indices of numbers greater than -0.5 and smaller than +0.5

I have something similar with a data frame, and it sometimes produces quite long commands...
I'd like to have something like:

which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?

Antje
Not in general, but in this particular case "abs(a) < 0.5" gives you the right result.

By the way, some advice I read many years ago (in Kernighan and Plauger): always use < or <=, avoid > or >= in multiple comparisons. It's easier to read

-0.5 < a & a < 0.5

than it is to read the form you used, because it is so much like the math notation -0.5 < a < 0.5.

Duncan Murdoch
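The abs() shortcut only covers intervals that are symmetric around zero. As a minimal sketch (the variable names `lower`, `upper`, `centre` and `half_width` are illustrative assumptions, not from the thread), the same idea can be applied to any interval by shifting to its midpoint first:

```r
set.seed(1)
a <- rnorm(100)

# Symmetric case: both forms select the same indices.
stopifnot(identical(which(abs(a) < 0.5), which(-0.5 < a & a < 0.5)))

# General case: shift by the midpoint, compare against the half-width.
lower <- 1
upper <- 3
centre <- (lower + upper) / 2      # 2
half_width <- (upper - lower) / 2  # 1

b <- c(0.5, 1, 1.5, 2, 2.9, 3, 3.5)
idx_abs <- which(abs(b - centre) < half_width)
idx_cmp <- which(lower < b & b < upper)
idx_abs  # 3 4 5
```

Whether this reads better than the two-sided comparison is a matter of taste; for non-symmetric intervals the explicit `lower < b & b < upper` form is probably clearer.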
It's not entirely clear what you are asking for, since which(within.interval(a, -0.5, 0.5)) is actually longer than which(a > -0.5 & a < 0.5). You mention that you want a solution that applies to dataframes. Using indexing you can get entire rows of dataframes that satisfy multiple conditions on one of its columns:

> DF <- data.frame(a = rnorm(20), b = LETTERS[1:20], c = letters[20:1], stringsAsFactors = FALSE)
> DF[which( DF$a > -0.5 & DF$a < 0.5 ), ]
# note that one needs to avoid DF[which(a > -0.5 & a<0.5) , ]
# the "a" vector is not the same as the "a" column vector within DF
             a b c
3  -0.47310672 C r
6  -0.49784460 F o
9   0.02571058 I l
10  0.16893759 J k
11 -0.11963322 K j
12  0.39378887 L i
16  0.03712263 P e
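One way to avoid repeating the `DF$` prefix, and the mix-up with a free-standing `a` vector noted above, is to evaluate the condition inside the data frame with `with()`. This is a sketch with a fixed seed added for reproducibility (the seed and the names `rows_long` and `rows_short` are assumptions for illustration):

```r
set.seed(42)
DF <- data.frame(a = rnorm(20), b = LETTERS[1:20], c = letters[20:1],
                 stringsAsFactors = FALSE)

# Explicit form: the column is named twice with its DF$ prefix.
rows_long <- DF[which(DF$a > -0.5 & DF$a < 0.5), ]

# with() looks up `a` inside DF, so the prefix is written only once.
rows_short <- DF[with(DF, -0.5 < a & a < 0.5), ]

stopifnot(isTRUE(all.equal(rows_long, rows_short)))
```

The two forms agree here because column `a` contains no missing values; with NAs, plain logical indexing and which() behave differently.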
You could also get the indices that satisfy more than one condition:
> which(DF$a > 0.5 & DF$b < "K")
[1] 1 2 6 10
Or you can get rows of DF that satisfy conditions on multiple columns
with the subset function:
> subset(DF, a > 0.5 & b < "K")
           a b c
1  2.2500997 A t
2  0.7251357 B s
6  0.7845355 F o
10 1.0685649 J k
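The three selection styles differ when the column contains missing values. The following sketch uses a small made-up data frame (the values are an assumption chosen for illustration) to show that which() and subset() drop rows where the condition is NA, while a bare logical index returns an all-NA row for them:

```r
DF <- data.frame(a = c(0.1, NA, 0.9, -0.3),
                 b = c("A", "B", "C", "D"),
                 stringsAsFactors = FALSE)

keep <- -0.5 < DF$a & DF$a < 0.5   # TRUE NA FALSE TRUE

n_logical <- nrow(DF[keep, ])                       # 3: rows 1 and 4 plus an NA row
n_which   <- nrow(DF[which(keep), ])                # 2: which() ignores the NA
n_subset  <- nrow(subset(DF, -0.5 < a & a < 0.5))   # 2: subset() treats NA as FALSE
```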
Or, if you wanted a within.interval function:
> within.interval <- function(x,a,b) { x > a & x < b}
> which(within.interval(DF$a, -0.5, 0.5))
[1] 3 4 7 8 9 13 14 17 20
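The helper above uses strict inequalities, so the end points themselves are excluded. A hypothetical variant (the name `in_interval` and the `inclusive` argument are assumptions, not part of the thread) could make that choice explicit:

```r
# Hypothetical helper: open interval by default, closed if inclusive = TRUE.
in_interval <- function(x, lower, upper, inclusive = FALSE) {
  stopifnot(lower <= upper)
  if (inclusive) {
    lower <= x & x <= upper
  } else {
    lower < x & x < upper
  }
}

x <- c(-0.5, -0.2, 0, 0.5, 0.7)
idx_open   <- which(in_interval(x, -0.5, 0.5))                    # 2 3
idx_closed <- which(in_interval(x, -0.5, 0.5, inclusive = TRUE))  # 1 2 3 4
```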
David Winsemius
Heritage Labs

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi David,

thanks a lot for your proposal. I got a lot of useful hints from all of you :-)

David Winsemius wrote:
> It's not entirely clear what you are asking for, since which(within.interval(a, -0.5, 0.5)) is actually longer than which(a > -0.5 & a < 0.5).

Right, but in case 'a' is something with a long name and '0.5' is a variable, you might end up with something like this (for the data frame example):

DF[which( DF$myReallyLongColumnName > -myReallyLongThreshold & DF$myReallyLongColumnName < -myReallyLongThreshold ), ]

instead of:

DF[which( within.interval(DF$myReallyLongColumnName, myReallyLongThreshold), ]
On Dec 16, 2008, at 7:19 AM, Antje wrote:
> Right but in case 'a' is something with a long name and '0.5' is a variable you might end up with something like this (for the data frame example):
>
> DF[which( DF$myReallyLongColumnName > -myReallyLongThreshold & DF$myReallyLongColumnName < -myReallyLongThreshold ), ]
I see your point, but I must point out that no cases would ever satisfy that construction.
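The reason is that both bounds in the quoted expression are `-myReallyLongThreshold`, and no number is strictly greater and strictly smaller than the same value. A small sketch with made-up values (the vector `x` and the name `threshold` are assumptions for illustration):

```r
x <- c(-1, -0.5, -0.2, 0, 0.3, 0.5, 2)
threshold <- 0.5

# Same bound on both sides: the condition can never hold.
never <- any(x > -threshold & x < -threshold)    # FALSE

# Presumably intended form: upper bound without the minus sign.
idx <- which(x > -threshold & x < threshold)     # 3 4 5
```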
> instead of:
>
> DF[which( within.interval(DF$myReallyLongColumnName, myReallyLongThreshold), ]
That would be a different within.interval function than I suggested,
but you could certainly create one which accepted a vector.
within.interval <- function(x, y) { min(y) < x & x < max(y) }
----------
> within.interval2 <- function(x,y) { min(y) < x & x < max(y)}
> y <- c(-.1, -.2, .1,.2)
> which(within.interval2(DF$a,y))
[1] 7 13 14 17
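For the single-threshold call that was asked about, one option is to build the bounds vector from the threshold. This sketch uses base R's range() in place of separate min() and max() calls; the name `within_bounds` and the test vector are assumptions for illustration:

```r
# Hypothetical helper: TRUE where x lies strictly between min and max of bounds.
within_bounds <- function(x, bounds) {
  r <- range(bounds)   # c(min(bounds), max(bounds))
  r[1] < x & x < r[2]
}

x <- c(-0.3, -0.15, 0, 0.19, 0.2, 0.4)

# Any vector works as bounds; only its extremes matter.
idx_vec <- which(within_bounds(x, c(-.1, -.2, .1, .2)))   # 2 3 4

# A symmetric interval built from a single threshold.
threshold <- 0.2
idx_sym <- which(within_bounds(x, threshold * c(-1, 1)))  # 2 3 4
```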
Here are a couple of function definitions that may be more intuitive for some people (see the examples below the function defs). They are not perfect, but my tests showed they work left to right, right to left, outside in, but not inside out.
`%<%` <- function(x,y) {
    xx <- attr(x,'orig.y')
    yy <- attr(y,'orig.x')
    if(is.null(xx)) {
        xx <- x
        x <- rep(TRUE, length(x))
    }
    if(is.null(yy)) {
        yy <- y
        y <- rep(TRUE, length(y))
    }
    out <- x & y & (xx < yy)
    attr(out, 'orig.x') <- xx
    attr(out, 'orig.y') <- yy
    out
}

`%<=%` <- function(x,y) {
    xx <- attr(x,'orig.y')
    yy <- attr(y,'orig.x')
    if(is.null(xx)) {
        xx <- x
        x <- rep(TRUE, length(x))
    }
    if(is.null(yy)) {
        yy <- y
        y <- rep(TRUE, length(y))
    }
    out <- x & y & (xx <= yy)
    attr(out, 'orig.x') <- xx
    attr(out, 'orig.y') <- yy
    out
}

x <- -3:3
-2 %<% x %<% 2
c( -2 %<% x %<% 2 )
x[ -2 %<% x %<% 2 ]
x[ -2 %<=% x %<=% 2 ]
x <- rnorm(100)
y <- rnorm(100)
x[ -1 %<% x %<% 1 ]
range( x[ -1 %<% x %<% 1 ] )
cbind(x,y)[ -1 %<% x %<% y %<% 1, ]
cbind(x,y)[ (-1 %<% x) %<% (y %<% 1), ]
cbind(x,y)[ ((-1 %<% x) %<% y) %<% 1, ]
cbind(x,y)[ -1 %<% (x %<% (y %<% 1)), ]
cbind(x,y)[ -1 %<% (x %<% y) %<% 1, ] # oops
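The operators work by attaching the original operands to the logical result as attributes, so the next comparison in the chain can pick them up. As a sketch of how one might check this (the definition is repeated so the block runs on its own; the variable names `res`, `plain` and `explicit` are assumptions), the chained form can be compared against the explicit two-sided comparison:

```r
`%<%` <- function(x, y) {
  xx <- attr(x, 'orig.y')   # right operand of an earlier comparison, if any
  yy <- attr(y, 'orig.x')   # left operand of an earlier comparison, if any
  if (is.null(xx)) { xx <- x; x <- rep(TRUE, length(x)) }
  if (is.null(yy)) { yy <- y; y <- rep(TRUE, length(y)) }
  out <- x & y & (xx < yy)
  attr(out, 'orig.x') <- xx
  attr(out, 'orig.y') <- yy
  out
}

x <- -3:3
res <- -2 %<% x %<% 2

# The result carries the operands along as attributes ...
attr_names <- names(attributes(res))

# ... and c() strips them, leaving a plain logical vector.
plain <- c(res)
explicit <- -2 < x & x < 2
stopifnot(identical(plain, explicit))
```

Because the bookkeeping lives in attributes, wrapping a chained result in c() (as in the examples above) is a simple way to get back a plain logical vector.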
Hope this helps,
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
Hi,

If you can formulate your question in terms of the actual problem you have with your data.frame, it would be easier to answer. For the time being, check subset() to see if it is what you want.

SV.
Thanks a lot for every answer I got! I was able to solve my problem!

Greg, your proposal seems to be quite useful for me :-) Thank you.

Ciao,
Antje