Selecting all rows of factors which have at least one positive value?

5 messages · Stephan Lindner, Nutter, Benjamin, Patrizio Frederic +2 more

#
Dear all,

I'm trying to select from a dataframe all rows which correspond to a
factor (the id variable) for which there exists at least one positive
value of a certain variable. As an example:

x <- data.frame(matrix(c(rep(11,4),rep(12,3),rep(13,3),rep(0,3),1,rep(0,4),rep(1,2)),ncol=2))
   X1 X2
1  11  0
2  11  0
3  11  0
4  11  1
5  12  0
6  12  0
7  12  0
8  13  0
9  13  1
10 13  1 


and I want to select all rows pertaining to factor levels of X1 for
which there exists at least one "1" for X2. To be clear, I want rows 1:4
(since there exists at least one observation for X1==11 for which
X2==1) and rows 8:10 (likewise).

It is easy to obtain the corresponding factor levels (i.e.,
unique(x$X1[x$X2==1])), but I got stalled selecting the corresponding
rows. I tried grep, but then I have to loop and concatenate the
resulting vector. Any ideas?


Thanks a lot!


	Stephan
#
x <- data.frame(matrix(c(rep(11,4),rep(12,3),rep(13,3),rep(0,3),1,rep(0,4),rep(1,2)),ncol=2))

id.keep <- unique(subset(x,X2>0)$X1)

x2 <- subset(x,X1 %in% id.keep)

x2

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Stephan Lindner
Sent: Thursday, April 02, 2009 11:26 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Selecting all rows of factors which have at least one
positive value?

#
or the exactly equivalent form:

x[x$X1 %in% unique(x[x$X2>0,"X1"]), ]

Patrizio

2009/4/2 Nutter, Benjamin <NutterB at ccf.org>:
#
I think the unique function is superfluous:

 > x[x$X1 %in% x$X1[x$X2==1], ]
    X1 X2
1  11  0
2  11  0
3  11  0
4  11  1
8  13  0
9  13  1
10 13  1
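
As a quick check (a sketch that is not part of the original reply), the forms with and without unique() can be compared on the example data. The names with_unique and without_unique are made up for illustration:

```r
# Rebuild the example data from the original post.
x <- data.frame(matrix(c(rep(11, 4), rep(12, 3), rep(13, 3),
                         rep(0, 3), 1, rep(0, 4), rep(1, 2)), ncol = 2))

with_unique    <- x[x$X1 %in% unique(x$X1[x$X2 == 1]), ]
without_unique <- x[x$X1 %in% x$X1[x$X2 == 1], ]

# %in% only tests membership, so repeated ids on the right-hand side
# do not change which rows are kept.
identical(with_unique, without_unique)
```

Dropping unique() saves a call; on a very large data frame with many repeated ids it might still be worth keeping to shorten the lookup table, but that is a guess rather than something measured here.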

--  

David Winsemius
On Apr 2, 2009, at 12:43 PM, Patrizio Frederic wrote:

            
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
#
Here's one way using plyr:

library(plyr)
ddply(x, "X1", subset, any(X2 == 1))

See http://had.co.nz/plyr for more details.

Hadley
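
For comparison, the same group-wise idea can probably be written in base R with ave(), which applies a function within each level of X1 and returns a vector aligned with the original rows. This is only a sketch, not something proposed in the thread, and the names keep and x_sel are made up for illustration:

```r
# Rebuild the example data from the original post.
x <- data.frame(matrix(c(rep(11, 4), rep(12, 3), rep(13, 3),
                         rep(0, 3), 1, rep(0, 4), rep(1, 2)), ncol = 2))

# For each X1 group, ask whether any X2 is positive. ave() writes the
# per-group result back into a numeric vector (TRUE/FALSE become 1/0),
# so convert it to logical before using it as a row index.
keep <- as.logical(ave(x$X2, x$X1, FUN = function(v) any(v > 0)))

x_sel <- x[keep, ]
x_sel
```

Unlike the %in% solutions, this never builds the list of matching ids; it flags rows directly, which may be handy when the condition is something other than a simple match.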