An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20101214/d6998d48/attachment.pl>
selecting certain rows from data frame
8 messages · Hrithik R, steven mosher, Ivan Calandra +2 more
On 2010-12-14 23:57, steven mosher wrote:
Hi,
Next time give folks code to produce a toy sample of your problem
DF <- data.frame(ID = rep(1:5, each = 3), Data = rnorm(15), Stuff = 1:15)
DF
   ID       Data Stuff
1   1  2.0628225     1
2   1  0.6599165     2
3   1  0.5672595     3
4   2 -0.5308823     4
5   2 -0.5358471     5
6   2 -0.1414992     6
7   3 -0.1679643     7
8   3  0.9220922     8
9   3  0.8863018     9
10  4 -0.7255916    10
11  4 -1.2446753    11
12  4  0.8165567    12
13  5  0.0925008    13
14  5 -0.8534803    14
15  5 -0.6535016    15
# now I want to select rows where ID = 2 or 5
# Assign DF2 to those elements of DF where the ID variable=2 or 5
DF2<- DF[which(DF$ID==2 | DF$ID==5), ]
Or use subset():

DF2 <- subset(DF, ID %in% c(2,5))

Peter Ehlers

DF2
   ID       Data Stuff
4   2 -0.5308823     4
5   2 -0.5358471     5
6   2 -0.1414992     6
13  5  0.0925008    13
14  5 -0.8534803    14
15  5 -0.6535016    15
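
For anyone rerunning the example, here is a minimal sketch comparing the which() form with the subset() form suggested above. The set.seed() call and the names by_which and by_subset are additions for illustration only, so the Data values will not match the output printed in the thread.

```r
# Reproducible variant of the toy data. set.seed() is an addition, so the
# random Data values differ from those shown in the thread.
set.seed(1)
DF <- data.frame(ID = rep(1:5, each = 3), Data = rnorm(15), Stuff = 1:15)

# Row selection with which() applied to a logical condition
by_which <- DF[which(DF$ID == 2 | DF$ID == 5), ]

# Row selection with subset() and %in%
by_subset <- subset(DF, ID %in% c(2, 5))

# Both forms keep the rows whose ID is 2 or 5 (rows 4-6 and 13-15)
rownames(by_which)
all.equal(by_which, by_subset)
```

With only two IDs the forms are interchangeable; the %in% form mainly pays off when the list of IDs grows, since the vector of wanted IDs can be built elsewhere.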
On Tue, Dec 14, 2010 at 10:10 PM, Hrithik R<rithrr at yahoo.com> wrote:
Hi,
if I have a data frame such that

ID Time Earn
 1    1   10
 1    2   50
 1    3   68
 2    1   40
 2    2   78
 2    4   88
 3    1   50
 3    2   60
 3    3   98
 4    1   33
 4    2   48
 4    4   58
.....

Now if I have to select all the rows from the data frame that do not
include certain IDs, say for example (prime) ID == 2 & 3, how do I do it?

Thanks
Rith

Hi,

Just to note that which() is unnecessary here:

DF2 <- DF[DF$ID==2 | DF$ID==5, ]

Ivan

______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

On Dec 15, 2010, at 4:18 AM, Ivan Calandra wrote:
Hi,
Just to note that which() is unnecessary here:
DF2 <- DF[DF$ID==2 | DF$ID==5, ]

And to further note that it is only unnecessary if you have no NAs in that ID column.

> DF[4,1] <- NA
> DF[8,1] <- NA
> DF2 <- DF[DF$ID==2 | DF$ID==5, ]

(The rows with NA in ID come back as all-NA rows in DF2; they would not appear if which() were used.)

David.
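
David's point about missing values can be checked directly. The sketch below is an illustration, not code from the thread: it drops the random Data column, and the names keep, with_na and no_na are made up for the example.

```r
# Toy data with two missing IDs, following the assignments in David's reply
DF <- data.frame(ID = rep(1:5, each = 3), Stuff = 1:15)
DF[4, 1] <- NA
DF[8, 1] <- NA

# Logical condition; it is NA wherever ID is NA
keep <- DF$ID == 2 | DF$ID == 5

# Logical indexing: each NA in the index yields a row filled with NA
with_na <- DF[keep, ]

# which() returns only the positions that are TRUE, so NA entries vanish
no_na <- DF[which(keep), ]

nrow(with_na)  # 7: five matching rows plus two all-NA rows
nrow(no_na)    # 5

# %in% never returns NA, so subset() with %in% also gives 5 rows
nrow(subset(DF, ID %in% c(2, 5)))  # 5
```

Whether to wrap the condition in which() therefore depends on whether the ID column can contain NA; subset() and %in% sidestep the question.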
On 2010-12-15 09:44, Hrithik R wrote:
Hi Steven and Peter,

I apologise for not providing the code for the sample.
I now realise what I need may be a bit tricky: my data frame has hundreds of IDs, in which case Steven's solution will not be optimal. Peter's solution seems best, but how do I reverse it to select the data frame rows which do not contain particular IDs, for example IDs 2 and 5 in this case?

That's easy; use the 'NOT' operator ('!' in R):
DF2 <- subset(DF, !(ID %in% c(2,5)))
Peter Ehlers
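
As a sketch of how the negated %in% scales to many IDs, the example below rebuilds the data from the original question and keeps the IDs to exclude in a separate vector. The name drop_ids is hypothetical, and the bracket form is an equivalent shown for comparison, not something posted in the thread.

```r
# Toy data in the shape of the original question (values from the post)
DF <- data.frame(
  ID   = rep(1:4, each = 3),
  Time = c(1, 2, 3, 1, 2, 4, 1, 2, 3, 1, 2, 4),
  Earn = c(10, 50, 68, 40, 78, 88, 50, 60, 98, 33, 48, 58)
)

# Hypothetical vector of IDs to leave out. With hundreds of IDs it could be
# read from a file or computed, instead of being typed by hand.
drop_ids <- c(2, 3)

# subset() form, as in Peter's reply
kept_subset <- subset(DF, !(ID %in% drop_ids))

# Equivalent bracket form using the same logical condition
kept_bracket <- DF[!(DF$ID %in% drop_ids), ]

unique(kept_subset$ID)  # 1 4
nrow(kept_bracket)      # 6
```

One caveat: %in% returns FALSE rather than NA for a missing ID, so after negation any rows with a missing ID are kept, not dropped.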
Thanks again for your time

Rith

------------------------------------------------------------------------
From: Peter Ehlers <ehlers at ucalgary.ca>
To: steven mosher <moshersteven at gmail.com>
Cc: Hrithik R <rithrr at yahoo.com>; "r-help at r-project.org" <r-help at r-project.org>
Sent: Wed, December 15, 2010 3:26:14 AM
Subject: Re: [R] selecting certain rows from data frame