How to conditionally remove dataframe rows?

4 messages · Francisco Carvalho Diniz, David Winsemius, arun +1 more

#
On Mar 6, 2013, at 3:21 PM, Francisco Carvalho Diniz wrote:

            
Try this:

dfrm <- dfrm[ order(dfrm[[1]], -dfrm[[2]] ) , ]  
#put desired rows at top of each Point_counts category

# then take top item in each category

dfrm[ !duplicated(dfrm[[1]]) , ]
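
A self-contained sketch of the two steps above, using the example data from the original question. The name `dedup` is only illustrative, and the sketch assumes the second column is numeric, since negating it with `-dfrm[[2]]` would not work on a character column.

```r
# Example data from the original question
dfrm <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Sort by group, then by descending Psi_Sp within each group
dfrm <- dfrm[order(dfrm[[1]], -dfrm[[2]]), ]

# duplicated() flags every repeat after the first occurrence, so negating
# it keeps the first (largest Psi_Sp) row of each Point_counts group
dedup <- dfrm[!duplicated(dfrm[[1]]), ]
dedup
```

The surviving rows keep their original row names (2, 4, 6 and 8 here); `row.names(dedup) <- NULL` resets them if that matters.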
#
Hi,

dfrm <- read.table(text="
  Point_counts Psi_Sp
1            A      0
2            A      1
3            B      1
4            B      2
5            B      0
6            C      1
7            D      1
8            D      2
",sep="",header=TRUE,stringsAsFactors=FALSE)
res <- do.call(rbind,lapply(split(dfrm,dfrm$Point_counts),function(x) x[which.max(x$Psi_Sp),]))
row.names(res) <- 1:nrow(res)
#  Point_counts Psi_Sp
#1            A      1
#2            B      2
#3            C      1 #your input data doesn't have 0
#4            D      2
A.K.



----- Original Message -----
From: Francisco Carvalho Diniz <chicocdiniz at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Wednesday, March 6, 2013 6:21 PM
Subject: [R] Fwd: How to conditionally remove dataframe rows?

Hi,

I have a data frame with two columns. I need to remove rows that are
duplicated in the first column, but which row to keep depends on the
values in the second column.

Example:

  Point_counts Psi_Sp
1            A      0
2            A      1
3            B      1
4            B      2
5            B      0
6            C      1
7            D      1
8            D      2


I need to turn this data frame into one without duplicated values in
Point_counts (one visit per point), keeping for each point the row with the
maximum value of Psi_Sp, e.g. remove row 1 and keep row 2, or remove rows 3
and 5 and keep row 4. At the end I want a data frame like the one below:

  Point_counts Psi_Sp
1            A      1
2            B      2
3            C      0
4            D      2

How can I do it? I found several ways to edit data frames, but
unfortunately I could not get any of them to work.

I appreciate any help.

Francisco

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Just to add another option to what Arun has provided below. That approach generalizes well to data frames with >2 columns, where you want to filter based upon finding a maximum value (or perhaps other, more complex criteria) within one or more grouping columns and return all of the columns in the original data frame.
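
The split()/which.max() pattern from Arun's reply carries over to wider data frames unchanged. A minimal sketch, assuming a hypothetical third column `Visit` that is not part of the original data (the names `dfrm3` and `res3` are likewise illustrative):

```r
# Example data from the thread plus a hypothetical third column, Visit
dfrm3 <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  Visit        = c(1, 2, 1, 2, 3, 1, 1, 2),
  stringsAsFactors = FALSE
)

# For each Point_counts group keep the whole row with the largest Psi_Sp;
# which.max() returns the first maximum, so ties go to the earliest row
res3 <- do.call(rbind, lapply(split(dfrm3, dfrm3$Point_counts),
                              function(x) x[which.max(x$Psi_Sp), ]))
row.names(res3) <- NULL
res3
```

Every column of the selected row comes back, so `Visit` here records which visit had the maximum Psi_Sp at each point.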

In this special case of a two-column data frame, you can use ?aggregate easily with a formula-based approach that might be easier to read. aggregate() essentially encapsulates what Arun has done below.

Thus:
  Point_counts Psi_Sp
1            A      0
2            A      1
3            B      1
4            B      2
5            B      0
6            C      1
7            D      1
8            D      2

  Point_counts Psi_Sp
1            A      1
2            B      2
3            C      1
4            D      2
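
The calls behind the two printouts above are not shown. As a sketch only, a formula-based aggregate() call like the following reproduces the second printout from the first; the data frame name `DF` and the result name `agg` are assumptions, not taken from the thread.

```r
# Example data; the name DF is an assumption, not taken from the thread
DF <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Maximum Psi_Sp within each Point_counts group
agg <- aggregate(Psi_Sp ~ Point_counts, data = DF, FUN = max)
agg
```

With this formula aggregate() returns only the grouping column and the summarised column, so any other columns in a wider data frame would be dropped.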


Regards,

Marc Schwartz
On Mar 6, 2013, at 8:42 PM, arun <smartpink111 at yahoo.com> wrote: