Help to eliminate duplicated from data.frame but Special Problem
3 messages · gianni lavaredo, Sarah Goslee, Jon Olav Skoien
So you want to look at all rows, not just the index? Then specify that:
my.df[!duplicated(my.df),]
   Id value1 value2
1   1     10    100
2   2     20    200
3   3     30    300
4   4     40    400
5   5     50    500
7   6     60    600
8   7     70    700
9   8     80    800
11  8     81    799
12  9     90    900

R will do exactly what you tell it, and only that.

And thank you for including a workable example!

Sarah

On Wed, Mar 9, 2011 at 9:42 AM, gianni lavaredo
<gianni.lavaredo at gmail.com> wrote:
Dear Researchers,

I need to solve the following problem. I wish to delete duplicate rows from a data.frame, but not all duplicate rows. Example:

my.df <- data.frame(Id=c(1,2,3,4,5,5,6,7,8,8,8,9),
                    value1=c(10,20,30,40,50,50,60,70,80,80,81,90),
                    value2=c(100,200,300,400,500,500,600,700,800,800,799,900))

my.df
   Id value1 value2
1   1     10    100
2   2     20    200
3   3     30    300
4   4     40    400
5   5     50    500
6   5     50    500
7   6     60    600
8   7     70    700
9   8     80    800
10  8     80    800
11  8     81    799
12  9     90    900

After eliminating the duplicates, I want:

my.df
   Id value1 value2
1   1     10    100
2   2     20    200
3   3     30    300
4   4     40    400
5   5     50    500
7   6     60    600
8   7     70    700
9   8     80    800
11  8     81    799
12  9     90    900

but if I use

xx <- my.df[!duplicated(my.df$Id), ]

my result is

xx
   Id value1 value2
1   1     10    100
2   2     20    200
3   3     30    300
4   4     40    400
5   5     50    500
7   6     60    600
8   7     70    700
9   8     80    800
12  9     90    900

Thanks in advance,
Gianni
Sarah Goslee http://www.functionaldiversity.org
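
The difference between the two calls discussed in this message can be seen side by side in the short sketch below. It rebuilds the `my.df` from the original post; the names `by_id` and `by_row` are chosen here for illustration only and do not appear in the thread. The comparison with `unique()` is an addition, based on the documented behaviour of `unique()` for data frames.

```r
# Example data from the original post
my.df <- data.frame(Id     = c(1,2,3,4,5,5,6,7,8,8,8,9),
                    value1 = c(10,20,30,40,50,50,60,70,80,80,81,90),
                    value2 = c(100,200,300,400,500,500,600,700,800,800,799,900))

# duplicated() on a single column flags every repeat of an Id,
# so row 11 (Id 8, value1 81) is dropped along with rows 6 and 10
by_id <- my.df[!duplicated(my.df$Id), ]
rownames(by_id)   # "1" "2" "3" "4" "5" "7" "8" "9" "12"

# duplicated() on the whole data frame flags only rows that repeat
# in every column, so row 11 survives
by_row <- my.df[!duplicated(my.df), ]
rownames(by_row)  # "1" "2" "3" "4" "5" "7" "8" "9" "11" "12"

# unique() on a data frame should give the same rows as the line above
same_as_unique <- identical(by_row, unique(my.df))
```

If only fully identical rows need to go, `unique(my.df)` is probably the shortest way to write it.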
Hi Gianni,
From the example it seems like you want to check if value1 is
duplicated, not Id:
> my.df[!duplicated(my.df$value1),]
You can also remove duplicated rows based on the values of more than one
column:
> my.df[!duplicated(my.df[,c("Id","value1")]),]
Do any of these do what you want?
Cheers,
Jon
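
For the example data, both of the calls above return the same ten rows as checking whole rows does. The sketch below verifies that and then shows where the two can diverge. The extra row (Id 10, value1 10) is hypothetical and not part of the original post, and the names `by_value1`, `by_pair`, `by_row`, and `extra` are illustrative only.

```r
# Example data from the original post
my.df <- data.frame(Id     = c(1,2,3,4,5,5,6,7,8,8,8,9),
                    value1 = c(10,20,30,40,50,50,60,70,80,80,81,90),
                    value2 = c(100,200,300,400,500,500,600,700,800,800,799,900))

by_value1 <- my.df[!duplicated(my.df$value1), ]
by_pair   <- my.df[!duplicated(my.df[, c("Id", "value1")]), ]
by_row    <- my.df[!duplicated(my.df), ]

# On this data all three approaches keep rows 1-5, 7-9, 11 and 12
all_agree <- identical(by_value1, by_pair) && identical(by_pair, by_row)

# Hypothetical extra row: a new Id that happens to reuse value1 = 10
extra <- rbind(my.df, data.frame(Id = 10, value1 = 10, value2 = 1000))

# Checking value1 alone treats the new row as a duplicate of row 1
n_value1 <- nrow(extra[!duplicated(extra$value1), ])                # 10

# Checking the (Id, value1) pair keeps it, since Id 10 is new
n_pair <- nrow(extra[!duplicated(extra[, c("Id", "value1")]), ])    # 11
```

So if a value can legitimately recur under different Ids, checking the combination of columns is likely the safer choice.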
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo