Removing not duplicated rows - R-help

Chris82

Fri, Apr 8, 2011 8:07 AM

Hello R users,

I am having trouble deleting the rows of a table whose id number is not
duplicated.

A short example:

x <- data.frame(cbind(id=c(1,2,2,2,3,3,4,5,6,6), value=1:10))

x_new <- x[which(duplicated(x$id)),]

   id value
3   2     3
4   2     4
6   3     6
10  6    10

The problem is that my command deletes not only the rows whose id number
occurs just once, but also the first row of each duplicated id number
(2, 3 and 6).

Thanks!

With best regards

--
View this message in context: http://r.789695.n4.nabble.com/Removing-not-duplicated-rows-tp3436600p3436600.html
Sent from the R help mailing list archive at Nabble.com.

Jeremy Hetzel

Fri, Apr 8, 2011 8:22 AM

As I understand it, you are trying to subset the data frame to include only 
rows with a non-unique id.

Try this:
x <- data.frame(cbind(id=c(1,2,2,2,3,3,4,5,6,6), value=1:10))
id.table <- table(x$id)
x_new <- subset(x, id %in% id.table[id.table > 1])

Jeremy

Jeremy Hetzel

Fri, Apr 8, 2011 8:24 AM

Sorry, I left out the names() function in the last step.
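
To see why names() matters here, a quick check of what the table holds may help. This is only a sketch on the example data from the thread; the variable names counts, wrong and right are made up for illustration.

```r
x <- data.frame(id = c(1,2,2,2,3,3,4,5,6,6), value = 1:10)
id.table <- table(x$id)

# The values of the table are counts; the ids are stored as (character) names.
counts <- id.table[id.table > 1]
as.vector(counts)   # how often each repeated id occurs
names(counts)       # the repeated ids themselves

# Matching ids against the counts picks rows by coincidence only;
# matching against the names picks the rows with a repeated id.
wrong <- x[x$id %in% as.vector(counts), ]
right <- subset(x, id %in% names(counts))
```

In this data the counts happen to contain 2 and 3, so matching against them keeps ids 2 and 3 but silently drops id 6.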

Try this instead:
x <- data.frame(cbind(id=c(1,2,2,2,3,3,4,5,6,6), value=1:10))
id.table <- table(x$id)
x_new <- subset(x, id %in% names(id.table[id.table > 1]))
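
An alternative that stays close to the original duplicated() attempt, as a minimal sketch on the same example data: duplicated() flags only the second and later occurrences of a value, so scanning from both ends and combining the flags should mark every row whose id occurs more than once. The name repeated is just an illustrative choice.

```r
x <- data.frame(id = c(1,2,2,2,3,3,4,5,6,6), value = 1:10)

# duplicated() alone misses the first row of each repeated id.
# fromLast = TRUE scans from the end and so misses the last row instead;
# OR-ing the two flags covers every row with a repeated id.
repeated <- duplicated(x$id) | duplicated(x$id, fromLast = TRUE)
x_new <- x[repeated, ]
```

This avoids building the table and keeps the original row order and row names.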

Jeremy

Chris82

Fri, Apr 8, 2011 8:55 AM

Thanks a lot, Jeremy!

It's working perfectly.


With best regards
