Find backward duplicates in a data frame

So rows are considered duplicated if they have the same two characters,
regardless of which column they're in?

If the B A row came first, is it OK to keep that row, or would you want to
keep the A B row?

This appears to work, at least for this example.

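  # Sort the values within each row so that A B and B A look the same
  foo <- t(apply(test, 1, function(x) sort(format(x))))
  # Keep only the first occurrence of each unordered pair
  test[!duplicated(foo), ]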

  a u
1 A B
2 A C
4 B F
6 D W
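
For anyone who wants to run this, here is a self-contained sketch. The data frame below is hypothetical: I made it up so that it gives the output shown above, and it is not necessarily the original poster's data. The name 'result' is also mine.

```r
# Hypothetical data, chosen only to be consistent with the output shown above;
# it is not the original poster's data frame.
test <- data.frame(
  a = c("A", "A", "B", "B", "F", "D"),
  u = c("B", "C", "A", "F", "B", "W"),
  stringsAsFactors = TRUE
)

# Sort within each row so the column order no longer matters
foo <- t(apply(test, 1, function(x) sort(format(x))))

# Rows 3 (B A) and 5 (F B) repeat rows 1 and 4 in reverse, so they are dropped
result <- test[!duplicated(foo), ]
print(result)
```

Because duplicated() flags only the later copies, the first row of each pair is the one that survives, whichever column order it happens to have.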


Note that the function sorts the formatted (character) values rather than
the factors themselves, in case the factor levels are ordered in a way that
doesn't sort alphabetically.
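
A small illustration of that difference, using a made-up factor (the names 'f', 'by_level' and 'by_text' are only for this sketch). Sorting a factor follows the order of its levels, while sorting the formatted values follows the character strings:

```r
# Hypothetical factor whose levels are deliberately not in alphabetical order
f <- factor(c("A", "B"), levels = c("B", "A"))

by_level <- as.character(sort(f))  # sorted by level order, so B comes first
by_text  <- sort(format(f))        # sorted as character strings, so A comes first

print(by_level)
print(by_text)
```

In the apply() call each row already arrives as a character vector, since apply() converts the data frame with as.matrix() first, so format() there is probably best seen as a safeguard.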

Notice also that in the result, the second column ('u') is still a factor,
and its levels still include 'A', even though 'A' is no longer present in
the column. Whether or not that's wanted, I couldn't say.
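
If the leftover level is not wanted, one option is droplevels(), which removes unused levels from every factor column. A minimal sketch, again on hypothetical data that I chose to match the output above ('result' and 'cleaned' are my own names):

```r
# Hypothetical data consistent with the output shown earlier
test <- data.frame(
  a = c("A", "A", "B", "B", "F", "D"),
  u = c("B", "C", "A", "F", "B", "W"),
  stringsAsFactors = TRUE
)

foo <- t(apply(test, 1, function(x) sort(format(x))))
result <- test[!duplicated(foo), ]

# 'A' is still listed as a level of u even though no remaining row uses it
print(levels(result$u))

# droplevels() returns a copy with unused levels removed from each factor
cleaned <- droplevels(result)
print(levels(cleaned$u))
```

The rows themselves are unchanged; only the level sets of the factor columns shrink.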

-Don