Identifying duplicate rows?
On Mon, Sep 10, 2012 at 11:23:42AM -0700, kborgmann wrote:
Hi, I am trying to identify duplicate values in a column in a data frame. The duplicated function identifies the duplicate rows in the data frame, but it only flags the second record, not both records. Is there a way to mark both rows in the data frame as TRUE?

dfA$dups <- duplicated(dfA$Value)
dfA

Site State Value  dups
 929    VA    73 FALSE
 929    VA    73  TRUE
 930    VA    76 FALSE
 930    VA    76  TRUE
 931    VA    74 FALSE
 932    VA    75 FALSE

But I would like this:

Site State Value  dups
 929    VA    73  TRUE
 929    VA    73  TRUE
 930    VA    76  TRUE
 930    VA    76  TRUE
 931    VA    74 FALSE
 932    VA    75 FALSE
Hi.
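Try the following. duplicated() flags only the second and later occurrences of a value; scanning again from the end with fromLast=TRUE flags the earlier occurrences, so combining the two with | marks every member of a duplicated group.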
dfA <- cbind(State="VA", data.frame(Value=c(73, 73, 76, 76, 74, 75)))
dfA$dups <- duplicated(dfA$Value) | duplicated(dfA$Value, fromLast=TRUE)
dfA
  State Value  dups
1    VA    73  TRUE
2    VA    73  TRUE
3    VA    76  TRUE
4    VA    76  TRUE
5    VA    74 FALSE
6    VA    75 FALSE
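If this is needed in more than one place, the same idea can be wrapped in a small function. The following is only a sketch: the name all_dups is made up for illustration and is not part of base R, and the Site column is rebuilt from the values shown in the question. It also shows one possible alternative that counts occurrences with ave(), and applies the helper to whole rows, which should work because duplicated() has a data frame method that accepts fromLast.

```r
# Rebuild the poster's data, including the Site column from the question.
dfA <- data.frame(
  Site  = c(929, 929, 930, 930, 931, 932),
  State = "VA",
  Value = c(73, 73, 76, 76, 74, 75)
)

# Hypothetical helper (not in base R): TRUE for every element that
# occurs more than once, including its first occurrence.
all_dups <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

# Flag duplicated values in a single column.
dfA$dups <- all_dups(dfA$Value)

# Alternative: count how often each value occurs and flag counts above 1.
dfA$dups2 <- ave(seq_along(dfA$Value), dfA$Value, FUN = length) > 1

# Flag rows that are duplicated across several columns at once.
dfA$row_dups <- all_dups(dfA[, c("Site", "State", "Value")])

dfA
```

The ave() variant gives the same flags here and makes the group sizes available if they are wanted for something else; the duplicated() version is shorter and does not need a grouping step.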
Hope this helps.
Petr Savicky.