Hello,
In the example data frame below, I want to identify ALL records that share
an ID with another record, so that I can examine "OS" and "time" for them.
(df <- data.frame(ID = c("userA", "userB", "userA", "userC"),
                  OS = c("Win", "OSX", "Win", "Win64"),
                  time = c("12:22", "23:22", "04:44", "12:28")))
     ID    OS  time
1 userA   Win 12:22
2 userB   OSX 23:22
3 userA   Win 04:44
4 userC Win64 12:28
My desired output is that ALL records with the same IDs are found:
userA Win 12:22
userA Win 04:44
preferably by returning logical values (TRUE FALSE TRUE FALSE)
Is there a simple way to do that?
[-- With duplicated(df$ID) the output will be
[1] FALSE FALSE TRUE FALSE
i.e. not all userA records are found (only the second occurrence is flagged)
With unique(df$ID)
[1] userA userB userC
Levels: userA userB userC
i.e. one of each ID is found --]
Erik Svensson
--
View this message in context: http://r.789695.n4.nabble.com/Find-all-duplicate-records-tp3865139p3865139.html
Sent from the R help mailing list archive at Nabble.com.
Find all duplicate records
4 messages · Erik Svensson, Uwe Ligges, Gabor Grothendieck
On 02.10.2011 16:05, Erik Svensson wrote:
My desired output is that ALL records with the same IDs are found:
userA Win 12:22
userA Win 04:44
See ?split or ?subset
Uwe Ligges
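The reply above only points at the help pages. A minimal sketch of how that hint might be applied is given here, assuming the `df` from the original post. The names `dup_ids`, `dups` and `by_user` are introduced for illustration and do not appear in the thread.

```r
# Example data from the original post
df <- data.frame(ID = c("userA", "userB", "userA", "userC"),
                 OS = c("Win", "OSX", "Win", "Win64"),
                 time = c("12:22", "23:22", "04:44", "12:28"))

# IDs that occur more than once (dup_ids is an illustrative name)
dup_ids <- unique(df$ID[duplicated(df$ID)])

# subset() keeps every row whose ID is among the repeated ones
dups <- subset(df, ID %in% dup_ids)
dups

# split() groups those rows by ID so each user can be examined separately;
# factor() drops IDs that no longer occur in dups
by_user <- split(dups, factor(dups$ID))
by_user
```

This returns the matching rows rather than a logical vector, so it addresses the "examine OS and time" part of the question more directly than the "logical values" part.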
On Sun, Oct 2, 2011 at 10:05 AM, Erik Svensson
<erik.b.svensson at gmail.com> wrote:
My desired output is that ALL records with the same IDs are found:
userA   Win 12:22
userA   Win 04:44
preferably by returning logical values (TRUE FALSE TRUE FALSE)
Try this:
ave(rownames(df), df$ID, FUN = length) > 1
[1] TRUE FALSE TRUE FALSE
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
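For comparison, here is an alternative sketch that is not from the thread. It builds on the `duplicated()` call the original poster already tried, combining a forward and a backward pass so that every occurrence of a repeated ID is flagged. The variable names `is_dup` and `by_count` are illustrative. The second variant is a numeric version of the `ave()` idea, which counts over `seq_along()` instead of row names so the comparison is made on numbers rather than strings.

```r
# Example data from the original post
df <- data.frame(ID = c("userA", "userB", "userA", "userC"),
                 OS = c("Win", "OSX", "Win", "Win64"),
                 time = c("12:22", "23:22", "04:44", "12:28"))

# duplicated() flags the second and later occurrences of a value;
# with fromLast = TRUE it flags all but the last occurrence.
# The union of the two marks every row whose ID appears more than once.
is_dup <- duplicated(df$ID) | duplicated(df$ID, fromLast = TRUE)
is_dup
# [1]  TRUE FALSE  TRUE FALSE

# Variant of the ave() approach that keeps the group sizes numeric
by_count <- ave(seq_along(df$ID), as.character(df$ID), FUN = length) > 1

# Either logical vector can be used to pull out the rows of interest
df[is_dup, ]
```

Both vectors have one element per row of `df`, so they can be used directly as a row index or stored as a new column.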
It works, thanks a lot, Gabor.
Erik