Message-ID: <1346273872523-4641778.post@n4.nabble.com>
Date: 2012-08-29T20:57:52Z
From: ramoss
Subject: Deduping in R by multiple variables
I have a dataset with 184K observations and 16 variables. In SAS, proc sort
with nodupkey dedupes it by 11 variables in seconds.
I tried to do the same thing in R using both unique and !duplicated, but R
just hangs and I get no output. Does anyone know how to solve this?
This is how I tried to do it in R:
detail3 <-
  detail2[!duplicated(c(detail2$TDATE, detail2$FIRM, detail2$CM, detail2$BRANCH,
                        detail2$BEGTIME, detail2$ENDTIME, detail2$OTYPE, detail2$OCOND,
                        detail2$ACCTYP, detail2$OSIDE, detail2$SHARES, detail2$STOCKS,
                        detail2$STKFUL)), ]
detail3 <-
  unique(detail2[, c(detail2$TDATE, detail2$FIRM, detail2$CM, detail2$BRANCH,
                     detail2$BEGTIME, detail2$ENDTIME, detail2$OTYPE, detail2$OCOND,
                     detail2$ACCTYP, detail2$OSIDE, detail2$SHARES, detail2$STOCKS,
                     detail2$STKFUL)])
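One possible cause, offered as a sketch rather than a confirmed diagnosis: c() concatenates the 13 columns into a single vector of roughly 13 x 184K values, so duplicated() and the column index no longer work row by row. Both duplicated() and unique() have data frame methods that compare whole rows, so passing a data frame of just the key columns, selected by name, may be what is needed. The toy data below is made up for illustration; only the column names TDATE, FIRM and SHARES come from the code above, while NOTE and key_cols are hypothetical names.

```r
# Toy stand-in for detail2 (values are invented; NOTE is a hypothetical
# non-key column used to show which rows survive).
detail2 <- data.frame(
  TDATE  = c("2012-08-01", "2012-08-01", "2012-08-02", "2012-08-01"),
  FIRM   = c("A", "A", "A", "B"),
  SHARES = c(100, 100, 100, 100),
  NOTE   = c("first", "second", "third", "fourth"),
  stringsAsFactors = FALSE
)

# Key variables given as column NAMES, not as the column values.
key_cols <- c("TDATE", "FIRM", "SHARES")

# duplicated() on a data frame flags rows whose key combination has
# already appeared; negating it keeps the first row of each key and
# retains all columns.
detail3 <- detail2[!duplicated(detail2[key_cols]), ]

# unique() on the key columns alone returns the distinct key
# combinations, but drops the non-key columns.
keys_only <- unique(detail2[key_cols])
```

With the real data, key_cols would hold the names of all the key variables. Note that the first form keeps every column of detail2, while the second returns only the key columns.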
Thanks in advance
--
View this message in context: http://r.789695.n4.nabble.com/Deduping-in-R-by-multiple-variables-tp4641778.html
Sent from the R help mailing list archive at Nabble.com.