Jason, (moved back to R-help)
On Sat, May 30, 2009 at 3:30 PM, Jason Rupert <jasonkrupert at yahoo.com> wrote:
Jay,

I really appreciate all your help. I posted to Nabble an R file and input CSV files that more accurately demonstrate what I am seeing and the output I want when I difference two data frames:

http://n2.nabble.com/Support-SetDiff-Discussion-Items...-td2999739.html

It may be that "setdiff", as provided by base R and by "prob", was never intended to give the type of result I desire. If that is the case, then I will need to ask the "Ninjas" for help to produce the outcome I seek.

That is, when I difference the data within RSetDiffEntry.csv and RSetDuplicatesRemoved.csv, I want to get the result shown in RDesired.csv. Note that it would not be enough to just remove duplicate "CostPerSquareFoot" values, since that variable is tied to "EntryDate" and "HouseNumber".

Any further help and insights are much appreciated.

Thanks again,
Jason
From your description, something like the following should work:
Let A = your RSetDiffEntry
Let B = your RSetDuplicatesRemoved...

    library(prob)
    C <- setdiff(A, B)       # rows of A that are not in B
    D <- rbind(A, C)         # those rows now appear twice in D
    E <- D[duplicated(D), ]  # keep the second copy of each repeated row

E should then equal your RDesired.

Hope this helps,
Jay

P.S. I notice that row number 7 in "RSetDuplicatesRemoved" is duplicated by the following row. That's a typo, yes? If so, then E should have one more row than your "RDesired."
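For anyone who wants to try the three steps above (row-wise set difference, rbind, then duplicated) without the CSV files, here is a minimal base-R sketch. It is only an illustration under assumptions: the data values are made up rather than taken from the posted files, `row_key` is a hypothetical helper that is not part of base R or "prob", and the key-matching line is a stand-in for prob's data-frame `setdiff`, not its actual implementation.

```r
# Made-up stand-ins for RSetDiffEntry (A) and RSetDuplicatesRemoved (B).
# Houses 102 and 103 share a CostPerSquareFoot value, so comparing on that
# column alone would not be enough; whole rows are compared instead.
A <- data.frame(
  HouseNumber       = c(101, 102, 103, 104),
  EntryDate         = c("2009-05-01", "2009-05-02", "2009-05-03", "2009-05-04"),
  CostPerSquareFoot = c(95, 110, 110, 120),
  stringsAsFactors  = FALSE
)
B <- A[1:2, ]

# Hypothetical helper: collapse each row into one string so that whole rows
# can be compared with %in%.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

# Stand-in for setdiff(A, B) from "prob": rows of A with no matching row in B.
C <- A[!(row_key(A) %in% row_key(B)), ]

# Same remaining steps as in the reply above.
D <- rbind(A, C)          # rows of C now occur twice in D
E <- D[duplicated(D), ]   # keeps the second copy of each repeated row

print(E)
```

With this toy data E holds the rows for houses 103 and 104. Note that any row already repeated within A would also show up in E, which may or may not be what you want.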