setdiff bizarre (was: odd behavior out of setdiff)

1 message · Jason Rupert

Jay, 

Thanks again for all your help.  

I ended up with something similar that appears to work: it returns the difference of two data frames, including the duplicate rows that the filtering removes.

Thanks again. This will be very helpful going forward, since the data I receive often contains duplicate rows that I filter out, and I want to double-check that they really were removed.


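# Read the raw entries.
Entry_DF<-read.csv("RSetDiffEntry.csv", header = TRUE)

# Filter: drop duplicate rows, then keep costs strictly between 0 and 300.
# (The > 0 test already excludes zero, so a separate == 0 test is not needed.)
EntryFiltered_DF<-subset(Entry_DF, !duplicated(Entry_DF))
EntryFiltered_DF<-subset(EntryFiltered_DF, EntryFiltered_DF$CostPerSquareFoot>0)
EntryFiltered_DF<-subset(EntryFiltered_DF, EntryFiltered_DF$CostPerSquareFoot<300)

# The prob package supplies a setdiff() that works on the rows of data frames.
library("prob")
setDiff_DF<-setdiff(Entry_DF, EntryFiltered_DF)

# Second and later copies of repeated rows, added separately below.
DuplicateRows_DF<-subset(Entry_DF, duplicated(Entry_DF))

# Everything the filtering removed: the duplicates plus the set difference.
DesiredDFDiff_DF<-rbind(DuplicateRows_DF, setDiff_DF)

DesiredDFDiff_DF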
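
For comparison, here is a minimal base-R sketch of the same check that does not need the prob package. It builds one logical "keep" mask and splits the data frame on it, so every removed row shows up exactly once by construction. The small data frame is made up for illustration (it stands in for RSetDiffEntry.csv), and the names keep and Removed_DF are hypothetical, not from the code above.

```r
# Made-up stand-in for RSetDiffEntry.csv (illustration only).
Entry_DF <- data.frame(
  ID                = c(1, 2, 2, 3, 4, 5),
  CostPerSquareFoot = c(100, 150, 150, 0, 350, -5)
)

# One mask with the same conditions as the filter: not a repeated row,
# and cost strictly between 0 and 300.
keep <- !duplicated(Entry_DF) &
  Entry_DF$CostPerSquareFoot > 0 &
  Entry_DF$CostPerSquareFoot < 300

# subset() drops rows whose condition is NA; do the same here.
keep[is.na(keep)] <- FALSE

EntryFiltered_DF <- Entry_DF[keep, ]   # rows that survive the filter
Removed_DF       <- Entry_DF[!keep, ]  # rows that were filtered out

# Sanity check: the two pieces account for every original row.
stopifnot(nrow(EntryFiltered_DF) + nrow(Removed_DF) == nrow(Entry_DF))

Removed_DF
```

Because both pieces come from the same mask, the row names of Removed_DF also show which lines of the original file were dropped.
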
--- On Sat, 5/30/09, G. Jay Kerns <gkerns at ysu.edu> wrote: