Compare two data sets
Here is one way to find the common rows: build a key from the two columns of each data frame and intersect the keys. You can then use the keys that come back to reconstruct a new data frame:
f1 <- read.table(textConnection("V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W
YBL064C YJL067W
YBL064C YPR160W
YBR053C YGL089C
YBR053C YHR113W
YBR053C YNL328C"), header=TRUE)
f2 <- read.table(textConnection("V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W"), header=TRUE)
f1$key <- paste(f1$V1, f1$V2)
f2$key <- paste(f2$V1, f2$V2)
# now find the ones in common
intersect(f1$key, f2$key)
[1] "YBL064C YBR067C" "YBL064C YBR204C" "YBL064C YDR368W"
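
The reconstruction step mentioned above might look something like the sketch below. It rebuilds the same f1 and f2, then keeps the rows of f1 whose key appears among the common keys. The names common_keys, common_rows and common_merge are made up for this illustration, and the merge() call at the end is an alternative that was not part of the original reply; it assumes the two data frames share the column names V1 and V2.

```r
# Rebuild the example data from the post (read.table's text= argument
# is used here in place of textConnection()).
f1 <- read.table(text = "V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W
YBL064C YJL067W
YBL064C YPR160W
YBR053C YGL089C
YBR053C YHR113W
YBR053C YNL328C", header = TRUE, stringsAsFactors = FALSE)
f2 <- read.table(text = "V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W", header = TRUE, stringsAsFactors = FALSE)

# Build one key per row and find the keys present in both data frames.
f1$key <- paste(f1$V1, f1$V2)
f2$key <- paste(f2$V1, f2$V2)
common_keys <- intersect(f1$key, f2$key)

# Reconstruct a data frame: keep the rows of f1 whose key is common,
# and drop the helper key column again.
common_rows <- f1[f1$key %in% common_keys, c("V1", "V2")]

# Alternative (not from the original reply): merge() performs an inner
# join on all shared column names, here V1 and V2.
common_merge <- merge(f1[, c("V1", "V2")], f2[, c("V1", "V2")])
```

Both common_rows and common_merge should hold the three rows shared by the two data sets. If either data frame can contain duplicated rows, merge() returns one row per matching pair, so the two results may differ in that case.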
On Tue, Mar 25, 2008 at 9:18 PM, Suhaila Zainudin
<suhaila.zainudin at gmail.com> wrote:
Hi, I have a similar query (how to compare 2 datasets), but my dataset is a bit different. I want to compare each row in dataset 1 to the rows in dataset 2 and get the data which is common to both datasets.

For example, I have a file (named mysample):

V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W
YBL064C YJL067W
YBL064C YPR160W
YBR053C YGL089C
YBR053C YHR113W
YBR053C YNL328C

And I have another file (myref) as follows:

V1 V2
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W

When I try to intersect the two files, I receive a NULL data frame.
intersect(myref,mysample)
NULL data frame with 0 rows
What I am hoping to get out of intersect for the above files is:
YBL064C YBR067C
YBL064C YBR204C
YBL064C YDR368W
Are there any R functions that can achieve what I want to do?
Or should I merge the data which is currently in 2 columns into a single
column and use intersect again?
Thanks for any feedback!
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?