
problem with merge

I have used merge regularly and thought I understood how it worked, but 
I must not. I have two dataframes with identical colnames from two 
different experiments, TL01 and LC01. Each dataframe has a column named 
"Entrez.Gene", which I have converted with as.character() just to make 
sure merge is not matching on factor levels. Because I have done some 
filtering, the Entrez.Gene values in the two experiments overlap but are 
not identical. I want to produce a summary report with only those 
identifiers found in both experiments. I could do this with intersect and 
match, but I thought merge could easily do this.

Below is my code and sessionInfo. For some reason there are over twice 
as many rows as I would expect. I can't quite figure out which arguments 
I have screwed up. What am I missing? It has to be something simple, I'm 
just not seeing it.  Thanks, Mark

 > TL01.LC01.data <- merge(TL01.data, LC01.data, by = "Entrez.Gene",
 +     all.x = FALSE, all.y = FALSE, suffixes = c(".TL01",".LC01"))
 > length(intersect(TL01.data$Entrez.Gene, LC01.data$Entrez.Gene))
[1] 13401
 > dim(TL01.LC01.data)
[1] 29471    57
 > dim(TL01.data)
[1] 16479    29
 > dim(LC01.data)
[1] 16479    29
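
One situation that produces more merged rows than shared identifiers is repeated Entrez.Gene values in one or both dataframes: intersect counts each identifier once, while merge returns one row for every matching pair of rows. Whether that applies to TL01.data and LC01.data is an assumption that would need checking. A minimal sketch with made-up toy data (the dataframes a and b and their values are hypothetical, not the real data):

```r
# Toy data, hypothetical values only (not the real TL01/LC01 data).
a <- data.frame(Entrez.Gene = c("1", "2", "2", "3"), x = 1:4,
                stringsAsFactors = FALSE)
b <- data.frame(Entrez.Gene = c("2", "2", "3", "4"), y = 5:8,
                stringsAsFactors = FALSE)

# intersect() works on unique values, so each shared key counts once.
shared <- intersect(a$Entrez.Gene, b$Entrez.Gene)
length(shared)

# merge() pairs every matching row of a with every matching row of b,
# so a key that occurs twice in each dataframe contributes 2 * 2 rows.
m <- merge(a, b, by = "Entrez.Gene", suffixes = c(".a", ".b"))
nrow(m)

# How many rows carry a key already seen earlier in the same dataframe?
n.dup.a <- sum(duplicated(a$Entrez.Gene))
n.dup.b <- sum(duplicated(b$Entrez.Gene))

# Row count merge() should give, predicted from per-key frequencies.
ta <- table(a$Entrez.Gene)
tb <- table(b$Entrez.Gene)
predicted <- sum(as.vector(ta[shared]) * as.vector(tb[shared]))
predicted
```

Running the same duplicated() and table() checks on TL01.data$Entrez.Gene and LC01.data$Entrez.Gene would show whether the predicted count matches the 29471 rows seen above.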