Intersecting two matrices

I would appreciate it if you would follow the Posting Guide and give a reproducible example and post all messages using plain text.

Try

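# Simulate two large two-column integer matrices (values 0 to 999)
m1 <- matrix(sample(0:999, 2*1057837, TRUE), ncol=2)
m2 <- matrix(sample(0:999, 2*951980, TRUE), ncol=2)
# sqldf works on data frames; as.data.frame names the columns V1 and V2
df1 <- as.data.frame(m1)
df2 <- as.data.frame(m2)
library(sqldf)
# Time the query: distinct rows of df1 that also occur in df2
system.time(df3 <- sqldf("SELECT DISTINCT df1.V1, df1.V2 FROM df1 INNER JOIN df2 ON df1.V1=df2.V1 AND df1.V2=df2.V2"))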

The speed seems to depend heavily on how many rows are duplicated within the input data frames, so if the range of values is small the query runs slower, presumably because each duplicated row in one table is matched against every copy of it in the other before DISTINCT collapses the result. Note also that moving the data from R to the database and back takes time. You may be able to import the data directly from your source into the database and save some time. See the examples in ?sqldf and ?read.csv.sql for more information.
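
Since duplicated rows seem to drive the run time, one option is to drop duplicates from each input before joining. The following is only a sketch of that idea in base R, using unique() and merge() instead of sqldf; the small data frames are made-up stand-ins for the large ones above, and I have not timed this on data of the original size.

```r
# Small hypothetical inputs standing in for the large data frames above
set.seed(1)
df1 <- data.frame(V1 = sample(0:9, 200, TRUE), V2 = sample(0:9, 200, TRUE))
df2 <- data.frame(V1 = sample(0:9, 150, TRUE), V2 = sample(0:9, 150, TRUE))

# Drop duplicate rows first, so the join only sees distinct (V1, V2) pairs
u1 <- unique(df1)
u2 <- unique(df2)

# Inner join on both columns; because each side is already distinct,
# every common pair appears exactly once in the result
common <- merge(u1, u2, by = c("V1", "V2"))
```

The deduplicated u1 and u2 could equally be passed to the sqldf query in place of df1 and df2, which may reduce the number of rows the join has to match.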
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
c char <charlie.hsia.us at gmail.com> wrote: