
export from R to MySQL

6 messages · bogdan romocea, Sean Davis, Brian Ripley +2 more

#
There's no need for manual steps. To take advantage of MySQL's
extremely fast 'load data infile' you could dump the data in CSV
format, write a script for mysql (the command line tool), for example

## Build a mysql script that creates the table and bulk-loads the file.
## (The two float columns are only an example; adjust to your data.)
q <- function(table, infile)
{
  paste("
create table ", table, " (col1 float, col2 float);
load data infile '", infile, "'
into table ", table, "
fields terminated by ','
lines terminated by '\\r\\n'
ignore 0 lines;
show warnings;
", sep = "")
}
sink("mysql_script.sql")
cat(q("db.table", "infile"), "\n")
sink()

then run the script from R with system(). The mysql command looks like
mysql -u user --password=pswd -v < mysql_script.sql >> log.txt 2>&1
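A minimal sketch of driving that command from R with system(); the user name, password, and file names below are placeholders, not working credentials:

```r
## Hypothetical credentials and paths -- substitute your own.
script  <- "mysql_script.sql"
logfile <- "log.txt"
cmd <- sprintf("mysql -u user --password=pswd -v < %s >> %s 2>&1",
               script, logfile)
if (nzchar(Sys.which("mysql"))) {   # only run if mysql is on the PATH
  status <- system(cmd)             # shell exit status; 0 means success
  if (status != 0) warning("mysql exited with status ", status)
}
```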
#
On 12/12/05 9:21 AM, "bogdan romocea" <br44114 at gmail.com> wrote:

This is creating the table by hand, as opposed to using dbWriteTable.  If
your data.frame contains 67 columns, using dbWriteTable saves quite a bit of
typing....

Sean
#
That was just an example -- it's not difficult to write an R function
to generate the mysql create table syntax for a data frame with 60 or
600 columns. (BTW, I would never type 67 columns.)
On 12/12/05, Sean Davis <sdavis2 at mail.nih.gov> wrote:
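One possible sketch of such a generator; the class-to-type mapping here is a deliberate simplification (every non-numeric column becomes TEXT), not a complete one:

```r
## Emit a CREATE TABLE statement for an arbitrary data frame.
create_table_sql <- function(df, table) {
  type_of <- function(col) {
    if (is.integer(col)) "int"
    else if (is.numeric(col)) "double"
    else "text"                      # simplification: everything else is text
  }
  cols <- paste(names(df), vapply(df, type_of, character(1)),
                collapse = ", ")
  paste0("create table ", table, " (", cols, ");")
}

create_table_sql(data.frame(x = 1L, y = 2.5, z = "a"), "mytable")
## "create table mytable (x int, y double, z text);"
```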
#
On Mon, 12 Dec 2005, Sean Davis wrote:

The RODBC equivalent creates the table for you, then fast imports the 
file.  Might be worthwhile contribution to RMySQL for someone.

Just be careful with client-server systems to have the file in the right 
place (if indeed you are allowed to have files on the server).
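For reference, the RODBC route might look like the sketch below; the DSN name, credentials, table name, and the data frame `mydf` are all placeholders for your own setup:

```r
## Sketch only: "mydsn", the credentials, "mytable", and mydf are assumptions.
library(RODBC)
ch <- odbcConnect("mydsn", uid = "user", pwd = "pswd")
sqlSave(ch, mydf, tablename = "mytable",
        rownames = FALSE, fast = TRUE)  # fast = TRUE avoids row-at-a-time inserts
odbcClose(ch)
```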
#
Prof Brian Ripley wrote:
That's what RMySQL's dbWriteTable() does.  The original posting
mentioned speed problems with data.frame and dbWriteTable, which seems
plausible (though I haven't quantified it myself), since dbWriteTable
writes the data.frame out to an intermediate file via write.table and
then uses LOAD DATA to fast-load that intermediate file.
--
David
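The corresponding RMySQL call is a one-liner once a connection exists; the connection details and the data frame `mydf` below are placeholders:

```r
## Sketch: user, password, dbname, and the data frame mydf are placeholders.
library(RMySQL)
con <- dbConnect(MySQL(), user = "user", password = "pswd", dbname = "db")
dbWriteTable(con, "mytable", mydf, overwrite = TRUE, row.names = FALSE)
dbDisconnect(con)
```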
#
On Dec 12, 2005, at 6:50 PM, David James wrote:

Thanks to all!

As write.table() and read.table() are themselves somewhat slow, cat()
and scan() can be faster for matrices that are purely numeric.
However, that's a special case.