Dear listeners,
I try to use a MySQL database (with RODBC) for storing files. Above a
file size of 32 KB I seem to hit a wall and the code below that serves
me well otherwise fails to store and retrieve the whole file - only
parts come back (although it seems that the whole file was stored in the
database).
Is this normal? Does somebody have an idea if this is database related
or due to R or RODBC or if my code just is nuts?
Any suggestions for where to look for for a solution are more than welcome.
Best, Peter
System:
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Platform: x86_64-w64-mingw32/x64 (64-bit)
DBMS Version: "5.6.13"
Driver_ODBC_Ver: "03.51"
Driver Name: "myodbc5w.dll"
Driver Version: "05.02.0005"
SQL ot create the table:
CREATE TABLE files (id INTEGER AUTO_INCREMENT, file longtext, PRIMARY
KEY (id));
CODE:
# define file for upload
fileIn <- "Hallo1.docx"
# define name of file to store data in
fileOut <- "Hallo1Copy.docx"
# read in file
dataIn <- readBin(fileIn, what="raw", n=file.info(fileIn)$size)
# collapse string
dataInString <- paste(dataIn,collapse="")
# loading package
require(RODBC)
# establishing connection
con <- odbcConnect("MyDatabase",rows_at_time = 1, believeNRows = FALSE)
# uploading data to column file in table files
dataDBString <- dataInString
sql <- paste0("INSERT INTO files (file) VALUES ('",dataDBString,"') ;")
sqlQuery(con, sql)
# downloading data again
res <- sqlQuery(con,
"SELECT file FROM files WHERE id = (SELECT max(id) FROM files) ;",
stringsAsFactors=F, rows_at_time=1,
as.is=T)
dataOutString <- res$file
# closing connection
odbcClose(con)
# splitting data
dataOut <- substring( dataOutString,
seq(1, nchar(dataOutString), 2),
seq(2, nchar(dataOutString), 2))
# checking how many bytes of dataIn and dataOut are identical
# --> there seems to be a magical wall above 32813 bytes (665626 characters)
print(table(dataIn==dataOut))
# write to file and check if file works
writeBin(as.raw(as.hexmode(dataOut)) , fileOut)
browseURL(fileOut)
RODBC / MySQL magical limit of 32 Kbyte
3 messages · Peter Meißner, Edward Vanden Berghe
Hi Peter,
I've had similar problems, with information that got truncated through an RODBC connection - on Windows 64 bits, with RODBC on a PostgreSQL database, with large PostGIS entries. Unfortunately I can't replicate the problem now as I moved on to Ubuntu now. For me the solution was to go to DBI/RPostgreSQL.
Did you try DBI? Even if you want to stick with RODBC in the future, it might assist in isolating the problem.
Cheers,
Edward
-----Original Message-----
From: r-sig-db-bounces at r-project.org [mailto:r-sig-db-bounces at r-project.org] On Behalf Of Peter Mei?ner
Sent: 26 November 2013 18:02
To: r-sig-db at r-project.org
Subject: [R-sig-DB] RODBC / MySQL magical limit of 32 Kbyte
Dear listeners,
I try to use a MySQL database (with RODBC) for storing files. Above a file size of 32 KB I seem to hit a wall and the code below that serves me well otherwise fails to store and retrieve the whole file - only parts come back (although it seems that the whole file was stored in the database).
Is this normal? Does somebody have an idea if this is database related or due to R or RODBC or if my code just is nuts?
Any suggestions for where to look for for a solution are more than welcome.
Best, Peter
System:
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Platform: x86_64-w64-mingw32/x64 (64-bit)
DBMS Version: "5.6.13"
Driver_ODBC_Ver: "03.51"
Driver Name: "myodbc5w.dll"
Driver Version: "05.02.0005"
SQL ot create the table:
CREATE TABLE files (id INTEGER AUTO_INCREMENT, file longtext, PRIMARY KEY (id));
CODE:
# define file for upload
fileIn <- "Hallo1.docx"
# define name of file to store data in
fileOut <- "Hallo1Copy.docx"
# read in file
dataIn <- readBin(fileIn, what="raw", n=file.info(fileIn)$size)
# collapse string
dataInString <- paste(dataIn,collapse="")
# loading package
require(RODBC)
# establishing connection
con <- odbcConnect("MyDatabase",rows_at_time = 1, believeNRows = FALSE)
# uploading data to column file in table files
dataDBString <- dataInString
sql <- paste0("INSERT INTO files (file) VALUES ('",dataDBString,"') ;")
sqlQuery(con, sql)
# downloading data again
res <- sqlQuery(con,
"SELECT file FROM files WHERE id = (SELECT max(id) FROM files) ;",
stringsAsFactors=F, rows_at_time=1,
as.is=T)
dataOutString <- res$file
# closing connection
odbcClose(con)
# splitting data
dataOut <- substring( dataOutString,
seq(1, nchar(dataOutString), 2),
seq(2, nchar(dataOutString), 2))
# checking how many bytes of dataIn and dataOut are identical
# --> there seems to be a magical wall above 32813 bytes (665626 characters)
print(table(dataIn==dataOut))
# write to file and check if file works
writeBin(as.raw(as.hexmode(dataOut)) , fileOut)
browseURL(fileOut)
_______________________________________________
R-sig-DB mailing list -- R Special Interest Group
R-sig-DB at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-db
Thanks for the reply, I think I found the problem: There seems to be a "sanity check" when downloading information: if (datalen > 65535) datalen = 65535 If I understand this right, it just truncates each and every download of information above 65535. So the problem was(is) package related and might be dealt with by recompiling the source: download source; unzip it; RStudio->create project->from existing; open src/RODBC.c; change "if (datalen > 65535) datalen = 65535" to "/* if (datalen > 65535) datalen = 65535 */"); build binary and than install this binary) so far it works without problems. Best, Peter PS.: Now that I have learnt to compile packages I might also try to get RMySQL running. Am 27.11.2013 10:45, schrieb Edward Vanden Berghe:
Hi Peter,
I've had similar problems, with information that got truncated through an RODBC connection - on Windows 64 bits, with RODBC on a PostgreSQL database, with large PostGIS entries. Unfortunately I can't replicate the problem now as I moved on to Ubuntu now. For me the solution was to go to DBI/RPostgreSQL.
Did you try DBI? Even if you want to stick with RODBC in the future, it might assist in isolating the problem.
Cheers,
Edward
-----Original Message-----
From: r-sig-db-bounces at r-project.org [mailto:r-sig-db-bounces at r-project.org] On Behalf Of Peter Mei?ner
Sent: 26 November 2013 18:02
To: r-sig-db at r-project.org
Subject: [R-sig-DB] RODBC / MySQL magical limit of 32 Kbyte
Dear listeners,
I try to use a MySQL database (with RODBC) for storing files. Above a file size of 32 KB I seem to hit a wall and the code below that serves me well otherwise fails to store and retrieve the whole file - only parts come back (although it seems that the whole file was stored in the database).
Is this normal? Does somebody have an idea if this is database related or due to R or RODBC or if my code just is nuts?
Any suggestions for where to look for for a solution are more than welcome.
Best, Peter
System:
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Platform: x86_64-w64-mingw32/x64 (64-bit)
DBMS Version: "5.6.13"
Driver_ODBC_Ver: "03.51"
Driver Name: "myodbc5w.dll"
Driver Version: "05.02.0005"
SQL ot create the table:
CREATE TABLE files (id INTEGER AUTO_INCREMENT, file longtext, PRIMARY KEY (id));
CODE:
# define file for upload
fileIn <- "Hallo1.docx"
# define name of file to store data in
fileOut <- "Hallo1Copy.docx"
# read in file
dataIn <- readBin(fileIn, what="raw", n=file.info(fileIn)$size)
# collapse string
dataInString <- paste(dataIn,collapse="")
# loading package
require(RODBC)
# establishing connection
con <- odbcConnect("MyDatabase",rows_at_time = 1, believeNRows = FALSE)
# uploading data to column file in table files
dataDBString <- dataInString
sql <- paste0("INSERT INTO files (file) VALUES ('",dataDBString,"') ;")
sqlQuery(con, sql)
# downloading data again
res <- sqlQuery(con,
"SELECT file FROM files WHERE id = (SELECT max(id) FROM files) ;",
stringsAsFactors=F, rows_at_time=1,
as.is=T)
dataOutString <- res$file
# closing connection
odbcClose(con)
# splitting data
dataOut <- substring( dataOutString,
seq(1, nchar(dataOutString), 2),
seq(2, nchar(dataOutString), 2))
# checking how many bytes of dataIn and dataOut are identical
# --> there seems to be a magical wall above 32813 bytes (665626 characters)
print(table(dataIn==dataOut))
# write to file and check if file works
writeBin(as.raw(as.hexmode(dataOut)) , fileOut)
browseURL(fileOut)
_______________________________________________ R-sig-DB mailing list -- R Special Interest Group R-sig-DB at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-db
Peter Mei?ner Workgroup 'Comparative Parliamentary Politics' Department of Politics and Administration University of Konstanz Box 216 78457 Konstanz Germany +49 7531 88 5665 http://www.polver.uni-konstanz.de/sieberer/home/