Skip to content

RODBC / MySQL magical limit of 32 Kbyte

3 messages · Peter Meißner, Edward Vanden Berghe

#
Dear listeners,

I try to use a MySQL database (with RODBC) for storing files. Above a 
file size of 32 KB I seem to hit a wall and the code below that serves 
me well otherwise fails to store and retrieve the whole file - only 
parts come back (although it seems that the whole file was stored in the 
database).

Is this normal? Does somebody have an idea if this is database related 
or due to R or RODBC or if my code just is nuts?

Any suggestions for where to look for for a solution are more than welcome.

Best, Peter


System:
R version         2.15.3 (2013-03-01) -- "Security Blanket"
Platform:         x86_64-w64-mingw32/x64 (64-bit)
DBMS Version:    "5.6.13"
Driver_ODBC_Ver: "03.51"
Driver Name:     "myodbc5w.dll"
Driver Version:  "05.02.0005"

SQL ot create the table:
CREATE TABLE files (id INTEGER AUTO_INCREMENT, file longtext, PRIMARY 
KEY (id));


CODE:

# define file for upload
     fileIn  <- "Hallo1.docx"

# define name of file to store data in
     fileOut <- "Hallo1Copy.docx"

# read in file
     dataIn <- readBin(fileIn, what="raw", n=file.info(fileIn)$size)

# collapse string
     dataInString <- paste(dataIn,collapse="")

# loading package
     require(RODBC)

# establishing connection
     con <- odbcConnect("MyDatabase",rows_at_time = 1, believeNRows = FALSE)

# uploading data to column file in table files
     dataDBString <- dataInString
     sql <- paste0("INSERT INTO files (file) VALUES ('",dataDBString,"') ;")
     sqlQuery(con, sql)

# downloading data again
     res <- sqlQuery(con,
         "SELECT file FROM files WHERE id = (SELECT max(id) FROM files) ;",
         stringsAsFactors=F, rows_at_time=1,
         as.is=T)
     dataOutString <- res$file

# closing connection
odbcClose(con)

# splitting data
dataOut <- substring(   dataOutString,
                         seq(1, nchar(dataOutString), 2),
                         seq(2, nchar(dataOutString), 2))

# checking how many bytes of dataIn and dataOut are identical
# --> there seems to be a magical wall above 32813 bytes (665626 characters)
print(table(dataIn==dataOut))

# write to file and check if file works
writeBin(as.raw(as.hexmode(dataOut)) , fileOut)
browseURL(fileOut)
#
Hi Peter,

I've had similar problems, with information that got truncated through an RODBC connection - on Windows 64 bits, with RODBC on a PostgreSQL database, with large PostGIS entries. Unfortunately I can't replicate the problem now as I moved on to Ubuntu now. For me the solution was to go to DBI/RPostgreSQL. 

Did you try DBI? Even if you want to stick with RODBC in the future, it might assist in isolating the problem.

Cheers,

Edward

-----Original Message-----
From: r-sig-db-bounces at r-project.org [mailto:r-sig-db-bounces at r-project.org] On Behalf Of Peter Mei?ner
Sent: 26 November 2013 18:02
To: r-sig-db at r-project.org
Subject: [R-sig-DB] RODBC / MySQL magical limit of 32 Kbyte

Dear listeners,

I try to use a MySQL database (with RODBC) for storing files. Above a file size of 32 KB I seem to hit a wall and the code below that serves me well otherwise fails to store and retrieve the whole file - only parts come back (although it seems that the whole file was stored in the database).

Is this normal? Does somebody have an idea if this is database related or due to R or RODBC or if my code just is nuts?

Any suggestions for where to look for for a solution are more than welcome.

Best, Peter


System:
R version         2.15.3 (2013-03-01) -- "Security Blanket"
Platform:         x86_64-w64-mingw32/x64 (64-bit)
DBMS Version:    "5.6.13"
Driver_ODBC_Ver: "03.51"
Driver Name:     "myodbc5w.dll"
Driver Version:  "05.02.0005"

SQL ot create the table:
CREATE TABLE files (id INTEGER AUTO_INCREMENT, file longtext, PRIMARY KEY (id));


CODE:

# define file for upload
     fileIn  <- "Hallo1.docx"

# define name of file to store data in
     fileOut <- "Hallo1Copy.docx"

# read in file
     dataIn <- readBin(fileIn, what="raw", n=file.info(fileIn)$size)

# collapse string
     dataInString <- paste(dataIn,collapse="")

# loading package
     require(RODBC)

# establishing connection
     con <- odbcConnect("MyDatabase",rows_at_time = 1, believeNRows = FALSE)

# uploading data to column file in table files
     dataDBString <- dataInString
     sql <- paste0("INSERT INTO files (file) VALUES ('",dataDBString,"') ;")
     sqlQuery(con, sql)

# downloading data again
     res <- sqlQuery(con,
         "SELECT file FROM files WHERE id = (SELECT max(id) FROM files) ;",
         stringsAsFactors=F, rows_at_time=1,
         as.is=T)
     dataOutString <- res$file

# closing connection
odbcClose(con)

# splitting data
dataOut <- substring(   dataOutString,
                         seq(1, nchar(dataOutString), 2),
                         seq(2, nchar(dataOutString), 2))

# checking how many bytes of dataIn and dataOut are identical
# --> there seems to be a magical wall above 32813 bytes (665626 characters)
print(table(dataIn==dataOut))

# write to file and check if file works
writeBin(as.raw(as.hexmode(dataOut)) , fileOut)
browseURL(fileOut)

_______________________________________________
R-sig-DB mailing list -- R Special Interest Group
R-sig-DB at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-db
#
Thanks for the reply,

I think I found the problem: There seems to be a "sanity check" when 
downloading information:

if (datalen > 65535) datalen = 65535

If I understand this right, it just truncates each and every download of 
information above 65535.

So the problem was(is) package related and might be dealt with by 
recompiling the source:

download source;
unzip it;
RStudio->create project->from existing;
open src/RODBC.c;
change "if (datalen > 65535) datalen = 65535"
to "/* if (datalen > 65535) datalen = 65535 */");
build binary and than install this binary)

so far it works without problems.

Best, Peter


PS.: Now that I have learnt to compile packages I might also try to get 
RMySQL running.


Am 27.11.2013 10:45, schrieb Edward Vanden Berghe: