I am getting segfaults when I try to read a large binary object from a SQL
Server database via RODBC. I am using the FreeTDS ODBC driver, which has
otherwise been working fine for reading from this same database. I have
included the relevant parts of the session below.

Each row of the v_MAFiles view holds a text, HTML, or PDF version of a
document. The FileType entry is one of TXT, PDF, or HTML. If it is PDF,
the FileString column is empty and FileBLOB holds the (binary) contents
of the PDF. If FileType is TXT or HTML, then FileBLOB is empty and the
file contents are in FileString.
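
To make the layout concrete, here is a minimal R sketch of the rule just described. The helper name content_column is hypothetical and exists only for illustration; it is not part of RODBC.

```r
# Hypothetical helper: given a FileType value from v_MAFiles, return the
# name of the column that holds the document contents for that row.
content_column <- function(file_type) {
  switch(toupper(file_type),
         PDF  = "FileBLOB",    # binary contents, FileString is empty
         TXT  = "FileString",  # text contents, FileBLOB is empty
         HTML = "FileString",
         stop("unexpected FileType: ", file_type))
}
```

So content_column("PDF") names the column whose read segfaults, and content_column("TXT") names the one that reads fine.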

Note that the reported COLUMN_SIZE for both FileBLOB and FileString is
2 GB. However, the strings and blobs actually stored there are only about
60 KB.

I can read the FileString column with no problem. But trying to read a
FileBLOB entry segfaults, apparently while R is calling malloc(). I suspect
that R is trying, and failing, to allocate 2 GB of memory to hold a value
of the reported COLUMN_SIZE. Yet it has no trouble loading a FileString
entry of about the same 60 KB size. Perhaps the RODBC code somehow
discovers that the string in FileString is not really 2 GB, but does not
find that out for a FileBLOB?
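
One way to check the sizes without fetching the blobs themselves might be to ask the server for the stored length. The following is only a sketch: blob_size_query is a hypothetical helper name, the channel name ch is assumed, and DATALENGTH() is the SQL Server function that returns the number of bytes in a value. Because the query returns an integer rather than the FileBLOB column, it should not need a blob-sized buffer, although I have not confirmed that with FreeTDS.

```r
# Hypothetical helper: build a query that reports the stored size in bytes
# of each PDF blob, instead of selecting the FileBLOB column itself.
blob_size_query <- function(view = "v_MAFiles") {
  sprintf(paste("SELECT FileType, DATALENGTH(FileBLOB) AS BlobBytes",
                "FROM %s WHERE FileType = 'PDF'"),
          view)
}

# Usage, assuming an open RODBC channel named ch (hypothetical):
# sizes <- RODBC::sqlQuery(ch, blob_size_query())
# summary(sizes$BlobBytes)
```

If BlobBytes comes back at around 60 KB per row, that would support the idea that the failure comes from the reported COLUMN_SIZE rather than from the data.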
Any help with this would be much appreciated.

Sys.info()
sysname                release                 version
"Linux"                "2.6.18-308.24.1.el5"   "#1 SMP Wed Nov 21 11:42:14 EST 2012"
nodename               machine                 login
"mralx1.rsma.frb.gov"  "x86_64"                "unknown"
user                   effective_user
"m1jjh00"              "m1jjh00"

sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US        LC_NUMERIC=C          LC_TIME=en_US         LC_COLLATE=en_US      LC_MONETARY=en_US
 [6] LC_MESSAGES=en_US     LC_PAPER=C            LC_NAME=C             LC_ADDRESS=C          LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US  LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] mra_1.0             frb_3.9             fame_2.19           tis_1.23            RODBC_1.3-6
[6] RObjectTables_0.3-1

loaded via a namespace (and not attached):
[1] data.table_1.8.2 XML_3.9-4