
data type mapping for RMySQL

4 messages · Tim Coote, Paul Gilbert, jim holtman +1 more

#
Hullo

I'm not sure whether this is a bug, but I would expect some sort of warning where mappings may lose precision.

I've got a mysql table with large numbers in it:

CREATE TABLE if not exists `tester` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `address` bigint(20) NOT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `idx_address` (`address`)
    ) ENGINE=InnoDB AUTO_INCREMENT=78628 DEFAULT CHARSET=latin1 ;

The key aspect of this is that the 'address' column needs to be used to distinguish between values. If I put one value into this table, and read it back with the mysql client:

mysql> \. test.sql
Query OK, 1 row affected, 1 warning (0.00 sec)

Database changed
Query OK, 0 rows affected (0.05 sec)

Query OK, 1 row affected (0.01 sec)

+---------------------+
| address             |
+---------------------+
| 2029716610205351937 |
+---------------------+
1 row in set (0.00 sec)

But if I try to read the value using R:
Loading required package: DBI
address
1 2029716610205351936
Note that the address value read back is different.

The issue is that the precision of the SQL type (20 decimal digits) is greater than the precision of the R type it's been mapped to. I don't know whether this counts as a bug, but it would have helped to have a warning about the possible precision mismatch, rather than having to spot the emerging bug ;-). That aside, is there a simple way to map the bigint type to something else? For this situation, character would do. I know that I can cast the type in the SQL query, but I'd prefer to set the mapping at a more global scope, ideally in the data definition.
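For what it's worth, the query-time cast does preserve the digits. Here is a minimal sketch of the idea using Python's built-in sqlite3 module (Python stands in for the R client here only because the behaviour is language-independent; the table and value mirror the example above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tester (id INTEGER NOT NULL, address INTEGER NOT NULL)")
con.execute("INSERT INTO tester VALUES (1, 2029716610205351937)")

# Casting the bigint to text at query time hands the client a string,
# so no digits are lost to a double-precision coercion on the way out.
row = con.execute("SELECT id, CAST(address AS TEXT) FROM tester").fetchone()
print(row)  # (1, '2029716610205351937')
```

The equivalent R workaround would be the same `CAST(address AS CHAR)` in the query sent through `dbSendQuery`, which returns the column as character rather than numeric.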

tia

Tim
#
Just for the record, this also happens with PostgreSQL, so the problem is
more general. (I tried testing in SQLite but my standalone client seems
to be encrypting the db and RSQLite cannot use it.) I'm not sure there
is a solution to this, as I do not immediately see how to represent
bigint in R.

Generic SQL example:

 >mysql test
 >psql  test
 >sqlite test

[ DROP TABLE tester; ]

CREATE TABLE  tester (
       id bigint NOT NULL,
       address bigint NOT NULL
     ) ;

INSERT INTO tester VALUES (1, 2029716610205351937);

SELECT * FROM tester ;

gives
   1 | 2029716610205351937
in mysql, psql, and sqlite

( exit; \q or .quit )
(later DROP TABLE tester;)


 >R

require("RMySQL")
hubdb <- dbConnect("MySQL", dbname="test" )
print(fetch(dbSendQuery(hubdb, "select * from tester;"), n=-1), digits=20)

RMySQL gives
   id             address
1  1 2029716610205351936


require("RPostgreSQL")
hubdb <- dbConnect("PostgreSQL", dbname="test" )
print(fetch(dbSendQuery(hubdb, "select * from tester;"), n=-1), digits=20)
RPostgreSQL gives
   id             address
1  1 2029716610205351936


Paul
On 13-12-20 08:07 AM, Tim Coote wrote:
#
There is no direct mapping for a 'bigint' that has 20 digits.  The
floating point in R (and in any other language used on your computer
with IEEE floating point) is limited to about 15 digits.  In the cases
where I have had to process this type of numeric data, I have had it
read in as a character value.  This means that I cannot do arithmetic
on it, but at least I can compare it to other values correctly.  This
is a variation of FAQ 7.31.
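The loss Paul measured can be reproduced without any database at all; it is purely a property of the 64-bit IEEE 754 double that R uses for numeric values. A short Python illustration (the arithmetic is identical in R or any IEEE-754 language):

```python
# A 64-bit double has a 53-bit significand, i.e. roughly 15-16 significant
# decimal digits. The 20-digit bigint from this thread sits between 2**60
# and 2**61, where consecutive doubles are 256 apart, so the exact value
# rounds to the nearest multiple of 256.
exact = 2029716610205351937
as_double = float(exact)        # what an R numeric would hold
print(int(as_double))           # 2029716610205351936
print(int(as_double) == exact)  # False
```

This is why reading the column in as character preserves equality comparisons: the string round-trips exactly even though the double cannot.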

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Fri, Dec 20, 2013 at 10:28 AM, Paul Gilbert <pgilbert902 at gmail.com> wrote:
#
I don't think there's any way to preserve precision here, but the R
client should at least give a warning that this is happening.

Hadley
On Fri, Dec 20, 2013 at 7:38 AM, jim holtman <jholtman at gmail.com> wrote: