Skip to content

Decimals in R/SQL

7 messages · Rachel Licata, jim holtman, Gabor Grothendieck +1 more

#
You at least have to provide a subset of 'ss09' so we can see what the
original data looks like.  I have not had any problems with decimals
in using sqldf.
+     select key, sum(V1 * V2) / sum(V3)
+         from x
+         group by key
+ ')
key sum(V1 * V2) / sum(V3)
1   1               19.38166
2   2               17.40503
3   3               71.48818
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
), row.names = c(NA, 3L), class = "data.frame")

        
On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL at kff.org> wrote:

  
    
#
Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large table and here is a subset of the variables I am working with.  


SS09
	    ST			AGEP			PWGTP
33323    1                30                 130
33324    1                10                 186
33325    1                 2                 162
33326    1                80                  93
33327    1                29                 135
33328    1                66                  54
33329    1                62                  54
33330    1                21                 138
33331    1                29                 103
33332    1                 7                 144
33333    1                 5                 143

z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST")

ST wgtage
1   1     37
2   2     33
3   4     36
4   5     37
5   6     35

z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from ss09 group by ST")

ST   wgtage
1   1 37.57083
2   2 33.94322
3   4 36.14499
4   5 37.51233
5   6 35.65581

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com] 
Sent: Wednesday, April 13, 2011 12:16 PM
To: Rachel Licata
Cc: r-help at r-project.org
Subject: Re: [R] Decimals in R/SQL

You at least have to provide a subset of 'ss09' so we can see what the
original data looks like.  I have not had any problems with decimals
in using sqldf.
+     select key, sum(V1 * V2) / sum(V3)
+         from x
+         group by key
+ ')
key sum(V1 * V2) / sum(V3)
1   1               19.38166
2   2               17.40503
3   3               71.48818
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
), row.names = c(NA, 3L), class = "data.frame")

        
On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL at kff.org> wrote:

  
    
#
The problem is that you data is 'integer' and I assume that the
database is keeping everything integer.  You can do what you are
doing, or convert to 'numeric':
+ 33323    1                30                 130
+ 33324    1                10                 186
+ 33325    1                 2                 162
+ 33326    1                80                  93
+ 33327    1                29                 135
+ 33328    1                66                  54
+ 33329    1                62                  54
+ 33330    1                21                 138
+ 33331    1                29                 103
+ 33332    1                 7                 144
+ 33333    1                 5                 143"), header = TRUE)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
+     select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+         from x
+         group by ST
+ ')
ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1                             23
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
+     select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+         from x
+         group by ST
+ ')
ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1                       23.81446

        
On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata <RachelL at kff.org> wrote:

  
    
#
Thanks again Jim - that is really helpful and I apologize that I am new to R.  How can I convert to numeric in SQL and when I am working on a table in a database?  The file is huge so that is why I am using SQL and the database to work through it.  
Thanks!

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com] 
Sent: Wednesday, April 13, 2011 12:52 PM
To: Rachel Licata
Cc: r-help at r-project.org
Subject: Re: [R] Decimals in R/SQL

The problem is that you data is 'integer' and I assume that the
database is keeping everything integer.  You can do what you are
doing, or convert to 'numeric':
+ 33323    1                30                 130
+ 33324    1                10                 186
+ 33325    1                 2                 162
+ 33326    1                80                  93
+ 33327    1                29                 135
+ 33328    1                66                  54
+ 33329    1                62                  54
+ 33330    1                21                 138
+ 33331    1                29                 103
+ 33332    1                 7                 144
+ 33333    1                 5                 143"), header = TRUE)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
+     select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+         from x
+         group by ST
+ ')
ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1                             23
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
+     select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+         from x
+         group by ST
+ ')
ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1                       23.81446

        
On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata <RachelL at kff.org> wrote:

  
    
#
On Wed, Apr 13, 2011 at 4:34 PM, Rachel Licata <RachelL at kff.org> wrote:
There are some examples here (a few of them, as noted, require the
development version of sqldf):

http://code.google.com/p/sqldf/#15._Why_do_certain_calculations_come_out_as_integer_rather_than
#
On Wed, Apr 13, 2011 at 1:34 PM, Rachel Licata <RachelL at kff.org> wrote:

            
I believe that RSQLite will do the right thing if you provide the
correct types in your schema.  So for a new database, you want to make
sure that the columns that you want to be numeric are created like:

    CREATE table sometable (my_data REAL);

You should be able to create a new table from an existing table using
SQL where the new table has the types you want.

+ seth