An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110413/2c5853cc/attachment.pl>
Decimals in R/SQL
7 messages · Rachel Licata, jim holtman, Gabor Grothendieck +1 more
You at least have to provide a subset of 'ss09' so we can see what the original data looks like. I have not had any problems with decimals in using sqldf.
x <- as.data.frame(matrix(runif(100)*100, 10))
x$key <- sample(1:3, 10, TRUE)
require(sqldf)
xsum <- sqldf('
+ select key, sum(V1 * V2) / sum(V3) + from x + group by key + ')
xsum
key sum(V1 * V2) / sum(V3) 1 1 19.38166 2 2 17.40503 3 3 71.48818
dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
), row.names = c(NA, 3L), class = "data.frame")
On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL at kff.org> wrote:
Hello,
When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
The only way I can get decimals is by multiplying by 1.00000, etc. ?I
have tried manipulating the options, but it is only effective once I
multiply by 1.0000.
I appreciate any suggestions!
Thanks!
Example:
z <- sqldf ("select ST,
SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from
ss09 group by ST")
z shouldn't be whole numbers.
? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
Thanks Jim. It appears the issue may only be in SQLite. SS09 is a large table and here is a subset of the variables I am working with. SS09 ST AGEP PWGTP 33323 1 30 130 33324 1 10 186 33325 1 2 162 33326 1 80 93 33327 1 29 135 33328 1 66 54 33329 1 62 54 33330 1 21 138 33331 1 29 103 33332 1 7 144 33333 1 5 143 z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST") ST wgtage 1 1 37 2 2 33 3 4 36 4 5 37 5 6 35 z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from ss09 group by ST") ST wgtage 1 1 37.57083 2 2 33.94322 3 4 36.14499 4 5 37.51233 5 6 35.65581 -----Original Message----- From: jim holtman [mailto:jholtman at gmail.com] Sent: Wednesday, April 13, 2011 12:16 PM To: Rachel Licata Cc: r-help at r-project.org Subject: Re: [R] Decimals in R/SQL You at least have to provide a subset of 'ss09' so we can see what the original data looks like. I have not had any problems with decimals in using sqldf.
x <- as.data.frame(matrix(runif(100)*100, 10))
x$key <- sample(1:3, 10, TRUE)
require(sqldf)
xsum <- sqldf('
+ select key, sum(V1 * V2) / sum(V3) + from x + group by key + ')
xsum
key sum(V1 * V2) / sum(V3) 1 1 19.38166 2 2 17.40503 3 3 71.48818
dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
), row.names = c(NA, 3L), class = "data.frame")
On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL at kff.org> wrote:
Hello,
When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
The only way I can get decimals is by multiplying by 1.00000, etc. ?I
have tried manipulating the options, but it is only effective once I
multiply by 1.0000.
I appreciate any suggestions!
Thanks!
Example:
z <- sqldf ("select ST,
SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from
ss09 group by ST")
z shouldn't be whole numbers.
? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
The problem is that you data is 'integer' and I assume that the database is keeping everything integer. You can do what you are doing, or convert to 'numeric':
x <- read.table(textConnection(" ST AGEP PWGTP
+ 33323 1 30 130 + 33324 1 10 186 + 33325 1 2 162 + 33326 1 80 93 + 33327 1 29 135 + 33328 1 66 54 + 33329 1 62 54 + 33330 1 21 138 + 33331 1 29 103 + 33332 1 7 144 + 33333 1 5 143"), header = TRUE)
closeAllConnections() str(x)
'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : int 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ...
require(sqldf)
xsum <- sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ')
xsum
ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23
# change to numeric instead of integer x$AGEP <- as.numeric(x$AGEP) str(x)
'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : num 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ...
xsum <- sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ')
xsum
ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23.81446
On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata <RachelL at kff.org> wrote:
Thanks Jim. ?It appears the issue may only be in SQLite. ?SS09 is a large table and here is a subset of the variables I am working with. SS09 ? ? ? ? ? ?ST ? ? ? ? ? ? ? ? ?AGEP ? ? ? ? ? ? ? ? ? ?PWGTP 33323 ? ?1 ? ? ? ? ? ? ? ?30 ? ? ? ? ? ? ? ? 130 33324 ? ?1 ? ? ? ? ? ? ? ?10 ? ? ? ? ? ? ? ? 186 33325 ? ?1 ? ? ? ? ? ? ? ? 2 ? ? ? ? ? ? ? ? 162 33326 ? ?1 ? ? ? ? ? ? ? ?80 ? ? ? ? ? ? ? ? ?93 33327 ? ?1 ? ? ? ? ? ? ? ?29 ? ? ? ? ? ? ? ? 135 33328 ? ?1 ? ? ? ? ? ? ? ?66 ? ? ? ? ? ? ? ? ?54 33329 ? ?1 ? ? ? ? ? ? ? ?62 ? ? ? ? ? ? ? ? ?54 33330 ? ?1 ? ? ? ? ? ? ? ?21 ? ? ? ? ? ? ? ? 138 33331 ? ?1 ? ? ? ? ? ? ? ?29 ? ? ? ? ? ? ? ? 103 33332 ? ?1 ? ? ? ? ? ? ? ? 7 ? ? ? ? ? ? ? ? 144 33333 ? ?1 ? ? ? ? ? ? ? ? 5 ? ? ? ? ? ? ? ? 143 z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST") ST wgtage 1 ? 1 ? ? 37 2 ? 2 ? ? 33 3 ? 4 ? ? 36 4 ? 5 ? ? 37 5 ? 6 ? ? 35 z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from ss09 group by ST") ST ? wgtage 1 ? 1 37.57083 2 ? 2 33.94322 3 ? 4 36.14499 4 ? 5 37.51233 5 ? 6 35.65581 -----Original Message----- From: jim holtman [mailto:jholtman at gmail.com] Sent: Wednesday, April 13, 2011 12:16 PM To: Rachel Licata Cc: r-help at r-project.org Subject: Re: [R] Decimals in R/SQL You at least have to provide a subset of 'ss09' so we can see what the original data looks like. ?I have not had any problems with decimals in using sqldf.
x <- as.data.frame(matrix(runif(100)*100, 10))
x$key <- sample(1:3, 10, TRUE)
require(sqldf)
xsum <- sqldf('
+ ? ? select key, sum(V1 * V2) / sum(V3) + ? ? ? ? from x + ? ? ? ? group by key + ')
xsum
?key sum(V1 * V2) / sum(V3) 1 ? 1 ? ? ? ? ? ? ? 19.38166 2 ? 2 ? ? ? ? ? ? ? 17.40503 3 ? 3 ? ? ? ? ? ? ? 71.48818
dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
), row.names = c(NA, 3L), class = "data.frame")
On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL at kff.org> wrote:
Hello,
When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
The only way I can get decimals is by multiplying by 1.00000, etc. ?I
have tried manipulating the options, but it is only effective once I
multiply by 1.0000.
I appreciate any suggestions!
Thanks!
Example:
z <- sqldf ("select ST,
SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from
ss09 group by ST")
z shouldn't be whole numbers.
? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
Thanks again Jim - that is really helpful and I apologize that I am new to R. How can I convert to numeric in SQL and when I am working on a table in a database? The file is huge so that is why I am using SQL and the database to work through it. Thanks! -----Original Message----- From: jim holtman [mailto:jholtman at gmail.com] Sent: Wednesday, April 13, 2011 12:52 PM To: Rachel Licata Cc: r-help at r-project.org Subject: Re: [R] Decimals in R/SQL The problem is that you data is 'integer' and I assume that the database is keeping everything integer. You can do what you are doing, or convert to 'numeric':
x <- read.table(textConnection(" ST AGEP PWGTP
+ 33323 1 30 130 + 33324 1 10 186 + 33325 1 2 162 + 33326 1 80 93 + 33327 1 29 135 + 33328 1 66 54 + 33329 1 62 54 + 33330 1 21 138 + 33331 1 29 103 + 33332 1 7 144 + 33333 1 5 143"), header = TRUE)
closeAllConnections() str(x)
'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : int 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ...
require(sqldf)
xsum <- sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ')
xsum
ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23
# change to numeric instead of integer x$AGEP <- as.numeric(x$AGEP) str(x)
'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : num 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ...
xsum <- sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ')
xsum
ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23.81446
On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata <RachelL at kff.org> wrote:
Thanks Jim. ?It appears the issue may only be in SQLite. ?SS09 is a large table and here is a subset of the variables I am working with. SS09 ? ? ? ? ? ?ST ? ? ? ? ? ? ? ? ?AGEP ? ? ? ? ? ? ? ? ? ?PWGTP 33323 ? ?1 ? ? ? ? ? ? ? ?30 ? ? ? ? ? ? ? ? 130 33324 ? ?1 ? ? ? ? ? ? ? ?10 ? ? ? ? ? ? ? ? 186 33325 ? ?1 ? ? ? ? ? ? ? ? 2 ? ? ? ? ? ? ? ? 162 33326 ? ?1 ? ? ? ? ? ? ? ?80 ? ? ? ? ? ? ? ? ?93 33327 ? ?1 ? ? ? ? ? ? ? ?29 ? ? ? ? ? ? ? ? 135 33328 ? ?1 ? ? ? ? ? ? ? ?66 ? ? ? ? ? ? ? ? ?54 33329 ? ?1 ? ? ? ? ? ? ? ?62 ? ? ? ? ? ? ? ? ?54 33330 ? ?1 ? ? ? ? ? ? ? ?21 ? ? ? ? ? ? ? ? 138 33331 ? ?1 ? ? ? ? ? ? ? ?29 ? ? ? ? ? ? ? ? 103 33332 ? ?1 ? ? ? ? ? ? ? ? 7 ? ? ? ? ? ? ? ? 144 33333 ? ?1 ? ? ? ? ? ? ? ? 5 ? ? ? ? ? ? ? ? 143 z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST") ST wgtage 1 ? 1 ? ? 37 2 ? 2 ? ? 33 3 ? 4 ? ? 36 4 ? 5 ? ? 37 5 ? 6 ? ? 35 z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from ss09 group by ST") ST ? wgtage 1 ? 1 37.57083 2 ? 2 33.94322 3 ? 4 36.14499 4 ? 5 37.51233 5 ? 6 35.65581 -----Original Message----- From: jim holtman [mailto:jholtman at gmail.com] Sent: Wednesday, April 13, 2011 12:16 PM To: Rachel Licata Cc: r-help at r-project.org Subject: Re: [R] Decimals in R/SQL You at least have to provide a subset of 'ss09' so we can see what the original data looks like. ?I have not had any problems with decimals in using sqldf.
x <- as.data.frame(matrix(runif(100)*100, 10))
x$key <- sample(1:3, 10, TRUE)
require(sqldf)
xsum <- sqldf('
+ ? ? select key, sum(V1 * V2) / sum(V3) + ? ? ? ? from x + ? ? ? ? group by key + ')
xsum
?key sum(V1 * V2) / sum(V3) 1 ? 1 ? ? ? ? ? ? ? 19.38166 2 ? 2 ? ? ? ? ? ? ? 17.40503 3 ? 3 ? ? ? ? ? ? ? 71.48818
dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
), row.names = c(NA, 3L), class = "data.frame")
On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL at kff.org> wrote:
Hello,
When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
The only way I can get decimals is by multiplying by 1.00000, etc. ?I
have tried manipulating the options, but it is only effective once I
multiply by 1.0000.
I appreciate any suggestions!
Thanks!
Example:
z <- sqldf ("select ST,
SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from
ss09 group by ST")
z shouldn't be whole numbers.
? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
On Wed, Apr 13, 2011 at 4:34 PM, Rachel Licata <RachelL at kff.org> wrote:
Thanks again Jim - that is really helpful and I apologize that I am new to R. ?How can I convert to numeric in SQL and when I am working on a table in a database? ?The file is huge so that is why I am using SQL and the database to work through it. Thanks!
There are some examples here (a few of them, as noted, require the development version of sqldf): http://code.google.com/p/sqldf/#15._Why_do_certain_calculations_come_out_as_integer_rather_than
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
On Wed, Apr 13, 2011 at 1:34 PM, Rachel Licata <RachelL at kff.org> wrote:
Thanks again Jim - that is really helpful and I apologize that I am new to R. ?How can I convert to numeric in SQL and when I am working on a table in a database? ?The file is huge so that is why I am using SQL and the database to work through it.
I believe that RSQLite will do the right thing if you provide the
correct types in your schema. So for a new database, you want to make
sure that the columns that you want to be numeric are created like:
CREATE table sometable (my_data REAL);
You should be able to create a new table from an existing table using
SQL where the new table has the types you want.
+ seth
Seth Falcon | @sfalcon | http://userprimary.net/