Hi,Thanks for the replies. My problem is that the tables (at least, a few of them) are too large. I cannot directly export the tables to get text files. I tried the following :
c1<-sqlQuery(ch,"SELECT COUNT(*) FROM table1")
I got the following error (changed the text to alter table names) :[1] "AB000 -2001 [Microsoft][ODBC MicrosoftAccess Driver] ODBC--call failed."??????????????? [2] "[RODBC] ERROR: Could not SQLExecDirect'SELECT COUNT(*) FROM Table1'"
But the following works (with about 5 minutes for each step) :con<-odbcConnect("TestDB")
d1<-sqlFetch(ch,"table1",max=1e6,as.is=TRUE)
d2<-sqlFetchMore(ch,max=1e6,as.is=TRUE)
I could successively retrieve 6 million rows (with 52 columns). After that (with the next sqlFetchMore command), I got the following error :
Error : cannot allocate vector of size 7.6 Mb
When this happened, I had saved all the data frames obtained with the saveRDS (found this to be the easiest method of limiting the size of the file on disk)? command and then deleted the data frames from memory. So, I do not know why the last error turned up.
Though I am not an expert in databases, I am quite sure that the database has not been put together in a sensible way. I am now trying to work with my amateurish methods and trying to see what I can.
I would like to know if there is any sql query by which I can continue in retrieving data from 6e6 row onwards. That is, instead of starting from row 1, can I fetch rows from row number, 6e6+1?How do I know that I am near the tail end? There seems to be no simple method for finding the size of the table.
Thanks,Ravi
ravi <rv15i at yahoo.se> 11/25/14 12:42 PM >>>
Hi,All my data is presently locked in a Microsoft access database. This has huge data in a number of large tables. Using RODBC and connecting to it takes too long a time, sometimes making the system to hang up.
To make things more manageable, I have tried to transfer the data to manageable .RData or .csv files. But I am not able to do this with some of the larger files. I am currently stuck in the one of the preliminary steps. I am not able to find the number of rows in a table. If I know this, I can transfer chunks of the tables to a sqlite database.
I am able to connect to connect to the access data base with :
library(RODBC)
con<-odbcConnect("TestDB")d1<-sqlFetch(ch,"table1",max=1e5,as.is=TRUE)d2<-sqlFetchMore(ch,max=1e5,as.is=TRUE)
d3<-rbind(d1,d2)I wanted to develop this into a loop to get a concatenated data frame, which I wanted to save either as a binary file, or transfer it to a sqlite data base. I would like to have some help on the simplest route to follow. But first, I will proceed in describing my immediate problem. In some of the tables, I find that the sqlFetchMore returns a value of -1L, meaning that the end has been reached. In RODBC, I find no command for getting the row count in a table. I have found this in DBI. But I have not been able to figure out how I should specify the connection to an acccess database (con in the following).
while(!dbHasCompleted(con)){ print(dbGetRowCount(con))}
I would appreciate help on the following points :1. How can I get the row count (and size) of a table in an access data base? With RODBC, DBI or any other way.2. I have found that saving the tables as .RData files reduces the file size and and reduces the reading time. Is there some way of appending to an already saved data frame with this method?3. I have come across alternative ways of saving data - using writeBin and packages like saves, rhdf5 etc. Would they be useful alternatives?
4. Is there an advantage in combining binary files and databases like sqlite? Or, are files already saved in a binary format in databases like sqlite?5. What is the simplest method of porting from the access to the sqlite database? With RSQlite and RODBC, can I have connections to the access and sqlite databases open at the same time? Or, should I close one and then open the other? It would help if I can get a detailed bit if code for doing this in a simple way.
I would appreciate all help that I can get.Thanks,Ravi
????[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Confidentiality Statement: This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
This is not really on topic here. At best the R-sig-db mailing list would be on topic, but it seems you may be dealing with issues related to nonstandard sql so you might need to find some additional help in a more Access-specific forum.
One thing you might try is using a specific column name from your table instead of the "*" symbol in the COUNT function... preferably a unique primary key column.
The reason the primary key is important is because sql doesn't assume the date are ordered the way R does... it is only as ordered as the indexes that are used to retrieve it. That may also be a clue for you to use in figuring out how to divide up the data into smaller chunks.
One thing I can say is that allocating chunks of memory and then freeing them can break up your available memory into pieces and make allocating more memory impossible. Unfortunately, troubleshooting such problems is highly dependent on your hardware and software configuration as well as the exact steps you are following. Since you have not attempted to create a reproducible example yet, you might want to do that before posting at R-sig-db to improve your chances of getting help.
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On November 27, 2014 9:00:03 AM PST, ravi <rv15i at yahoo.se> wrote:
Hi,Thanks for the replies. My problem is that the tables (at least, a
few of them) are too large. I cannot directly export the tables to get
text files. I tried the following :
c1<-sqlQuery(ch,"SELECT COUNT(*) FROM table1")
I got the following error (changed the text to alter table names) :[1]
"AB000 -2001 [Microsoft][ODBC MicrosoftAccess Driver] ODBC--call
failed."??????????????? [2] "[RODBC] ERROR: Could not
SQLExecDirect'SELECT COUNT(*) FROM Table1'"
But the following works (with about 5 minutes for each step)
:con<-odbcConnect("TestDB")
d1<-sqlFetch(ch,"table1",max=1e6,as.is=TRUE)
d2<-sqlFetchMore(ch,max=1e6,as.is=TRUE)
I could successively retrieve 6 million rows (with 52 columns). After
that (with the next sqlFetchMore command), I got the following error :
Error : cannot allocate vector of size 7.6 Mb
When this happened, I had saved all the data frames obtained with the
saveRDS (found this to be the easiest method of limiting the size of
the file on disk)? command and then deleted the data frames from
memory. So, I do not know why the last error turned up.
Though I am not an expert in databases, I am quite sure that the
database has not been put together in a sensible way. I am now trying
to work with my amateurish methods and trying to see what I can.
I would like to know if there is any sql query by which I can continue
in retrieving data from 6e6 row onwards. That is, instead of starting
from row 1, can I fetch rows from row number, 6e6+1?How do I know that
I am near the tail end? There seems to be no simple method for finding
the size of the table.
Thanks,Ravi
ravi <rv15i at yahoo.se> 11/25/14 12:42 PM >>>
Hi,All my data is presently locked in a Microsoft access database. This
has huge data in a number of large tables. Using RODBC and connecting
to it takes too long a time, sometimes making the system to hang up.
To make things more manageable, I have tried to transfer the data to
manageable .RData or .csv files. But I am not able to do this with some
of the larger files. I am currently stuck in the one of the preliminary
steps. I am not able to find the number of rows in a table. If I know
this, I can transfer chunks of the tables to a sqlite database.
I am able to connect to connect to the access data base with :
library(RODBC)
con<-odbcConnect("TestDB")d1<-sqlFetch(ch,"table1",max=1e5,as.is=TRUE)d2<-sqlFetchMore(ch,max=1e5,as.is=TRUE)
d3<-rbind(d1,d2)I wanted to develop this into a loop to get a
concatenated data frame, which I wanted to save either as a binary
file, or transfer it to a sqlite data base. I would like to have some
help on the simplest route to follow. But first, I will proceed in
describing my immediate problem. In some of the tables, I find that the
sqlFetchMore returns a value of -1L, meaning that the end has been
reached. In RODBC, I find no command for getting the row count in a
table. I have found this in DBI. But I have not been able to figure out
how I should specify the connection to an acccess database (con in the
following).
while(!dbHasCompleted(con)){ print(dbGetRowCount(con))}
I would appreciate help on the following points :1. How can I get the
row count (and size) of a table in an access data base? With RODBC, DBI
or any other way.2. I have found that saving the tables as .RData files
reduces the file size and and reduces the reading time. Is there some
way of appending to an already saved data frame with this method?3. I
have come across alternative ways of saving data - using writeBin and
packages like saves, rhdf5 etc. Would they be useful alternatives?
4. Is there an advantage in combining binary files and databases like
sqlite? Or, are files already saved in a binary format in databases
like sqlite?5. What is the simplest method of porting from the access
to the sqlite database? With RSQlite and RODBC, can I have connections
to the access and sqlite databases open at the same time? Or, should I
close one and then open the other? It would help if I can get a
detailed bit if code for doing this in a simple way.
I would appreciate all help that I can get.Thanks,Ravi
????[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Confidentiality Statement: This email message, including any
attachments, is for the sole use of the intended recipient(s) and may
contain confidential and privileged information. Any unauthorized use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy all
copies of the original message.