RJDBC to OpenOffice Calc as RODBC to MS Excel
On Fri, 2007-12-14 at 07:31 +0000, Prof Brian Ripley wrote:
On Fri, 14 Dec 2007, Metz, Thomas (IRRI) wrote:
Under Windows, I have used RODBC to connect to Excel spreadsheets as per
the example below:
library(RODBC);
connect = odbcConnectExcel("testdata.xls");
query = "SELECT [data$.ethn], [data$.sex], [data$.age],
[data$.height], [data$.weight],
[label$.label]
FROM [data$], [label$]
WHERE [data$.ethn] = [label$.ethn];"
data = sqlQuery(connect, query);
odbcClose(connect);
[data$] and [label$] are two named sheets in the Excel spreadsheet
testdata.xls. [.ethn], [.sex], [.age], [.height], [.weight], and
[.label] are cloumn names that appear in the first row in the sheets. I
can also have UNION queries that allow me to overcome the spreadsheet
row limitation of a single sheet. The idea is to allow normalization of
data in a spreadsheet and leveraging the power of SQL, without using a
database.
Can the same be done under Windows (Linux?) with OpenOffice Calc using
RJDBC? Are there ODBC drivers for OpenOffice Calc?
An awful lot of that is Microsoft warts on SQL, so it will not be portable. But in a more standard syntax (drop the [] and $) it should be doable over any connection that supports SQL queries. The question is whether OO calc has suitable drivers as an ODBC/JDBC server. Not a question for this list! (I suspect the answer is no: Microsoft's drivers effectively use the Access engine to work with spreadsheet files and even plain text. I don't even see drivers for OO base.)
There are none to my knowledge. Most of OO.org's ODBC/JDBC integration is one-way. That is, it can connect within the suite and to external sources, but does not appear to provide connectivity to enable external applications to acquire data stored within OO.org's apps. Base is an embedded version of HSQLDB (http://www.hsqldb.org), which is a java based application. In theory, it would support a JDBC interface, but I have seen none and when the subject comes up on the OO.org lists, no solutions are forthcoming. It's too bad, at least within this context, that the OO.org folks elected to use HSQLDB rather than SQLite, which was the the other option under consideration. Perhaps Sun's influence, vis-a-vis Java, won the day here. Bearing in mind that OO.org's Write and Calc documents are just 'zipped' XML files, it would be possible to parse the data stored within such documents. I suspect Max Kuhn has spent much time on this for odfWeave. There are Perl modules that can provide a level of interaction here. For example, OpenOffice::Parse::SXC (http://search.cpan.org/~dclee/OpenOffice-Parse-SXC-0.03/SXC.pm) provides the means to parse a Calc file directly, without needing the OO.org API. One could wrap that module in an R function via a system() call and then interact with a Calc file directly. HTH, Marc Schwartz