I have put a group of packages on CRAN for time series databases. The current versions should be considered beta, and I would appreciate feedback from this SIG before announcing them more broadly. (Thanks to Gabor Grothendieck for comments on an alpha version.)

TSdbi defines a common API which the other packages use. TSMySQL and TSSQLite provide methods for MySQL and SQLite, and require RMySQL and RSQLite respectively. TSpadi uses an RPC-based protocol for a client/server connection where the server could use any database, but the working implementation is with Fame. (This last package is mainly for me to support legacy applications, but it also helps test the generality of the interface.)

I believe it should be straightforward to implement the interface for any SQL database that has a DBI-based package, and also not difficult to implement it on top of RODBC, though I have not tried that yet. It should also be possible to interface to the R fame package directly, which could provide writing to the database and some other features not supported by TSpadi. (If anyone is interested in working on any of these, please contact me for additional hints.)

The SQL implementations define the tables necessary to put in place the back end database, but this might benefit from examination by someone who understands SQL table optimization better than I do.

The current implementation supports annual, quarterly, monthly, semiannual, weekly, daily, business day, minutely, irregular data with a date, and irregular data with a date and time. This may be constrained by the back end (e.g. Fame does not support all these types). My own work tends to be with the first three, so the others have not been tested as extensively. It should be relatively easy to implement other types of time series data in the SQL back ends (suggestions and examples?).

Series documentation is supported in a meta table, which also contains a lookup mechanism to determine which table has the data for a given series identifier.
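
As a rough illustration of the meta table lookup described above, here is a minimal sketch using Python's built-in sqlite3 module rather than R, so that it runs without any of the packages installed. The table names (meta, annual, monthly), their columns, and the helpers put_series and get_series are all hypothetical; they are not the actual TSdbi schema or API, only one way such a lookup could work.

```python
import sqlite3

# Hypothetical schema: one data table per frequency, plus a "meta" table that
# records which data table holds each series. Illustrative names only; this is
# NOT the schema that TSMySQL or TSSQLite actually create.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meta    (id TEXT PRIMARY KEY, tbl TEXT NOT NULL, description TEXT);
    CREATE TABLE annual  (id TEXT, year INTEGER, v REAL);
    CREATE TABLE monthly (id TEXT, year INTEGER, period INTEGER, v REAL);
""")

# Whitelist of data tables and the columns that put their rows in time order.
# Table names cannot be bound as SQL parameters, so they are checked here.
KNOWN_TABLES = {"annual": "year", "monthly": "year, period"}


def put_series(conn, series_id, table, rows, description=""):
    """Store rows for a series and register its table in meta."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table}")
    conn.execute("INSERT INTO meta VALUES (?, ?, ?)",
                 (series_id, table, description))
    # One placeholder for the id plus one per value in each row.
    placeholders = ", ".join("?" * (len(rows[0]) + 1))
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})",
                     [(series_id, *row) for row in rows])


def get_series(conn, series_id):
    """Look up the series' table in meta, then fetch its rows in time order."""
    found = conn.execute("SELECT tbl FROM meta WHERE id = ?",
                         (series_id,)).fetchone()
    if found is None:
        raise KeyError(series_id)
    table = found[0]
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table}")
    rows = conn.execute(
        f"SELECT * FROM {table} WHERE id = ? ORDER BY {KNOWN_TABLES[table]}",
        (series_id,)).fetchall()
    return table, rows


# Example data (made up): the caller never names the table when reading.
put_series(conn, "gdp", "annual", [(2001, 1.5), (2000, 1.0)], "example annual series")
put_series(conn, "cpi", "monthly", [(2000, 2, 101.0), (2000, 1, 100.0)])
gdp_table, gdp_rows = get_series(conn, "gdp")
cpi_table, cpi_rows = get_series(conn, "cpi")
```

The point of the design is that a user asks for a series by identifier only, and the meta table decides which frequency-specific table to query. Adding a new type of time series then amounts to adding one data table and one whitelist entry.
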
(Multilingual documentation support is not implemented, but should not be too difficult.) The design also (optionally) supports vintages and panels of data (e.g. series with the same identifier but a different release date or country). This feature is actively under development.

The intention is that the R time series representation can optionally be specified, but currently only the default is working (ts where possible and zoo elsewhere).

Vignette examples are provided in each of the packages. (The vignettes are similar, but the most complete at the moment is the TSMySQL one.)

Some possible extensions include:
- a mechanism for handling aliases for series names
- an RODBC database plug in
- an R PostgreSQL database plug in
- a direct Fame database plug in (Fame through TSpadi is read only)
- optionally different time series representations
- multilingual documentation
- a mechanism for signaling series updates to users

It is unlikely that I will do many of these things myself, but if anyone is interested in working on them I would be happy to provide some guidance.

Paul Gilbert
time series database packages