Skip to content

Postgresql Options for Mac OS X

8 messages · Paul Gilbert, Brian Ripley, Dirk Eddelbuettel +3 more

#
In trying to figure out how the best way to access a postgresql  
database from R on Mac OS X. Currently my database has 88 million rows  
(9 GB) and is growing so I am looking for a stable interface.  The  
following options seem to be available:

1. RdbiPgSQL_1.18.1
2. RPostgreSQL_0.1-4
3. TSPostgreSQL_2009.3-2

These packages seemed to be obsolete and no longer supported/ 
recommended:

4. RPgSQL 1.0
5. Rdbi 0.1.2

Results
======
#1 relies on #5 and has an older connection syntax.
#2 appears relatively new and has not reached 1.0 status yet.  For  
someone used to libpq the interface is very familiar.
#3 relies on #2 and not sure what it adds as the docs did not install  
correctly, nor are they accessible from CRAN.
All 3 packages compiled and installed on Mac OS X even though #2 and  
#3 reported failures in CRAN.
Have not tried to actually use them yet.

Conclusion
=========
Use #2 even though it does not seem very mature.

Questions
========
Is this a good summary of the state of accessing postgresql from R or  
have I missed something?

Is RPostgreSQL being used anyplace and is it stable?

Neil
Chicago
#
Neil Tiffin wrote:
Yes. (#3 is specific to time series dbs.  ) Another option is odbc.
I'm not sure why you had problems with the docs.  I think the testing 
problems on CRAN may have to do with the testing machine (not having a 
working Postgres?).

Paul
====================================================================================

La version fran?aise suit le texte anglais.

------------------------------------------------------------------------------------

This email may contain privileged and/or confidential in...{{dropped:26}}
#
On Mon, 11 May 2009, Neil Tiffin wrote:

            
You have omitted RODBC, since PostgreSQL has a mature ODBC driver.

3) Is a special-purpose layer over 2).
Have you looked at the failures?  RPostgreSQL failed to install 
because the build machine does not have PostgreSQL installed.
The second.

  
    
#
Thanks, I am looking into RODBC.

So which approach is considered the most used?
On May 12, 2009, at 12:09 AM, Prof Brian Ripley wrote:

            
#
Neil,
On 12 May 2009 at 06:09, Prof Brian Ripley wrote:
| On Mon, 11 May 2009, Neil Tiffin wrote:
| 
| > In trying to figure out how the best way to access a postgresql database from 
| > R on Mac OS X. Currently my database has 88 million rows (9 GB) and is 
| > growing so I am looking for a stable interface.  The following options seem 
| > to be available:
| >
| > 1. RdbiPgSQL_1.18.1
| > 2. RPostgreSQL_0.1-4
| > 3. TSPostgreSQL_2009.3-2
| 
| You have omitted RODBC, since PostgreSQL has a mature ODBC driver.

Yes. I packaged it for Debian more than six years ago, and have supported it
ever since, precisely because I wanted to talk to Pg on my systems. RODBC is
stable and mature --- thanks to a lot of work by Brian Ripley (building on
the earlier work by Michael Lapsley).

That said, I also find ODBC and its setup mechanism with DSNs etc somewhat
tedious and quite frankly like the DBI interface better.  Which is why I
suggested and mentored the DBI-compliant RPostgreSQL implementation as a
Google Summer of Code project last year.  People seem to use and like it, and
you remove one layer of possible breakage. The DBI abstraction also allows
you to switch to other backends should the need arise.

So you have choices, which is a good thing.

| Have you looked at the failures?  RPostgreSQL failed to install 
| because the build machine does not have PostgreSQL installed.

IIRC at least the CRAN Linux testing boxen (being Debian) have it.  And
Sameer and I also worked with Uwe last year to get Windows builds.  So yes,
the builds fails _on Mac OS X only_ --- presumably because nobody volunteered
to provide Pg for the build machine.  Maybe Neil can help here....

Dirk
1 day later
#
Hi,
On Mon, May 11, 2009 at 6:18 PM, Neil Tiffin <neilt at neiltiffin.com> wrote:
Yes, I do use it on a regular basis. Though there are some quirks here
and there with the functions, they are usable. Often I have found
myself writing small wrapper functions around the ones provided with
RPostgreSQL for my special needs. I have never tried the other options
so no comments on them. But I remember, some time back, reading the
documentations and then finally settling on RPostgreSQL for its
clarity in implementation and function usage.

I use it completely under Linux to Linux connections, so any Mac OSX
specific issues will be hidden from me.

Finally, I should also mention that my usage is primarily like this:
extract relevant/subset of records (and fields) from the database and
work inside R with that. I have never tried any complicated/fancy
database coding, though, in principle it should work, Dirk might
confirm.
#
On May 14, 2009, at 12:13 AM, Prasenjit Kapat wrote:

            
If you care to be more specific about the "Quirks" and why you did  
your wrappers I would be interested.  I am working with Dirk and  
others to get the test environments set up on CRAN and r-forge for  
Mac. Maybe some tests need to be added.  Also I have a lot of  
experience with C drivers and probably can propose some improvements.

Neil
#
On Thu, May 14, 2009 at 9:23 AM, Neil Tiffin <neilt at neiltiffin.com> wrote:

            
I'm not using the interface much at this point, so I cannot comment much
more than this without devoting more time.  It seems fine for everyday
casual use, but there there are a couple obvious issues that could use some
work.  See here for an issue with schema support (not sure if any suggested
changes have been adopted):

https://stat.ethz.ch/pipermail/r-sig-db/2009q2/000618.html

The second had to do with dbWriteTable and dbReadTable.  Postgresql has a
protocol for slurping up text data via the connection in a VERY fast manner;
this functionality (last I checked) was not being used to deal with
dbWriteTable and dbReadTable.  It would be very nice to use that api for the
reading and (particularly) writing of tables, since it can be an order of
magnitude or more faster than inserts.

Sean