Hello, I'm a Google Summer of Code student working for Gentoo to integrate installation of R packages into the package manager. Up until now, progress has been steady and I've run into few real issues. I'm using the R tools to build and install packages wherever applicable. Most CRAN packages install fine right now, so I'm now working on Bioconductor, making sure your packages can be installed using a package manager as well. Now, it seems the *.db packages are giving me some headaches. They open an existing SQLite database file on disk and try to write to it, or so it seems. Writing directly to installed files is not really supported by Gentoo's package managers, and there are usually good arguments against it anyway. Packages can, in principle, only read from disk and create new files to be added to the system. However, I'm not familiar with Bioconductor and its software, and I'd like to hear the reasoning behind this process. What do the SQLite databases contain, how are they used (from a user perspective), and what data is added to them by the *.db packages? Perhaps most importantly, can't this additional data be kept in a separate database file? Thanks for your thoughts, Auke Booij / tulcod.
[Bioc-devel] Installation process for affymetrix databases (*.db packages)
7 messages · Auke Booij, Marc Carlson
Hi Auke, Why is it that you think these packages are writing to these SQLite databases? They should not be doing so. Their purpose is only to be annotation packages and they make use of a relational database simply to represent biological data. Marc
_______________________________________________ Bioc-devel at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
On Thu, Jul 8, 2010 at 10:21 PM, Marc Carlson <mcarlson at fhcrc.org> wrote:
Why is it that you think these packages are writing to these SQLite databases? They should not be doing so. Their purpose is only to be annotation packages and they make use of a relational database simply to represent biological data.
Hm, perhaps it's not really writing, but requesting write permissions
nonetheless? I'm searching through the code to figure out what's
happening, but it seems at some point the R CMD INSTALL I'm starting
as part of the sandboxed install process tries to load some libraries
which try to write to SQLite databases... partial output below. Does
any of this ring a bell?
R CMD INSTALL --build /usr/portage/distfiles/zebrafish.db_2.4.1.tar.gz
-l /var/tmp/paludis/dev-R-zebrafishdb-2.4.1/work/tmp_install
* installing *source* package 'zebrafish.db' ...
** R
** inst
** preparing package for lazy loading
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
Loading required package: DBI
ACCESS DENIED open_wr:
/usr/lib64/R/library/org.Dr.eg.db/extdata/org.Dr.eg.sqlite
ISE:write_logfile unable to append logfile
ISE open_wr(/usr/lib64/R/library/org.Dr.eg.db/extdata/org.Dr.eg.sqlite):
Permission denied
abs_path: /usr/lib64/R/library/org.Dr.eg.db/extdata/org.Dr.eg.sqlite
res_path: /usr/lib64/R/library/org.Dr.eg.db/extdata/org.Dr.eg.sqlite
/usr/lib/libsandbox.so(+0x3709)[0x7ffdc4cf5709]
/usr/lib/libsandbox.so(+0x379b)[0x7ffdc4cf579b]
/usr/lib/libsandbox.so(+0x4bd1)[0x7ffdc4cf6bd1]
/usr/lib/libsandbox.so(open64+0xc7)[0x7ffdc4cfa687]
/usr/lib64/R/library/RSQLite/libs/RSQLite.so(+0x427e6)[0x7ffdbf8797e6]
/usr/lib64/R/library/RSQLite/libs/RSQLite.so(+0x43850)[0x7ffdbf87a850]
/usr/lib64/R/library/RSQLite/libs/RSQLite.so(+0x44257)[0x7ffdbf87b257]
/usr/lib64/R/library/RSQLite/libs/RSQLite.so(RS_SQLite_newConnection+0x109)[0x7ffdbf8482a5]
/usr/lib64/R/lib/libR.so(+0x95763)[0x7ffdc488f763]
/usr/lib64/R/lib/libR.so(Rf_eval+0x76e)[0x7ffdc48b6585]
/proc/12572/cmdline: /usr/lib64/R/bin/exec/R --no-restore --slave
--args nextArg--buildnextArg/usr/portage/distfiles/zebrafish.db_2.4.1.tar.gznextArg-lnextArg/var/tmp/paludis/dev-R-zebrafishdb-2.4.1/work/tmp_install
Hi Auke, The installer should place the contents of "inst/extdata" into "extdata" of your installed library, but that is about it. There *IS* a dependency, however, that is logged in the DESCRIPTION file. Each chip package requires that an "org" package be installed in order for it to work. So for example, this package "zebrafish.db_2.4.1.tar.gz" requires "org.Dr.eg.db". And the message below seems to be complaining that the database in "org.Dr.eg.db" is not available. It does not need to write to this DB, but it does need to be able to read it. Did you install the org packages and other dependencies first? Also, is there a specific reason why biocLite() is not your preferred way to install Bioconductor packages? It handles all of these issues for you and installs requested packages (and resolves dependencies). http://www.bioconductor.org/docs/install/ Marc
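To illustrate the dependency mentioned above: a DESCRIPTION file is a plain "Field: value" text file, so an external package manager can read the `Depends` field itself. The following is only a rough sketch in Python. The DESCRIPTION excerpt and its version numbers are made up for illustration (the real zebrafish.db file is not quoted in this thread), and `parse_dcf` and `dependency_names` are hypothetical helper names, not part of R or Bioconductor.

```python
import re

# Hypothetical DESCRIPTION excerpt; field values and versions are invented
# for illustration and are not taken from the real zebrafish.db package.
description_text = """\
Package: zebrafish.db
Version: 2.4.1
Depends: R (>= 2.7.0), methods, AnnotationDbi (>= 1.9.7),
        org.Dr.eg.db (>= 2.4.1)
"""


def parse_dcf(text):
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and current is not None:
            fields[current] += " " + line.strip()
        else:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
    return fields


def dependency_names(depends_field):
    """Return package names from a Depends field, dropping version constraints and R itself."""
    names = []
    for entry in depends_field.split(","):
        name = re.sub(r"\(.*?\)", "", entry).strip()
        if name and name != "R":
            names.append(name)
    return names


fields = parse_dcf(description_text)
deps = dependency_names(fields["Depends"])
print(deps)  # ['methods', 'AnnotationDbi', 'org.Dr.eg.db']
```

With a list like this, the package manager can make sure the "org" package is installed before the chip package that depends on it is built.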
On Fri, Jul 9, 2010 at 12:37 AM, Marc Carlson <mcarlson at fhcrc.org> wrote:
Hi Auke,
Hi Marc,
The installer should place the contents of "inst/extdata" into "extdata" of your installed library, but that is about it. There *IS* a dependency however that is logged in the DESCRIPTION file. Each chip package requires that an "org" package be installed in order for it to work. So for example, this package "zebrafish.db_2.4.1.tar.gz" requires the "org.Dr.eg.db". And the message below seems to be complaining that the database in "org.Dr.eg.db" is not available. It does not need to write to this DB, but it does need to be able to read it. Did you install the org packages and other dependencies 1st?
I most definitely have that package installed. The package manager resolves all dependencies. As far as I'm concerned, the error is pretty clear: something's trying to open a file with write permissions, and that's prohibited. Could it be that the org.Dr.eg.db is opened with write permissions, even though nothing is really written? If so, could that be fixed or will I need a workaround?
Also is there a specific reason why is biocLite() is not your preferred way to install bioconductor packages? It handles all of these issues for you and installs requested packages (and resolves dependencies).
No, there is no specific reason for me not to use biocLite, but then again, I am not a Bioconductor user myself. I'm working on enabling virtually all users of Gentoo Linux to install CRAN and Bioconductor packages using their regular package manager, which I think offers some obvious conceptual advantages. Perhaps one of the bigger issues some Gentoo purists may have with biocLite() is that it installs files without the package manager knowing anything about those files, and some users consider that evil. Another serious advantage would be dependency resolution for external packages, like for CRAN's gsl package, but I have yet to find a package with external dependencies in Bioconductor.
Yes, I've seen that, and I've actually scavenged the (two) repository locations by reading biocLite() and friends :-) So, to sum it up, could it be that the database is opened with write permissions, but nothing is actually written, and is it possible at all to fix this, or is this inherent in the way R reads sqlite databases? Thanks again, Auke Booij / tulcod.
Hi Auke, Thanks to Martin's generous hint, I think I have this fixed. There were some default behaviors in a package we depend on that were opening these databases so that they could be written to (even though nothing was being written there). If you grab the very latest version of AnnotationDbi, you can check whether it's working any better for you. The latest version is already available in the svn repository, but you should be able to get the latest tarball from our website/build system within the next 24 hours or so. Marc
1 day later
On Fri, Jul 9, 2010 at 6:23 PM, Marc Carlson <mcarlson at fhcrc.org> wrote:
Hi Auke, Thanks to Martin's generous hint, I think I have this fixed. There were some default behaviors in a package we depend on that were opening these databases so that they could be written to (even though nothing was being written there). If you grab the very latest version of AnnotationDbi, you can try to see if it's working any better for you. The latest version is already available in the svn repository, but you should be able to get the latest tarball from our website/build system within the next 24 hours or so. Marc
Hey Marc, Martin and others, It works great now. Thanks a lot for the quick fix!