I ran into similar long build times with LazyData: TRUE, so I set it to
FALSE.
You can specify the environment in which data is loaded (it doesn't have to
be the global environment). For example, to load the cmap_es data from the
ccdata package call this from within a function:
utils::data("cmap_es", package = "ccdata", envir = environment())
The data is then loaded into the function's environment, not the global
one. You may also need to add `cmap_es = NULL` before loading it; otherwise
the build process (R CMD check) will complain that cmap_es is used without
being declared, i.e. that it has no visible binding.
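As a rough sketch of that pattern: the function name below is made up, and
mtcars from the datasets package stands in for cmap_es/ccdata so that it runs
on a stock R installation. It is only meant to show that the object ends up in
the function's environment and not in the global one.

```r
# Sketch: load a packaged dataset into the calling function's environment.
# Assumption: "mtcars" (package "datasets") is a stand-in for cmap_es / ccdata.
summarise_local_data <- function() {
  # Declare the name first so R CMD check sees a local binding.
  mtcars <- NULL

  # envir = environment() is the environment of this function call,
  # so the object is assigned here and not in the global environment.
  utils::data("mtcars", package = "datasets", envir = environment())

  list(
    n_rows = nrow(mtcars),
    loaded_locally = exists("mtcars", envir = environment(), inherits = FALSE),
    leaked_to_global = exists("mtcars", envir = globalenv(), inherits = FALSE)
  )
}

res <- summarise_local_data()
print(res)
```

With your own package you would swap in your object and package names; the
NULL assignment is overwritten as soon as data() runs.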
On Sat, Jul 30, 2016 at 3:00 AM, <bioc-devel-request at r-project.org> wrote:
Today's Topics:

   1. Re: lazyData (Kasper Daniel Hansen)

----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Jul 2016 16:26:45 -0400
From: Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
To: Martin Morgan <martin.morgan at roswellpark.org>
Cc: "bioc-devel at r-project.org" <bioc-devel at r-project.org>
Subject: Re: [Bioc-devel] lazyData

With LazyData true you indeed don't load the data until it is available.
My guess, from skimming the code extremely fast, is that the extreme
requirements (memory and time) during installation is because the data
objects needs to get loaded and somehow modified for this to happen.

Re. the global environment: if my package has an object TEST, and LazyData
is TRUE, when I do (say) data(TEST) or use TEST somehow, TEST doesn't exists
in the Global environment. But if LazyData is FALSE and I do data(TEST),
TEST gets copied into the Global environment, which is kind of irritating
when it is annotation data because it seems fragile to me (perhaps it is
not).

Best,
Kasper

On Fri, Jul 29, 2016 at 3:38 PM, Martin Morgan <martin.morgan at roswellpark.org> wrote:

On 07/18/2016 10:52 AM, Kasper Daniel Hansen wrote:

This is a report on my testing with lazyData turned on and off wrt.
installation time and memory requirements. It turns out that using lazyData
dramatically increases memory consumption and time for a (admittedly large)
annotation package. Perhaps this is something we should think about wrt.
annotation and data packages.

Test example is IlluminaHumanMethylationEPICanno.ilm10b2.hg19 an annotation
package for minfi. The .tar.gz for the this package is 113 so its not small.

I have explored
  using LazyData: yes/no in DESCRIPTION
  adding a single line data/datalist file containing the objects in the
  package

What follows are timings and memory consumption of R CMD build + INSTALL on
my Mac laptop using an SSD drive.

LazyData: yes   datalist: no    285 seconds   3.22 GB (values as high as 3.8GB seen)
LazyData: no    datalist: no    81s           1.64 GB
LazyData: no    datalist: yes   19s           0.38 GB

Hi Kasper -- I have to admit my ignorance on the miracle of lazy data. Can
you clarify what one gains from LazyData? I kind of though that with
LazyData: true the data was only loaded when needed, but that doesn't seem
consistent with the picture you paint above? Also, what's the discussion
about global variables?

Martin

(following combination is not mentioned by R-exts, and while it still uses
tons of memory, it seems to be 1 minute faster; redid measuring once to
confirm this)

LazyData: yes   datalist: yes   226 s         3.26 GB (values as high as 3.9GB seen)

Make the data LazyLoaded is pretty nice; one thing is it avoids polluting
the global environment. But it seems that it would be worthwhile to consider
if some of this could be done prior to the package build time. Perhaps not,
but for sure we are spending resources on the building and installing of
this by the build system.

I started going down this route because my Travis build starting being
killed due to 3+GB being used. I really don't like turning off LazyLoad
because of the global environment issue, but the number are kind of extreme
here.

Best,
Kasper

_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

End of Bioc-devel Digest, Vol 148, Issue 38