
Package vignettes share the same environment?

3 messages · Martin Morgan, Duncan Murdoch, Yihui Xie

#
In a package 'vig', R CMD build vig (or tools::buildVignettes(dir="vig")) with

$ cat vig/vignettes/vig1.Rnw
\documentclass{article}
\begin{document}
<<>>=
x <- 1
@
\end{document}

$ cat vig/vignettes/vig2.Rnw
\documentclass{article}
\begin{document}
<<>>=
x
@
\end{document}

produces vig2.pdf where x is defined with value 1 -- the vignettes share a build 
environment. This seems undesirable in terms of reproducibility (a reader of 
vig2.pdf will not understand where x is assigned; similarly for the results of 
require() or data() in vig1 referenced in vig2), and is not (?) documented. A
more elaborate context is

     https://stat.ethz.ch/pipermail/bioc-devel/2014-April/005501.html

Would it be better to build each vignette in its own environment?
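
As a rough illustration of the difference, here is a minimal sketch that uses plain eval() in place of the real Sweave machinery. The two quoted expressions stand in for the chunks of vig1.Rnw and vig2.Rnw; the variable names are made up for the example, and baseenv() is used as the parent only so that nothing in the calling session can leak in.

```r
# Stand-ins for the code chunks of the two vignettes (illustrative only)
vig1_chunk <- quote(x <- 1)
vig2_chunk <- quote(x)

# Shared environment: the second chunk sees the x assigned by the first
shared <- new.env(parent = baseenv())
eval(vig1_chunk, shared)
shared_result <- eval(vig2_chunk, shared)

# One environment per vignette: x is not visible to the second chunk,
# so evaluation signals an error, which is captured as a message string
env1 <- new.env(parent = baseenv())
env2 <- new.env(parent = baseenv())
eval(vig1_chunk, env1)
isolated_result <- tryCatch(
  eval(vig2_chunk, env2),
  error = function(e) conditionMessage(e)
)
```

With separate environments, vig2 would fail at build time instead of silently picking up a value from vig1.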

$ R --version|head -n 3
R version 3.1.0 RC (2014-04-05 r65379) -- "Spring Dance"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

Martin
#
On 05/04/2014, 2:24 PM, Martin Morgan wrote:
It's not just the environment that gets shared:  if you run 
buildVignette or buildVignettes in an R session, other aspects of the 
session (e.g. options() settings) will also be inherited by the 
vignette.  The way "R CMD build" handles this is to start a new R 
process to build the vignettes.
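
A small sketch of why a new process helps, assuming Rscript is available under R.home("bin"). The option name vig.demo is made up for the example. An option set in the current session is visible to anything run in that session, but a freshly started R process does not inherit it.

```r
# Set an option in the current session (hypothetical option name)
options(vig.demo = "set-in-parent")

# Code evaluated in this session sees the option
in_session <- getOption("vig.demo")

# A fresh R process started with --vanilla does not see it
rscript <- file.path(R.home("bin"), "Rscript")
child_sees_null <- system2(
  rscript,
  c("--vanilla", "-e", shQuote("cat(is.null(getOption('vig.demo')))")),
  stdout = TRUE
)
```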

Currently it builds all vignettes in one process, rather than starting a 
separate process for each, which is why you see the x variable carry 
from one vignette to another.  I think it has been like this for quite a 
while, because on some platforms (e.g. Windows), starting a new process 
is quite slow.
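
For comparison, building each vignette in its own process could look roughly like the sketch below. This is not what R CMD build does, and it leaves out steps R CMD build performs around the vignette build. build_vignettes_separately() is a hypothetical helper. It assumes Sweave-style .Rnw files and relies on tools::buildVignette() to process a single file.

```r
# Hypothetical helper, not part of R: run tools::buildVignette() for each
# .Rnw file in a fresh R process so nothing carries over between vignettes.
build_vignettes_separately <- function(dir, outdir = dir) {
  files <- list.files(dir, pattern = "[.]Rnw$", full.names = TRUE)
  rscript <- file.path(R.home("bin"), "Rscript")
  vapply(files, function(f) {
    # deparse() turns each path into a quoted R string literal
    expr <- sprintf(
      "tools::buildVignette(%s, dir = %s)",
      deparse(normalizePath(f, winslash = "/")),
      deparse(normalizePath(outdir, winslash = "/"))
    )
    # system2() returns the exit status of the child process (0 on success)
    system2(rscript, c("--vanilla", "-e", shQuote(expr)))
  }, integer(1))
}
```

The cost is one R start-up per vignette, which is the trade-off against the current single-process behaviour.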

I don't know if any packages other than gage currently depend on this 
behaviour.  It does sound confusing for the reader, but I don't think it 
breaks reproducibility:  after all, if a user has the package, they have 
all the vignettes, not just one.  If they just have the vignette, then 
they might not have the functions in the package that it needs, so 
they've already lost reproducibility.

Duncan Murdoch
#
By "quite slow" start-up time on Windows, you mean on the order of 1
or 2 seconds? That is probably not too bad, when we weigh it against
the confusion from compiling all vignettes in the same R session.

knitr::knit() has an 'envir' argument that specifies the environment
in which the code chunks are executed, but at the moment it is not
easy to pass additional arguments to the vignette engine. Of course,
one way is to define a new vignette engine, and that is not too hard:
it is basically something like knitr::knit(file, envir = new.env()).
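
A sketch of what such an engine might look like, under stated assumptions: the engine name "isolated" and the package name "mypkg" are made up, and the function only shows the shape of a registration via tools::vignetteEngine(). It has not been tested against a real package build.

```r
# Hypothetical registration helper; "isolated" and "mypkg" are made-up names.
register_isolated_engine <- function(package = "mypkg") {
  tools::vignetteEngine(
    name = "isolated",
    package = package,
    pattern = "[.]Rnw$",
    # Weave each vignette in a fresh environment, so variables assigned by
    # one vignette are not visible to the next. options() and attached
    # packages are still shared within the R session.
    weave = function(file, quiet = FALSE, ...) {
      knitr::knit(file, quiet = quiet, envir = new.env(parent = globalenv()))
    },
    # Tangle (extract the R code) with knitr::purl()
    tangle = function(file, ...) {
      knitr::purl(file)
    }
  )
}
```

A vignette would then opt in with a \VignetteEngine{mypkg::isolated} line, again assuming those hypothetical names.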

But as mentioned below, there are other things shared in the same
session such as options(). I guess the cleanest way is still to start
new R sessions.

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Sat, Apr 5, 2014 at 3:04 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote: