[R-pkg-devel] Suppressing long-running vignette code in CRAN submission

26 messages · Kevin Coombes, John Harrold, Ivan Krylov +9 more

#
Dear list members,

I believe that this issue has been discussed previously, but I'm not 
sure that I have the solution right.

Georges Monette and I have developed a package that we intend to submit 
soon to CRAN which has a vignette including code that takes a long time 
to run. The sources for the package are available at 
<https://github.com/gmonette/cv>.

We figure that we have to suppress running the code in the vignette when 
CRAN checks the package or the check time will be excessive.

The vignette is written as a .Rmd file to be compiled by knitr, 
producing an HTML vignette. The top of the .Rmd file looks like this:

------- snip -------

---
title: "Cross-validation of regression models"
author: "John Fox and Georges Monette"
date: "`r Sys.Date()`"
package: cv
output:
   rmarkdown::html_vignette:
      fig_caption: yes
bibliography: ["cv.bib"]
csl: apa.csl
vignette: >
   %\VignetteIndexEntry{Cross-validation of regression models}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
   collapse = TRUE,
   message = TRUE,
   warning = TRUE,
   fig.align = "center",
   fig.height = 6,
   fig.width = 7,
   fig.path = "fig/",
   dev = "png",
   comment = "#>",
   eval = nzchar(Sys.getenv("REBUILD_CV_VIGNETTES"))
)

### other irrelevant setup code not shown ###

```

------- snip -------

So (near the bottom), if the environment variable REBUILD_CV_VIGNETTES 
isn't empty, the code blocks in the vignette are evaluated, otherwise 
not. To build the tarball for the package to be submitted to CRAN, we 
will set REBUILD_CV_VIGNETTES to "true". That works as intended.
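
As a minimal sketch of how that gate behaves (the helper name `should_eval` is hypothetical and not part of the package; only the expression inside it comes from the setup chunk above), note that any non-empty value switches evaluation on, not just "true":

```r
# Hypothetical helper wrapping the `eval =` expression from the setup chunk.
should_eval <- function(var = "REBUILD_CV_VIGNETTES") {
  nzchar(Sys.getenv(var))
}

Sys.unsetenv("REBUILD_CV_VIGNETTES")
unset_result <- should_eval()   # variable absent: Sys.getenv() returns ""

Sys.setenv(REBUILD_CV_VIGNETTES = "true")
true_result <- should_eval()    # non-empty, so chunks would be evaluated

Sys.setenv(REBUILD_CV_VIGNETTES = "false")
false_result <- should_eval()   # still non-empty, so chunks would also run

Sys.unsetenv("REBUILD_CV_VIGNETTES")   # leave the session as we found it
```

On a machine where the variable has never been set, the first case applies and no chunk is run.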

If we submit the tarball to CRAN, I believe that the package as 
distributed by CRAN will include the HTML vignette from our tarball, 
showing the evaluated code blocks, but when CRAN checks the package, 
these long-running code blocks will not be executed (because 
REBUILD_CV_VIGNETTES will not exist on the CRAN check machines).

My questions:

Is that correct?

If not, how can we ensure that the complete vignette is distributed by 
CRAN without causing an overly long CRAN check time?

In particular, we don't want CRAN to rebuild and distribute the 
vignette, because the resulting HTML file won't show the evaluated code.

Any assistance would be appreciated.

Thank you,
  John
#
Produce a PDF file yourself, then use the "as.is" feature of the R.rsp 
package.

Specifically, include this line in your DESCRIPTION file:
VignetteBuilder: R.rsp

Let's say the pdf file is called "myfile.pdf". Create a file called 
"myfile.pdf.asis" that contains the vignette instructions. Put it in the 
vignette directory along with the PDF file. Mine looks like:

%\VignetteIndexEntry{PUT VIGNETTE TITLE HERE}
%\VignetteKeywords{PUT KEYWORDS HERE}
%\VignetteDepends{PACKAGE_NAME}
%\VignettePackage{PACKAGE_NAME}
%\VignetteEngine{R.rsp::asis}

I am not certain if this works for an HTML vignette, but I see no reason 
why it shouldn't.
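
A rough sketch of generating such a file from R, in case it helps. The helper `write_asis` is hypothetical (it is not part of R.rsp), it writes only the index-entry and engine lines from the list above, and the scratch directory merely stands in for the package's vignette directory:

```r
# Hypothetical helper (not part of R.rsp): writes "<file>.asis" for a
# pre-built vignette, using two of the directives listed above.
write_asis <- function(prebuilt, title, dir = "vignettes") {
  asis <- file.path(dir, paste0(basename(prebuilt), ".asis"))
  writeLines(
    c(sprintf("%%\\VignetteIndexEntry{%s}", title),
      "%\\VignetteEngine{R.rsp::asis}"),
    asis
  )
  invisible(asis)
}

# Example in a scratch directory standing in for the vignette directory.
scratch <- file.path(tempdir(), "vignettes")
dir.create(scratch, showWarnings = FALSE)
asis_path <- write_asis("myfile.pdf", "PUT VIGNETTE TITLE HERE", dir = scratch)
asis_lines <- readLines(asis_path)
```

The pre-built PDF itself still has to be copied next to the `.asis` file by hand.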

Best,
   Kevin
On 10/16/2023 10:14 AM, John Fox wrote:
#
Hello John,

The way I've done this is I use the global eval option to control
running the code. When it's set to TRUE it will run the code and save the
results as an RData file. So when I'm creating it locally I have eval=TRUE
in the options:

https://github.com/john-harrold/ubiquity/blob/bb0532915b63f02f701148e5ae097222fef50a2d/vignettes/Simulation.Rmd#L11

Then when I want to use the results I load them from a file:

https://github.com/john-harrold/ubiquity/blob/bb0532915b63f02f701148e5ae097222fef50a2d/vignettes/Simulation.Rmd#L101

Then before I submit to CRAN I set eval=FALSE in the setup at the top of
the vignette.
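
The pattern can be sketched roughly as follows. This is not the code from the linked vignette: the helper `cached_result`, the file name, and the stand-in computation are all hypothetical, and `run` plays the role of the global eval option:

```r
# Hypothetical helper: `run` plays the role of the global eval option.
cached_result <- function(compute, file, run = FALSE) {
  if (run) {
    results <- compute()
    save(results, file = file)      # refresh the cache during a local build
    return(results)
  }
  env <- new.env()
  load(file, envir = env)           # otherwise reuse the stored results
  env$results
}

cache <- file.path(tempdir(), "simulation_results.RData")
slow_simulation <- function() {     # stand-in for a long computation
  Sys.sleep(0.1)
  cumsum(1:10)
}

# Local build: run the code and write the cache.
local_build <- cached_result(slow_simulation, cache, run = TRUE)
# Submission build: the expensive function is never called.
cran_build <- cached_result(function() stop("not run"), cache, run = FALSE)
```

In a real package the cache file would live alongside the vignette sources rather than in a temporary directory.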

John
On Mon, Oct 16, 2023 at 7:15 AM John Fox <jfox at mcmaster.ca> wrote:
#
On 16 October 2023 at 10:42, Kevin R Coombes wrote:
| Produce a PDF file yourself, then use the "as.is" feature of the R.rsp 
| package.

For completeness, that approach also works directly with Sweave. Described in
a blog post by Mark van der Loo in 2019, and used in a number of packages
including a few of mine.

That said, I also used the approach described by John Harrold and cached
results myself.

Dirk
#
Hi, thanks for the reply. I know that version 1.0.3 of cartogRaflow uses rgeos and maptools. That version was archived on 16 October. I have uploaded cartogRaflow version 1.0.4, which uses the sf package instead.
Yesterday I submitted version 1.0.4 to CRAN and received two error messages, because cartogRaflow is archived and version 1.0.3 uses rgeos and maptools.
My question is: how can I get version 1.0.4 onto CRAN?
Thanks in advance,
Sylvain

#
Dear Sylvain,

On Tue, 17 Oct 2023 09:45:25 +0000 (UTC),
"cartograflow at gmail.com" <cartograflow at gmail.com> wrote:
Could you please show us the full R CMD check report e-mailed to you in
response to the CRAN submission?

A NOTE saying that the previous version had been archived is to be
expected. If it's all your package got, it will be manually reviewed
later and published.
#
Here is the full R CMD check report that I received:
Status: 2 NOTEs
Debian: <https://win-builder.r-project.org/incoming_pretest/cartograflow_1.0.4_20231016_225129/Debian/00check.log>


#
Hello Dirk,

Thank you (and Kevin and John) for addressing my questions.

No one directly answered my first question, however, which was whether 
the approach that I suggested would work. I guess that the implication 
is that it won't, but it would be nice to confirm that before I try 
something else, specifically using R.rsp.

Best,
  John
On 2023-10-17 4:02 a.m., Dirk Eddelbuettel wrote:
#
Hello John,

It should work in the scenario where both you as developer and CRAN's checks need to be happy. But there are other scenarios where it might not work.

For example, if a user wishes to build a local pkgdown site for your package, they would need to know to set that environment variable. I also see that you have pkgdown actions on the GitHub repo, but the eval option is commented out there:

     # eval = nzchar(Sys.getenv("REBUILD_VIGNETTES"))

Even so, I don't see the vignette on the rendered site. I observed the same thing when I cloned the package, uncommented the 'eval' option, and ran pkgdown. Of course, you could modify the pkgdown action to set the environment variable.
I don't know why pkgdown doesn't include the vignette 'cv' (it includes the other one).

Also when the vignette is created without evaluating the code it would be useful to include a very visible message, explaining how to create the full one.
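
One possible way to produce such a message, as a sketch only: the helper `unevaluated_notice` and its wording are hypothetical, and the variable name is the one used in the vignette's setup chunk.

```r
# Hypothetical helper; the variable name matches the one used in the vignette.
unevaluated_notice <- function(var = "REBUILD_CV_VIGNETTES") {
  if (nzchar(Sys.getenv(var))) {
    return("")                      # code was run, nothing to announce
  }
  paste0(
    "**Note:** the code in this vignette was not evaluated. ",
    "To build the full version, set the environment variable `", var,
    "` to a non-empty value and rebuild."
  )
}

# Intended for a chunk with `eval = TRUE, echo = FALSE, results = "asis"`,
# so that it runs even when the global `eval` option is FALSE.
cat(unevaluated_notice())
```

Placing the chunk right after the title would make the notice hard to miss.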

All in all, some of the solutions recommended by others seem preferable.

Georgi Boshnakov


#
Hello Georgi,
On 2023-10-17 2:36 p.m., Georgi Boshnakov wrote:
Thank you for addressing that question. I was concerned that CRAN would 
rebuild the vignette for the tarball that they distribute, which then 
would have no output.

In the interim, I implemented the approach using the R.rsp package 
suggested by Kevin and Dirk. It seems to work OK. A downside is that we 
have to maintain two versions of the package -- one for development, 
using knitr to build the HTML vignettes and the other using R.rsp with 
the HTML vignettes pre-built.

Yes, both vignettes are there. One is under "Get Started" (I believe 
because the vignette is in cv.html, with the same name as the package) 
and the other under "Articles."

Thanks for the suggestion. We'll think about this, though the process of 
creating the vignettes is already getting more complicated than I'd like.

Using R.rsp has the disadvantage of not including useful vignette 
sources in the tarball and thus installed package, but since it seems to 
be a well-used approach, I think we'll probably go with it.

Thanks again,
  John
#
John,
On 17 October 2023 at 10:02, John Fox wrote:
| Hello Dirk,
| 
| Thank you (and Kevin and John) for addressing my questions.
| 
| No one directly answered my first question, however, which was whether 
| the approach that I suggested would work. I guess that the implication 
| is that it won't, but it would be nice to confirm that before I try 
| something else, specifically using R.rsp.

I am a little remote here, both mentally and physically. What I might do here
in the case of your long-running vignette, and have done in about half a
dozen packages where I wanted 'certainty' and no surprises, is to render the
pdf vignettes locally exactly as I want them, ship them in the package as
included files (sometimes from a subdirectory) and have a five-or-so line
Sweave .Rnw file include them. That works without hassles. Here is the Rnw I
use for package anytime:

-----------------------------------------------------------------------------
\documentclass{article}
\usepackage{pdfpages}
%\VignetteIndexEntry{Introduction to anytime}
%\VignetteKeywords{anytime, date, datetime, conversion}
%\VignettePackage{anytime}
%\VignetteEncoding{UTF-8}

\begin{document}
\includepdf[pages=-, fitpaper=true]{anytime-intro.pdf}
\end{document}
-----------------------------------------------------------------------------

That is five lines of LaTeX code slurping in the pdf (per the blog post by
Mark). As I understand it R.rsp does something similar at the marginal cost
of an added dependency.

Now, as mentioned, you can also 'conditionally' compute in a vignette and
choose if and when to use a data cache. I think that we show most of that in
the package described in the RJournal piece by Brooke and myself on drat for
data repositories. (We may be skipping the compute when the data is not
accessible. Loading a precomputed set is similar.) I may be doing that in the
much older never quite finished gcbd package and its vignette.

Hope this helps, maybe more once I am back home.

Cheers, Dirk
 
#
Hello Dirk,

Thank you for the additional information.

As you suggest, what you did to distribute pre-built PDF vignettes is 
quite similar to what R.rsp does, except that the latter also supports 
pre-built HTML vignettes, which is what I'd prefer to distribute. Since 
I apparently have that working now, we'll probably go with it unless we 
hit snags when the package is sent to CRAN.

While I appreciate the offer, it's probably not necessary for you to 
spend more time on this now.

Thanks again,
  John
On 2023-10-17 3:19 p.m., Dirk Eddelbuettel wrote:
#
On Tue, Oct 17, 2023 at 12:45 PM John Fox <jfox at mcmaster.ca> wrote:
Author of R.rsp here: It supports both static PDFs and static HTMLs, cf.

https://cran.r-project.org/web/packages/R.rsp/vignettes/R_packages-Static_PDF_and_HTML_vignettes.pdf

/Henrik
#
John,

the short answer is it won't work (it defeats the purpose of vignettes).

However, this sounds like a purely hypothetical question - CRAN policies allow long-running vignettes if they are declared.

Cheers,
Simon
#
Dear Henrik,

I'd already read the R.rsp vignette to which you refer, and, as I said, 
confirmed that I can use R.rsp to implement static HTML vignettes for 
our package.

Thank you for the confirmation,
  John
On 2023-10-17 3:50 p.m., Henrik Bengtsson wrote:
#
Hello Simon,
On 2023-10-17 3:51 p.m., Simon Urbanek wrote:
Thank you for confirming that.
I assume that we'd declare the long-running vignette in our submission 
note to CRAN. Maybe that's better than pre-building the HTML vignettes 
in the package.

Best,
  John
#
On 17/10/2023 4:21 p.m., John Fox wrote:
There's also the "BuildVignettes: false" field in DESCRIPTION, but its 
use is discouraged, and I don't think it allows you to ask CRAN to build 
some vignettes but not all.
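
For illustration only, here is how the field looks in a DESCRIPTION file and how it can be inspected from R. The scratch file and its contents are made up for the example; `read.dcf()` is the base R reader for this format:

```r
# A scratch DESCRIPTION; the contents are illustrative only.
desc <- tempfile("DESCRIPTION")
writeLines(
  c("Package: cv",
    "VignetteBuilder: knitr",
    "BuildVignettes: false"),
  desc
)

# read.dcf() returns a character matrix with one column per requested field.
fields <- read.dcf(desc, fields = c("VignetteBuilder", "BuildVignettes"))
build_vignettes <- unname(fields[1, "BuildVignettes"])
```

The field is a single package-wide setting, which is consistent with it not being usable per vignette.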

Duncan Murdoch
#
Hello Duncan,
On 2023-10-17 4:43 p.m., Duncan Murdoch wrote:
Thanks, I wasn't aware of that. There are two vignettes, one of which is 
slow to build.

Given Simon's suggestion, we'll likely submit the package with a note 
about the long-running vignette, and if that proves problematic, we can 
use R.rsp to pre-build the HTML vignettes.

Best,
  John
#
On 18 October 2023 at 08:51, Simon Urbanek wrote:
| John,
| 
| the short answer is it won't work (it defeats the purpose of vignettes).

Not exactly. Everything is under our (i.e. package author) control, and when
we want to replace 'computed' values with cached values we can.

All this is somewhat of a charade. "Of course" we want vignettes to run
tests. But then we don't want to fall over random missing .sty files or fonts
(macOS machines have been less forgiving than others), not to mention compile
time.

So for simplicity I often pre-make pdf vignettes that get included in other
latex code as source. Works great, never fails, CRAN never complained --
which is somewhat contrary to your statement.

It is effectively the same with tests. We all want maximum test surfaces. But
tests fail, or they run too long, or [insert many other reasons here], so
many packages run tests conditionally.  Such is life.

Dirk

 
#
Dirk,

I think you misread the email - John was asking specifically about his approach to use REBUILD_CV_VIGNETTES without any caching since that was the original question which no one answered in the thread - and that was what I was answering. The alternative approaches were already discussed to death so I didn't comment on those.

Cheers,
Simon
#
I ask myself the question: Who is the vignette for?  It serves two
purposes. One is testing, but primarily it's for the users to learn how to
use a package. I think the testing is secondary, and if it slows down
installation or general usability I'd sacrifice the testing. If it's that
important, then the tests can be added explicitly in tests/.
On Tue, Oct 17, 2023 at 3:04 PM Dirk Eddelbuettel <edd at debian.org> wrote:
#
And if the vignette claims to help the users, but contains errors?  Then 
it's not very helpful at all.  That's what the testing is there to detect.

Duncan Murdoch
On 17/10/2023 6:30 p.m., John Harrold wrote:
#
Dear John,

Unless I'm mistaken, the *installation* time of the package isn't really 
at issue. If a user installs a package from a tarball provided by CRAN, 
the vignettes aren't normally rebuilt.

Best,
  John
On 2023-10-17 6:30 p.m., John Harrold wrote:
#
Please pardon me if I suggest something unrelated below. Many experts
have made suggestions that I would also like to consider because I
also have a similar issue with some packages.

This is an approach I found, for Rmarkdown vignettes:

https://www.kloppenborg.ca/2021/06/long-running-vignettes/

This is similar to some of the suggestions. The vignette is rendered
locally. It uses the trick that, if we render the vignette by calling
knitr::knit() directly, the extension of the source file does not
matter. The output, although with the extension ".Rmd", actually
contains the results of the code, in chunks starting with "```r", not
"```{r}".

When this pre-built .Rmd file is built again, it will just convert
the file to an HTML file, with no need to rerun the code.

The method uses an extension for the source Rmd file (".orig" in the
post) to make sure the "real" source files are ignored when building
the vignettes.
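
A rough sketch of the local pre-compilation step, under some assumptions: the helper names `rendered_name` and `precompile_vignette` are hypothetical, the file name "cv.Rmd.orig" is only an example, and the code is not taken from the post. Only `knitr::knit()` is a real knitr function here.

```r
# Hypothetical helper: derive the output name by dropping the ".orig" suffix,
# e.g. "cv.Rmd.orig" -> "cv.Rmd".
rendered_name <- function(orig) sub("\\.orig$", "", orig)

# Hypothetical helper: knit the real source once, locally.
precompile_vignette <- function(orig, dir = "vignettes") {
  if (!requireNamespace("knitr", quietly = TRUE)) {
    stop("knitr is needed to pre-compile the vignette")
  }
  old <- setwd(dir)                 # knit from inside the vignette directory
  on.exit(setwd(old), add = TRUE)   # restore the working directory afterwards
  # Runs the chunks and writes a plain .Rmd with the results already inlined.
  knitr::knit(orig, output = rendered_name(orig))
}

# Run by hand from the package root before building the tarball:
# precompile_vignette("cv.Rmd.orig")
```

The resulting .Rmd is what the vignette builder sees afterwards, so a later build only converts it to HTML.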

Perhaps this is also a feasible solution for long running vignettes?

Regards,
Shu Fai
On Wed, Oct 18, 2023 at 6:51 AM John Fox <jfox at mcmaster.ca> wrote:
#
Dear Shu Fai,

This approach is certainly relevant, and I think it is slightly better 
than using R.rsp. My preference is still to include the original .Rmd 
file along with a note to CRAN about the long-running vignette.

Thank you,
  John
On 2023-10-17 9:25 p.m., Shu Fai Cheung wrote: