R CMD check for the R code from vignettes
On 06/02/2014 12:16 AM, Gabriel Becker wrote:
Carl, I don't really have a horse in this race other than a strong feeling that whatever check does should be mandatory. That having been said, I think it can be argued that the fact that check does this means that it IS in the R package vignette specification that all vignettes must be such that their tangled code will run without errors.
My understanding of this is that the package maintainer can turn off building the vignette (--no-vignettes) but R CMD check and CRAN still check that the tangle code runs, and the check fails if it does not. Running the tangle code can be turned off, just not by the package maintainer. You have to make a special appeal to the CRAN maintainers, and give reasons they are prepared to accept.

I think the intention is that the tangle code should run without errors. I doubt they would accept "it doesn't work" as a reason. But there are valid reasons, for example a vignette that requires access to a commercial database engine. Also, I think, turning this off means they just do not run it regularly, in the daily checks. I don't think it necessarily means the code is never tested. The testing may need to be done on machines with special resources.

Thus, --no-vignettes provides a mechanism to avoid running the tangle code twice but, without special exemption, it is still run once. Some package maintainers may not think of several features of 'R CMD check' as 'aids'. I think of it as having more to do with maintaining some quality assurance, which I think of as an aid but not a debugging aid.

I believe the CRAN maintainers have intentionally, and successfully, made disabling the running of tangled code more trouble than it is generally worth. Effectively, a package should have tangle code that runs without errors.

(Of course, I could be wrong about all this, it has happened before.)

Paul
~G
On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger <cboettig at gmail.com> wrote:
Yihui, list,

Focusing on the behavior of R CMD check, the only reason I have seen put forward in the discussion for having check tangle and then source as well as knit/weave the very same vignette is to assist the package maintainer in debugging R errors vs pdflatex errors. As tangle (and many other tools) is already available to an author needing extra help debugging, and as the error messages are usually clear on whether errors come from the R code or from whatever format is being compiled (pdflatex, markdown html, etc), this seems like a poor reason for R CMD check to be wasting time doing two versions of almost (but not literally) the same check.

As has already been discussed, it is possible to write vignettes that can be Sweave'd but not source'd, due to the different treatments of inline chunks. While I see the advantages of this property, I don't see why R CMD check should be enforcing it through the arbitrary mechanism of running both Sweave and tangle+source. If that is the desired behavior for all Sweave documents, it should be part of the Sweave specification that inline expressions cannot write or change values, or part of the tangle definition to include inline chunks.

In any event I don't see any reason for R CMD check doing both. Perhaps someone can fill in whatever I've overlooked?

Carl

On Sat, May 31, 2014 at 8:17 PM, Yihui Xie <xie at yihui.name> wrote:
1. The starting point of this discussion is package vignettes, instead of R scripts. I'm not saying we should abandon R scripts, or all people should write R code to generate reports. Starting from a package vignette, you can evaluate it using a weave function, or evaluate its derivative, namely an R script. I was saying the former might not be a bad idea, although the latter sounds more familiar to most R users. For a package vignette, within the context of R CMD check, is it necessary to do tangle + evaluate _besides_ weave?

2. If you are comfortable with reading pure code without narratives, I'm totally fine with that. I guess there is nothing to argue on this point, since it is pretty much personal taste.

3. Yes, you are absolutely correct -- Sweave()/knit() does more than source(), but let me repeat the issue to be discussed: what harm does it bring if we disable tangle for R package vignettes?

Sorry if I did not make it clear enough, my priority of this discussion is the necessity of tangle for package vignettes. After we finish this issue, I'll be happy to extend the discussion towards tangle in general.

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name

On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
On Sat, May 31, 2014 at 6:54 PM, Yihui Xie <xie at yihui.name> wrote:
I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
It's because .Rnw and .Rmd require more from the user than .R. Also, this started with vignettes but you seem to be talking more generally. If so, I would point out that not all R code is intended to generate reports, and writing pure R code that isn't going to generate a report in an .Rnw/.Rmd file would be very strange to say the least.
however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.
That depends a lot on what you want. Reading a woven article/report that includes code and reading code are different and equally valid activities.
Sometimes I really just want to know what the author actually told the computer to do.
So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.
We lose *isolated* code evaluation. Sweave/knit have a lot more moving pieces than source/eval do, many of which are for the purpose of displaying output, rather than running code.
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
-- Carl Boettiger UC Santa Cruz http://carlboettiger.info/
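
Editor's illustration (not part of the original thread): the point raised above, that a vignette can be Sweave'd but not source'd because tangle leaves out inline expressions, can be sketched with a tiny made-up vignette. The file contents and the variable name demo_counter are hypothetical; the sketch assumes the default Sweave and Stangle drivers from the utils package, with inline \Sexpr{} code evaluated during weaving but not written to the tangled script. Note that Sweave evaluates in the global environment, so running this creates demo_counter there.

```r
# Hypothetical vignette: a chunk, an inline expression that changes state,
# and a second chunk that depends on that change.
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<setup>>=",
  "demo_counter <- 1",
  "@",
  "The counter is now \\Sexpr{demo_counter <- demo_counter + 1}.",
  "<<check>>=",
  "stopifnot(demo_counter == 2)",
  "@",
  "\\end{document}"
), rnw)

# Weave: chunks and the inline expression are evaluated in document order.
# Only a .tex file is written, so no LaTeX installation is needed here.
tex <- tempfile(fileext = ".tex")
weave_result <- try(utils::Sweave(rnw, output = tex, quiet = TRUE), silent = TRUE)
weave_ok <- !inherits(weave_result, "try-error")

# Tangle: only the code chunks are written to the script.
script <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = script, quiet = TRUE)
tangled <- readLines(script)

# Source the tangled script in a fresh environment, as a stand-in for the
# tangle + source step discussed in the thread.
source_result <- try(source(script, local = new.env()), silent = TRUE)
source_ok <- !inherits(source_result, "try-error")

cat("weave succeeded: ", weave_ok, "\n")
cat("source succeeded:", source_ok, "\n")
```

Under these assumptions the weave step should pass while sourcing the tangled script should stop at the stopifnot() call, since the increment exists only in the inline expression. This is only a stand-in for what R CMD check does, not a reproduction of its internals.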