Hi,
Recently I saw a couple of cases in which the package vignettes were
somewhat complicated so that Stangle() (or knitr::purl() or other
tangling functions) can fail to produce the exact R code that is
executed by the weaving function Sweave() (or knitr::knit(), ...). For
example, this is a valid document that can pass the weaving process
but cannot generate a valid R script to be source()d:
\documentclass{article}
\begin{document}
Assign 1 to x: \Sexpr{x <- 1}
<<>>=
x + 1
@
\end{document}
That is because the inline R code is not written to the R script
during the tangling process. When an R package vignette contains
inline R code expressions that have significant side effects, R CMD
check can fail because the tangled output is not correct. What I
showed here is only a trivial example, and I have seen two packages
that have more complicated scenarios than this. Anyway, the key thing
that I want to discuss here is, since the R code in the vignette has
been executed once during the weaving process, does it make much sense
to execute the code generated from the tangle function? In other
words, if the weaving process has succeeded, is it necessary to
source() the R script again?
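The failure can be reproduced in a short R session. This is only a sketch under a few assumptions: the file name demo.Rnw and the temporary directory are arbitrary, and the tangled script is sourced into an environment whose parent is baseenv(), so that the x left behind in the global environment by Sweave() cannot hide the problem.

```r
# Write the example document into a scratch directory (name is arbitrary).
workdir <- tempfile("sexpr-demo")
dir.create(workdir)
old <- setwd(workdir)

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "Assign 1 to x: \\Sexpr{x <- 1}",
  "<<>>=",
  "x + 1",
  "@",
  "\\end{document}"
), "demo.Rnw")

# Weaving evaluates the inline expression and then the chunk, so it succeeds.
utils::Sweave("demo.Rnw", quiet = TRUE)    # writes demo.tex

# Tangling writes only the chunk; the inline assignment is not included.
utils::Stangle("demo.Rnw", quiet = TRUE)   # writes demo.R
tangled <- readLines("demo.R")

# Source the tangled script in isolation: x is undefined there.
res <- tryCatch(
  {
    source("demo.R", local = new.env(parent = baseenv()))
    "ok"
  },
  error = function(e) conditionMessage(e)
)

setwd(old)
```

Here res holds the error message from source() rather than "ok", while the weave step completes without error.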
The two options here are:
1. Do not check the R code from vignettes;
2. Or fix the tangle function so that it produces exactly what was
executed in the weaving process. If this is done, I'm back to my
previous question: does it make sense to run the code twice?
To push this a little further, personally I do not quite appreciate
literate programming in R as two separate steps, namely weave and
tangle. In particular, I do not see the value of tangle, considering
Sweave() (or knitr::knit()) as the new "source()". Therefore
eventually I tend to just drop tangle, but perhaps I missed something
here, and I'd like to hear what other people think about it.
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
R CMD check for the R code from vignettes
15 messages · Kevin Coombes, Carl Boettiger, Henrik Bengtsson +4 more
Hi,
Unless someone is planning to change Stangle to include inline
expressions (which I am *not* advocating), I think that relying on
side-effects within an \Sexpr construction is a bad idea. So, my own
coding style is to restrict my use of \Sexpr to calls of the form
\Sexpr{show.the.value.of.this.variable}. As a result, I more-or-less
believe that having R CMD check use Stangle and report an error is
probably a good thing.
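As a rough illustration of that style (a sketch only; the file name style.Rnw is made up for the example), the assignment lives in a code chunk and \Sexpr{} merely displays the value, so the tangled script runs on its own:

```r
# Scratch directory and file name are arbitrary choices for this sketch.
workdir <- tempfile("sexpr-style")
dir.create(workdir)
old <- setwd(workdir)

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<>>=",
  "x <- 1",
  "@",
  "The value of x is \\Sexpr{x}.",
  "<<>>=",
  "x + 1",
  "@",
  "\\end{document}"
), "style.Rnw")

# Tangle, then source the result in an isolated environment.
utils::Stangle("style.Rnw", quiet = TRUE)   # writes style.R
env <- new.env(parent = baseenv())
source("style.R", local = env)
value <- get("x", envir = env)

setwd(old)
```

Because nothing in the chunks depends on inline code, sourcing style.R defines x without error.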
There is a completely separate question about the relationship between
Sweave/Stangle or knit/purl and literate programming that is linked to
your question about whether to use Stangle on vignettes. The underlying
model(s) in R have drifted away from Knuth's original conception, for
some good reasons.
The original goal of literate programming was to be able to explain the
algorithms and data structures in the code to humans. For that purpose,
it was important to have named code chunks that you could move around,
which would allow you to describe the algorithm starting from a high
level overview and then drilling down into the details. From this
perspective, "tangle" was critical to being able to reconstruct a
program that would compile and run correctly.
The vast majority of applications of Sweave/Stangle or knit/purl in
modern R have a completely different goal: to produce some sort of
document that describes the results of an analysis to a non-programmer
or non-statistician. For this goal, "weave" is much more important than
"tangle", because the most important aspect is the ability to integrate
the results (figures, tables, etc) of running the code into the document
that gets passed off to the person for whom the analysis was prepared. As
a result, the number of times in my daily work that I need to invoke
Stangle (or purl) explicitly is many orders of magnitude smaller
than the number of times that I invoke Sweave (or knitr).
-- Kevin
On 5/30/2014 1:04 AM, Yihui Xie wrote:
[...]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
I think there are several aspects to Yihue's post and some simple
workarounds/long solutions to the issues:
1. For the reasons argued, I would agree that 'R CMD check'
incorrectly assumes that tangled code script should be able to run
without errors. Instead I think it should only check the syntax, i.e.
that it can be parsed without errors. If not, then Sweave may have to
be redefined to clarify that \Sexpr{}/"inline" expressions must not
have "side effects".
2. For other (=non-Sweave) vignette builder packages, you can already
today define engines that do not tangle, think
%\VignetteEngine{knitr::knitr_no_tangle}.
3. Extending on this, I'd like to propose %\VignetteTangle{no} (and/or
false, FALSE, ...), which would tell the engine to not generate the
"tangle" script file. Then it is up to the vignette engine to
acknowledge this or not, but at least we will have a standard across
engines rather than each of us coming up with our own markup for this.
You can also imagine supporting other types of settings, e.g.
%\VignetteTangle{all} to include also \Sexpr{} in the tangled output.
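The difference between the syntax-only check in point 1 and full evaluation can be sketched as follows. This is an illustration, not what R CMD check does internally; the script stands in for the tangled output of the example vignette from the original post.

```r
# A stand-in for the tangled script: only the chunk code, no inline code.
workdir <- tempfile("parse-check")
dir.create(workdir)
script <- file.path(workdir, "demo.R")
writeLines("x + 1", script)

# Syntax-only check: parse() succeeds because the code is well formed.
parsed <- tryCatch(parse(file = script), error = function(e) e)
syntax_ok <- !inherits(parsed, "error")

# Full evaluation in an isolated environment fails: x was never defined.
ran <- tryCatch(
  source(script, local = new.env(parent = baseenv())),
  error = function(e) e
)
run_ok <- !inherits(ran, "error")
```

With this script, syntax_ok is TRUE and run_ok is FALSE, which is the gap between the two kinds of check.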
/Henrik
On Fri, May 30, 2014 at 9:29 AM, Carl Boettiger <cboettig at gmail.com> wrote:
Hi Yihui,
I agree with you (and your comments in [knitr issue 784]) that it seems wrong for R CMD check to be using tangle (purl, etc.) as a way to check R code in a vignette, when the standard and expected way to check the vignette is already to knit / Sweave the vignette. I also agree with the perspective that the tangle function no longer plays the crucial role it did when we were using noweb and C programs that couldn't be compiled without tangle.
However, I would be hesitant to see tangle removed entirely, as it is occasionally a convenient way to create an R script from a dynamic document. Pure R scripts are still much more widely recognized than dynamic documents, and I sometimes will just tangle out the R code because a collaborator would have no idea what to do with a .Rmd file (though RStudio is certainly improving this situation). Tangle-like functions also provide a nice complement to "stitch" and friends that make dynamic documents from the ubiquitous R scripts.
[knitr issue 784]: https://github.com/yihui/knitr/issues/784
- Carl
On Fri, May 30, 2014 at 6:21 AM, Kevin Coombes <kevin.r.coombes at gmail.com> wrote:
[...]
--
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/
Sorry, it should be Yihui and nothing else. /Henrik
On Fri, May 30, 2014 at 10:15 AM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
[...]
Hi Kevin,
Personally I also avoid code that has side effects in inline expressions, but I think there are legitimate use cases in which inline expressions have side effects. This discussion was motivated by Carl's knitcitations package, as well as another question on StackOverflow (http://stackoverflow.com/q/23927325/559676).
I'm aware of the distinction between the original literate programming paradigm and the one in R (that is why I said "literate programming in R" instead of "literate programming in general"). In R, weave actually does what both weave and tangle do in the original paradigm -- there is no need to tangle the document to get the computer code so that we can execute it.
To Carl: I agree that it is a little extreme to drop tangle entirely, so I think at least knitr::purl() will stay there in the foreseeable future.
I tend to adopt Henrik's idea, i.e., to provide vignette engines that just ignore tangle. At the moment, it seems R CMD check is comfortable with vignettes that do not have corresponding R scripts, and I hope these R scripts will not become mandatory in the future.
Thanks everyone for your comments!
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Fri, May 30, 2014 at 8:21 AM, Kevin Coombes <kevin.r.coombes at gmail.com> wrote:
[...]
Note the test has been done once in weave, since R CMD check will try to rebuild vignettes. The problem is whether the related tools in R should change their tangle utilities so we can **repeat** the test, and it seems the answer is "no" in my eyes.
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Sat, May 31, 2014 at 4:54 PM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
On Fri, May 30, 2014 at 9:22 PM, Yihui Xie <xie at yihui.name> wrote:
Hi Kevin, I tend to adopt Henrik's idea, i.e., to provide vignette engines that just ignore tangle. At the moment, it seems R CMD check is comfortable with vignettes that do not have corresponding R scripts, and I hope these R scripts will not become mandatory in the future.
I'm not sure this is the right approach. This would essentially make the test optional based on decisions by the package author. I'm not arguing in favor of this particular test, but if package authors are able to turn a test off then the test loses quite a bit of its value.
I think that R CMD check has done a great deal for the R community by presenting a uniform, minimum "barrier to entry" for R packages. Allowing package developers to alter the tests it does (other than the obvious case of their own unit tests) would remove that.
That having been said, it seems to me that tangle-like utilities should have the option of extracting inline code, and that during R CMD check that option should *always* be turned on. That would solve the problem in question while retaining the test, would it not?
~G
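A rough sketch of what such an option could look like, for illustration only: tangle_with_inline() is a hypothetical helper, not part of utils or knitr, and its simple pattern ignores chunk options and does not handle nested braces inside \Sexpr{}.

```r
# Hypothetical helper: collect chunk code and \Sexpr{} code from Sweave
# source lines, in document order.
tangle_with_inline <- function(lines) {
  out <- character()
  in_chunk <- FALSE
  for (line in lines) {
    if (!in_chunk && grepl("^<<.*>>=", line)) {
      in_chunk <- TRUE                  # chunk header: start collecting code
    } else if (in_chunk && grepl("^@", line)) {
      in_chunk <- FALSE                 # chunk terminator
    } else if (in_chunk) {
      out <- c(out, line)               # ordinary chunk code
    } else {
      # Inline expressions on a documentation line (no nested braces).
      m <- regmatches(line, gregexpr("\\\\Sexpr[{][^}]*[}]", line))[[1]]
      out <- c(out, sub("^\\\\Sexpr[{](.*)[}]$", "\\1", m))
    }
  }
  out
}

rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "Assign 1 to x: \\Sexpr{x <- 1}",
  "<<>>=",
  "x + 1",
  "@",
  "\\end{document}"
)

code <- tangle_with_inline(rnw)
# Evaluate the extracted code in isolation; the last value is x + 1.
result <- eval(parse(text = code), envir = new.env(parent = baseenv()))
```

For the example document from the original post, code contains the inline assignment followed by the chunk, so the extracted script evaluates without error.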
On 05/31/2014 03:52 PM, Yihui Xie wrote:
[...]
It is very useful, pedagogically and when reproducing analyses, to be able to
source() the tangled .R code into an R session, analogous to running example
code with example(). The documentation for ?Stangle does read
(Code inside '\Sexpr{}' statements is ignored by 'Stangle'.)
So my 'vote' (recognizing that I don't have one of those) is to incorporate
\Sexpr{} expressions into the tangled code, or to continue to flag use of Sexpr
with side effects as errors (indirectly, by source()ing the tangled code),
rather than writing engines that ignore tangle.
It is very valuable to all parties to write a vignette with code that is fully
evaluated; otherwise, it is too easy for bit rot to seep in, or to 'fake' it in
a way that seems innocent but is misleading.
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
I mentioned in my original post that Sweave()/knit()/... can be considered as the "new" source(). They can do the same thing as source() does. I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code.
If there is a reason why source()ing an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...; however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.
So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Sat, May 31, 2014 at 6:20 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
[...]
Yes, that is a matter of familiarity as I mentioned, isn't it? I understand this justification. I can argue that it is also convenient to give people an Rnw/Rmd document and they can easily run the R code chunks as well (e.g. in RStudio, chunk navigation and evaluation are pretty simple) _within_ the context of your teaching materials. However, I think this is drifting away from the original topic, so I'll stop my comments on the direction of teaching.
The original question was, what do we lose if we disable tangle for R package vignettes? Please also note I mean this is _optional_, i.e. package authors can _choose_ whether they want to disable tangle.
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Sat, May 31, 2014 at 9:11 PM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
The Bioconductor project has a substantial amount of teaching material in the form of Sweave files. For teaching, it can be extremely convenient to give people an R script which they can copy and paste from (or do something else with). This is especially true for inexperienced R users. Best, Kasper
1. The starting point of this discussion is package vignettes, not R scripts. I'm not saying we should abandon R scripts, or that everyone should write R code to generate reports. Starting from a package vignette, you can evaluate it using a weave function, or evaluate its derivative, namely an R script. I was saying the former might not be a bad idea, although the latter sounds more familiar to most R users. For a package vignette, within the context of R CMD check, is it necessary to do tangle + evaluate _besides_ weave?
2. If you are comfortable with reading pure code without narratives, I'm totally fine with that. I guess there is nothing to argue on this point, since it is pretty much personal taste.
3. Yes, you are absolutely correct -- Sweave()/knit() does more than source(), but let me repeat the issue to be discussed: what harm does it bring if we disable tangle for R package vignettes?
Sorry if I did not make it clear enough: my priority in this discussion is the necessity of tangle for package vignettes. After we finish this issue, I'll be happy to extend the discussion towards tangle in general.
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
On Sat, May 31, 2014 at 6:54 PM, Yihui Xie <xie at yihui.name> wrote:
I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
It's because .Rnw and Rmd require more from the user than .R. Also, this started with vignettes but you seem to be talking more generally. If so, I would point out that not all R code is intended to generate reports, and writing pure R code that isn't going to generate a report in an .Rnw/.Rmd file would be very strange to say the least.
however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.
That depends a lot on what you want. Reading a woven article/report that includes code and reading code are different and equally valid activities. Sometimes I really just want to know what the author actually told the computer to do.
So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.
We lose *isolated* code evaluation. Sweave/knit have a lot more moving pieces than source/eval do. Many of which are for the purpose of displaying output, rather than running code.