Interestingly, the <<- operator is also a lot faster than using a
namespace explicitly, and only slightly slower than using <- with local
variables, see below. But, surely, both must at some point insert values in
a given environment ? either the local one, for <-, or an enclosing one,
for <<- ? so I guess I am asking if there is a more low-level assignment
operation I can get my hands on without diving into C?
factorial <- function(n, acc = 1) {
if (n == 1) acc
else factorial(n - 1, n * acc)
}
factorial_tr_manual <- function (n, acc = 1)
{
repeat {
if (n <= 1)
return(acc)
else {
.tailr_n <- n - 1
.tailr_acc <- acc * n
n <- .tailr_n
acc <- .tailr_acc
next
}
}
}
factorial_tr_automatic_1 <- function(n, acc = 1) {
.tailr_n <- n
.tailr_acc <- acc
callCC(function(escape) {
repeat {
n <- .tailr_n
acc <- .tailr_acc
if (n <= 1) {
escape(acc)
} else {
.tailr_n <<- n - 1
.tailr_acc <<- n * acc
}
}
})
}
factorial_tr_automatic_2 <- function(n, acc = 1) {
.tailr_env <- rlang::get_env()
callCC(function(escape) {
repeat {
if (n <= 1) {
escape(acc)
} else {
.tailr_env$.tailr_n <- n - 1
.tailr_env$.tailr_acc <- n * acc
.tailr_env$n <- .tailr_env$.tailr_n
.tailr_env$acc <- .tailr_env$.tailr_acc
}
}
})
}
microbenchmark::microbenchmark(factorial(1000),
factorial_tr_manual(1000),
factorial_tr_automatic_1(1000),
factorial_tr_automatic_2(1000))
Unit: microseconds
expr min lq mean median
uq max neval
factorial(1000) 884.137 942.060 1076.3949 977.6235
1042.5035 2889.779 100
factorial_tr_manual(1000) 110.215 116.919 130.2337 118.7350
122.7495 255.062 100
factorial_tr_automatic_1(1000) 179.897 183.437 212.8879 187.8250
195.7670 979.352 100
factorial_tr_automatic_2(1000) 508.353 534.328 601.9643 560.7830
587.8350 1424.260 100
Cheers
On 26 Feb 2018, 21.12 +0100, Thomas Mailund <thomas.mailund at gmail.com>,
wrote:
Following up on this attempt of implementing the tail-recursion
optimisation ? now that I?ve finally had the chance to look at it again ? I
find that non-local return implemented with callCC doesn?t actually incur
much overhead once I do it more sensibly. I haven?t found a good way to
handle parallel assignments that isn?t vastly slower than simply
introducing extra variables, so I am going with that solution. However, I
have now run into another problem involving those local variables ? and
assigning to local variables in general.
Consider again the factorial function and three different ways of
implementing it using the tail recursion optimisation:
factorial <- function(n, acc = 1) {
if (n == 1) acc
else factorial(n - 1, n * acc)
}
factorial_tr_manual <- function (n, acc = 1)
{
repeat {
if (n <= 1)
return(acc)
else {
.tailr_n <- n - 1
.tailr_acc <- acc * n
n <- .tailr_n
acc <- .tailr_acc
next
}
}
}
factorial_tr_automatic_1 <- function(n, acc = 1) {
callCC(function(escape) {
repeat {
if (n <= 1) {
escape(acc)
} else {
.tailr_n <- n - 1
.tailr_acc <- n * acc
n <- .tailr_n
acc <- .tailr_acc
}
}
})
}
factorial_tr_automatic_2 <- function(n, acc = 1) {
.tailr_env <- rlang::get_env()
callCC(function(escape) {
repeat {
if (n <= 1) {
escape(acc)
} else {
.tailr_env$.tailr_n <- n - 1
.tailr_env$.tailr_acc <- n * acc
.tailr_env$n <- .tailr_env$.tailr_n
.tailr_env$acc <- .tailr_env$.tailr_acc
}
}
})
}
The factorial_tr_manual function is how I would implement the function
manually while factorial_tr_automatic_1 is what my package used to come up
with. It handles non-local returns, because this is something I need in
general. Finally, factorial_tr_automatic_2 accesses the local variables
explicitly through the environment, which is what my package currently
produces.
The difference between supporting non-local returns and not is tiny, but
explicitly accessing variables through their environment costs me about a
factor of five ? something that surprised me.
microbenchmark::microbenchmark(factorial(1000),
+ factorial_tr_manual(1000),
+ factorial_tr_automatic_1(1000),
+ factorial_tr_automatic_2(1000))
Unit: microseconds
expr min lq mean median
factorial(1000) 756.357 810.4135 963.1040 856.3315
factorial_tr_manual(1000) 104.838 119.7595 198.7347 129.0870
factorial_tr_automatic_1(1000) 112.354 125.5145 211.6148 135.5255
factorial_tr_automatic_2(1000) 461.015 544.7035 688.5988 565.3240
uq max neval
945.3110 4149.099 100
136.8200 4190.331 100
152.9625 5944.312 100
600.5235 7798.622 100
The simple solution, of course, is to not do that, but then I can?t
handle expressions inside calls to ?with?. And I would really like to,
because then I can combine tail recursion with pattern matching.
I can define linked lists and a length function on them like this:
library(pmatch)
llist := NIL | CONS(car, cdr : llist)
llength <- function(llist, acc = 0) {
cases(llist,
NIL -> acc,
CONS(car, cdr) -> llength(cdr, acc + 1))
}
The tail-recursion I get out of transforming this function looks like
llength_tr <- function (llist, acc = 0) {
.tailr_env <- rlang::get_env()
callCC(function(escape) {
repeat {
if (!rlang::is_null(..match_env <- test_pattern(llist,
NIL)))
with(..match_env, escape(acc))
else if (!rlang::is_null(..match_env <-
test_pattern(llist, CONS(car,
with(..match_env, {
.tailr_env$.tailr_llist <- cdr
.tailr_env$.tailr_acc <- acc + 1
.tailr_env$llist <- .tailr_env$.tailr_llist
.tailr_env$acc <- .tailr_env$.tailr_acc
})
}
})
}
Maybe not the prettiest code, but you are not supposed to actually see
There is not much gain in speed
Unit: milliseconds
expr min lq mean median uq
llength(test_llist) 70.74605 76.08734 87.78418 85.81193 94.66378
llength_tr(test_llist) 45.16946 51.56856 59.09306 57.00101 63.07044
max neval
182.4894 100
166.6990 100
but you don?t run out of stack space
llength(make_llist(1000))
Error: evaluation nested too deeply: infinite recursion /
Error during wrapup: C stack usage 7990648 is too close to the limit
llength_tr(make_llist(1000))
[1] 1000
I should be able to make the function go faster if I had a faster way of
handling the variable assignments, but inside ?with?, I?m not sure how to
do that?
Any suggestions?
Cheers
On 11 Feb 2018, 16.48 +0100, Thomas Mailund <thomas.mailund at gmail.com>,
Hi guys,
I am working on some code for automatically translating recursive
As a toy-example, consider the factorial function
factorial <- function(n, acc = 1) {
if (n <= 1) acc
else factorial(n - 1, acc * n)
}
I can automatically translate this into the loop-version
factorial_tr_1 <- function (n, acc = 1)
{
repeat {
if (n <= 1)
return(acc)
else {
.tailr_n <- n - 1
.tailr_acc <- acc * acc
n <- .tailr_n
acc <- .tailr_acc
next
}
}
}
which will run faster and not have problems with recursion depths.
However, I?m not entirely happy with this version for two reasons: I am not
happy with introducing the temporary variables and this rewrite will not
work if I try to over-scope an evaluation context.
I have two related questions, one related to parallel assignments ?
i.e. expressions to variables so the expression uses the old variable
values and not the new values until the assignments are all done ? and one
related to restarting a loop from nested loops or from nested expressions
in `with` expressions or similar.
I can implement parallel assignment using something like
factorial_tr_2 <- function (n, acc = 1)
{
.tailr_env <- rlang::get_env()
repeat {
if (n <= 1)
return(acc)
else {
rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
next
}
}
}
This reduces the number of additional variables I need to one, but is
a couple of orders of magnitude slower than the first version.
microbenchmark::microbenchmark(factorial(100),
+ factorial_tr_1(100),
+ factorial_tr_2(100))
Unit: microseconds
expr min lq mean median uq max neval
factorial(100) 53.978 60.543 77.76203 71.0635 85.947 180.251 100
factorial_tr_1(100) 9.022 9.903 11.52563 11.0430 11.984 28.464 100
factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830 6756.463
Is there another way to do parallel assignments that doesn?t cost this
My other problem is the use of `next`. I would like to combine
devtools::install_github("mailund/pmatch?)
library(pmatch)
llist := NIL | CONS(car, cdr : llist)
and define a function for computing the length of a list like this:
list_length <- function(lst, acc = 0) {
force(acc)
cases(lst,
NIL -> acc,
CONS(car, cdr) -> list_length(cdr, acc + 1))
}
The `cases` function creates an environment that binds variables in a
pattern-description that over-scopes the expression to the right of `->`,
so the recursive call in this example have access to the variables `cdr`
and `car`.
I can transform a `cases` call to one that creates the environment
containing the bound variables and then evaluate this using `eval` or
`with`, but in either case, a call to `next` will not work in such a
context. The expression will be evaluated inside `bind` or `with`, and not
in the `list_lenght` function.
A version that *will* work, is something like this
factorial_tr_3 <- function (n, acc = 1)
{
.tailr_env <- rlang::get_env()
.tailr_frame <- rlang::current_frame()
repeat {
if (n <= 1)
rlang::return_from(.tailr_frame, acc)
else {
rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
rlang::return_to(.tailr_frame)
}
}
}
Here, again, for the factorial function since this is easier to follow
than the list-length function.
This solution will also work if you return values from inside loops,
where `next` wouldn?t work either.
Using `rlang::return_from` and `rlang::return_to` implements the right
semantics, but costs me another order of magnitude in running time.
microbenchmark::microbenchmark(factorial(100),
factorial_tr_1(100),
factorial_tr_2(100),
factorial_tr_3(100))
Unit: microseconds
expr min lq mean median uq max neval
factorial(100) 52.479 60.2640 93.43069 67.5130 83.925 2062.481 100
factorial_tr_1(100) 8.875 9.6525 49.19595 10.6945 11.217 3818.823 100
factorial_tr_2(100) 5296.350 5525.0745 5973.77664 5737.8730 6260.128
factorial_tr_3(100) 77554.457 80757.0905 87307.28737 84004.0725
I can live with the ?introducing extra variables? solution to parallel
assignment, and I could hack my way out of using `with` or `bind` in
rewriting `cases`, but restarting a `repeat` loop would really make for a
nicer solution. I know that `goto` is considered harmful, but really, in
this case, it is what I want.
A `callCC` version also solves the problem
factorial_tr_4 <- function(n, acc = 1) {
function_body <- function(continuation) {
if (n <= 1) {
continuation(acc)
} else {
continuation(list("continue", n = n - 1, acc = acc * n))
}
}
repeat {
result <- callCC(function_body)
if (is.list(result) && result[[1]] == "continue") {
n <- result$n
acc <- result$acc
next
} else {
return(result)
}
}
}
But this requires that I know how to distinguish between a valid
return value and a tag for ?next? and is still a lot slower than the `next`
solution
microbenchmark::microbenchmark(factorial(100),
factorial_tr_1(100),
factorial_tr_2(100),
factorial_tr_3(100),
factorial_tr_4(100))
Unit: microseconds
expr min lq mean median uq max neval
factorial(100) 54.109 61.8095 81.33167 81.8785 89.748 243.554 100
factorial_tr_1(100) 9.025 9.9035 11.38607 11.1990 12.008 22.375 100
factorial_tr_2(100) 5272.524 5798.3965 6302.40467 6077.7180 6492.959
factorial_tr_3(100) 66186.080 72336.2810 76480.75172 73632.9665
factorial_tr_4(100) 270.978 302.7890 337.48763 313.9930 334.096
I don?t necessarily need the tail-recursion optimisation to be faster
than the recursive version; just getting out of the problem of too deep
recursions is a benefit, but I would rather not pay with an order of
magnitude for it. I could, of course, try to handle cases that works with
`next` in one way, and other cases using `callCC`, but I feel it should be
possible with a version that handles all cases the same way.
Is there any way to achieve this?
Cheers
Thomas
[[alternative HTML version deleted]]