Hi guys,

I am working on some code for automatically translating recursive functions into looping functions to implement tail-recursion optimisations. See https://github.com/mailund/tailr

As a toy example, consider the factorial function

factorial <- function(n, acc = 1) {
    if (n <= 1) acc
    else factorial(n - 1, acc * n)
}

I can automatically translate this into the loop version

factorial_tr_1 <- function(n, acc = 1) {
    repeat {
        if (n <= 1)
            return(acc)
        else {
            .tailr_n <- n - 1
            .tailr_acc <- acc * n
            n <- .tailr_n
            acc <- .tailr_acc
            next
        }
    }
}

which will run faster and not have problems with recursion depth. However, I'm not entirely happy with this version for two reasons: I am not happy with introducing the temporary variables, and this rewrite will not work if I try to over-scope an evaluation context.

I have two related questions. One concerns parallel assignment, i.e. assigning expressions to variables such that every expression is evaluated with the old variable values, and the new values only take effect once all the assignments are done. The other concerns restarting a loop from nested loops, or from nested expressions inside `with` expressions or similar.

I can implement parallel assignment using something like rlang::env_bind:

factorial_tr_2 <- function(n, acc = 1) {
    .tailr_env <- rlang::get_env()
    repeat {
        if (n <= 1)
            return(acc)
        else {
            rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
            next
        }
    }
}

This reduces the number of additional variables I need to one, but is a couple of orders of magnitude slower than the first version.

> microbenchmark::microbenchmark(factorial(100),
+                                factorial_tr_1(100),
+                                factorial_tr_2(100))
Unit: microseconds
                expr      min       lq       mean    median       uq      max neval
      factorial(100)   53.978   60.543   77.76203   71.0635   85.947  180.251   100
 factorial_tr_1(100)    9.022    9.903   11.52563   11.0430   11.984   28.464   100
 factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830 6756.463 8177.635   100

Is there another way to do parallel assignments that doesn't cost this much in running time?

My other problem is the use of `next`. I would like to combine tail-recursion optimisation with pattern matching as in https://github.com/mailund/pmatch where I can, for example, define a linked list like this:

devtools::install_github("mailund/pmatch")
library(pmatch)
llist := NIL | CONS(car, cdr : llist)

and define a function for computing the length of a list like this:

list_length <- function(lst, acc = 0) {
    force(acc)
    cases(lst,
          NIL -> acc,
          CONS(car, cdr) -> list_length(cdr, acc + 1))
}

The `cases` function creates an environment that binds the variables in a pattern description and over-scopes the expression to the right of `->`, so the recursive call in this example has access to the variables `cdr` and `car`.

I can transform a `cases` call into one that creates the environment containing the bound variables and then evaluates the expression using `eval` or `with`, but in either case a call to `next` will not work in such a context. The expression will be evaluated inside `bind` or `with`, and not in the `list_length` function.

A version that *will* work is something like this

factorial_tr_3 <- function(n, acc = 1) {
    .tailr_env <- rlang::get_env()
    .tailr_frame <- rlang::current_frame()
    repeat {
        if (n <= 1)
            rlang::return_from(.tailr_frame, acc)
        else {
            rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
            rlang::return_to(.tailr_frame)
        }
    }
}

shown here again for the factorial function, since it is easier to follow than the list-length function.

This solution will also work if you return values from inside loops, where `next` wouldn't work either.
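For comparison, here is a minimal sketch of the same restart-the-loop idea built directly on base R restarts instead of rlang; the restart name ".tailr_continue" is made up for illustration, and I have not benchmarked this:

factorial_tr_restarts <- function(n, acc = 1) {
    repeat {
        withRestarts(
            {
                if (n <= 1)
                    return(acc)  # returns from factorial_tr_restarts itself
                # this call could sit arbitrarily deep inside with()/eval()
                invokeRestart(".tailr_continue", n - 1, acc * n)
            },
            .tailr_continue = function(new_n, new_acc) {
                # the handler's enclosure is the function frame, so <<- updates
                # n and acc there; control then falls back to the repeat loop
                n <<- new_n
                acc <<- new_acc
            }
        )
    }
}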
Using `rlang::return_from` and `rlang::return_to` implements the right semantics, but costs me another order of magnitude in running time.

microbenchmark::microbenchmark(factorial(100),
                               factorial_tr_1(100),
                               factorial_tr_2(100),
                               factorial_tr_3(100))
Unit: microseconds
                expr       min         lq        mean     median        uq        max neval
      factorial(100)    52.479    60.2640    93.43069    67.5130    83.925   2062.481   100
 factorial_tr_1(100)     8.875     9.6525    49.19595    10.6945    11.217   3818.823   100
 factorial_tr_2(100)  5296.350  5525.0745  5973.77664  5737.8730  6260.128   8471.301   100
 factorial_tr_3(100) 77554.457 80757.0905 87307.28737 84004.0725 89859.169 171039.228   100

I can live with the "introducing extra variables" solution to parallel assignment, and I could hack my way out of using `with` or `bind` in rewriting `cases`, but restarting a `repeat` loop would really make for a nicer solution. I know that `goto` is considered harmful, but really, in this case, it is what I want.

A `callCC` version also solves the problem

factorial_tr_4 <- function(n, acc = 1) {
    function_body <- function(continuation) {
        if (n <= 1) {
            continuation(acc)
        } else {
            continuation(list("continue", n = n - 1, acc = acc * n))
        }
    }
    repeat {
        result <- callCC(function_body)
        if (is.list(result) && result[[1]] == "continue") {
            n <- result$n
            acc <- result$acc
            next
        } else {
            return(result)
        }
    }
}

But this requires that I know how to distinguish between a valid return value and a tag for "next", and it is still a lot slower than the `next` solution

microbenchmark::microbenchmark(factorial(100),
                               factorial_tr_1(100),
                               factorial_tr_2(100),
                               factorial_tr_3(100),
                               factorial_tr_4(100))
Unit: microseconds
                expr       min         lq        mean     median        uq        max neval
      factorial(100)    54.109    61.8095    81.33167    81.8785    89.748    243.554   100
 factorial_tr_1(100)     9.025     9.9035    11.38607    11.1990    12.008     22.375   100
 factorial_tr_2(100)  5272.524  5798.3965  6302.40467  6077.7180  6492.959   9967.237   100
 factorial_tr_3(100) 66186.080 72336.2810 76480.75172 73632.9665 75405.054 203785.673   100
 factorial_tr_4(100)   270.978   302.7890   337.48763   313.9930   334.096   1425.702   100

I don't necessarily need the tail-recursion optimisation to be faster than the recursive version; just getting out of the problem of too-deep recursion is a benefit, but I would rather not pay an order of magnitude for it. I could, of course, try to handle cases that work with `next` in one way and other cases using `callCC`, but I feel it should be possible to find a version that handles all cases the same way.

Is there any way to achieve this?

Cheers
Thomas
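One way to sidestep the tagging problem in the `callCC` version is sketched below: use a private sentinel object as the "continue" marker, here an environment, since environments compare by identity and therefore cannot collide with any legitimate return value. The name `.tailr_continue_tag` is invented for illustration and this is untested against the rest of tailr:

.tailr_continue_tag <- new.env()

factorial_tr_5 <- function(n, acc = 1) {
    repeat {
        result <- callCC(function(escape) {
            if (n <= 1) {
                escape(acc)
            } else {
                # stash the new values on the tag itself and signal "continue"
                .tailr_continue_tag$n <- n - 1
                .tailr_continue_tag$acc <- acc * n
                escape(.tailr_continue_tag)
            }
        })
        if (identical(result, .tailr_continue_tag)) {
            n <- .tailr_continue_tag$n
            acc <- .tailr_continue_tag$acc
        } else {
            return(result)
        }
    }
}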
> On Feb 11, 2018, at 7:48 AM, Thomas Mailund <thomas.mailund at gmail.com> wrote:
>
> I am working on some code for automatically translating recursive functions into looping functions to implement tail-recursion optimisations. See https://github.com/mailund/tailr
> [...]
> Is there any way to achieve this?

I didn't see any reference to the R `Recall` or `local` functions. I don't remember that tail optimization is something that R provides, however.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law
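For reference, `Recall` simply re-invokes the currently executing function, so a version using it is still ordinary recursion and still grows the call stack; a minimal illustration (not taken from tailr):

factorial_recall <- function(n, acc = 1) {
    if (n <= 1) acc
    else Recall(n - 1, acc * n)  # calls the current function again; no loop rewriting happens
}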
I admit I didn't know about Recall, but you are right, there is no direct support for this tail-recursion optimisation. For good reasons: it would break a lot of NSE. I am not attempting to solve tail-recursion optimisation for all cases; that wouldn't work by just rewriting functions. It might be doable with JIT or something like that, but my goal is less ambitious. Using local, though, might be an approach. I will play around with that tomorrow.

Cheers

On 11 Feb 2018, 18.19 +0100, David Winsemius <dwinsemius at comcast.net>, wrote:
> I didn't see any reference to the R `Recall` or `local` functions. I don't remember that tail optimization is something that R provides, however.
> [...]
Following up on this attempt at implementing the tail-recursion optimisation (now that I've finally had the chance to look at it again): I find that non-local return implemented with callCC doesn't actually incur much overhead once I do it more sensibly. I haven't found a good way to handle parallel assignments that isn't vastly slower than simply introducing extra variables, so I am going with that solution. However, I have now run into another problem involving those local variables, and assigning to local variables in general.

Consider again the factorial function and three different ways of implementing it using the tail-recursion optimisation:

factorial <- function(n, acc = 1) {
    if (n == 1) acc
    else factorial(n - 1, n * acc)
}

factorial_tr_manual <- function(n, acc = 1) {
    repeat {
        if (n <= 1)
            return(acc)
        else {
            .tailr_n <- n - 1
            .tailr_acc <- acc * n
            n <- .tailr_n
            acc <- .tailr_acc
            next
        }
    }
}

factorial_tr_automatic_1 <- function(n, acc = 1) {
    callCC(function(escape) {
        repeat {
            if (n <= 1) {
                escape(acc)
            } else {
                .tailr_n <- n - 1
                .tailr_acc <- n * acc
                n <- .tailr_n
                acc <- .tailr_acc
            }
        }
    })
}

factorial_tr_automatic_2 <- function(n, acc = 1) {
    .tailr_env <- rlang::get_env()
    callCC(function(escape) {
        repeat {
            if (n <= 1) {
                escape(acc)
            } else {
                .tailr_env$.tailr_n <- n - 1
                .tailr_env$.tailr_acc <- n * acc
                .tailr_env$n <- .tailr_env$.tailr_n
                .tailr_env$acc <- .tailr_env$.tailr_acc
            }
        }
    })
}

The factorial_tr_manual function is how I would implement the function manually, while factorial_tr_automatic_1 is what my package used to come up with. It handles non-local returns, because this is something I need in general. Finally, factorial_tr_automatic_2 accesses the local variables explicitly through the environment, which is what my package currently produces.

The difference between supporting non-local returns and not is tiny, but explicitly accessing variables through their environment costs me about a factor of five, something that surprised me.

> microbenchmark::microbenchmark(factorial(1000),
+                                factorial_tr_manual(1000),
+                                factorial_tr_automatic_1(1000),
+                                factorial_tr_automatic_2(1000))
Unit: microseconds
                           expr     min       lq     mean   median       uq      max neval
                factorial(1000) 756.357 810.4135 963.1040 856.3315 945.3110 4149.099   100
      factorial_tr_manual(1000) 104.838 119.7595 198.7347 129.0870 136.8200 4190.331   100
 factorial_tr_automatic_1(1000) 112.354 125.5145 211.6148 135.5255 152.9625 5944.312   100
 factorial_tr_automatic_2(1000) 461.015 544.7035 688.5988 565.3240 600.5235 7798.622   100

The simple solution, of course, is to not do that, but then I can't handle expressions inside calls to `with`. And I would really like to, because then I can combine tail recursion with pattern matching.

I can define linked lists and a length function on them like this:

library(pmatch)
llist := NIL | CONS(car, cdr : llist)

llength <- function(llist, acc = 0) {
    cases(llist,
          NIL -> acc,
          CONS(car, cdr) -> llength(cdr, acc + 1))
}

The tail-recursive version I get out of transforming this function looks like this:

llength_tr <- function(llist, acc = 0) {
    .tailr_env <- rlang::get_env()
    callCC(function(escape) {
        repeat {
            if (!rlang::is_null(..match_env <- test_pattern(llist, NIL)))
                with(..match_env, escape(acc))
            else if (!rlang::is_null(..match_env <- test_pattern(llist, CONS(car, cdr))))
                with(..match_env, {
                    .tailr_env$.tailr_llist <- cdr
                    .tailr_env$.tailr_acc <- acc + 1
                    .tailr_env$llist <- .tailr_env$.tailr_llist
                    .tailr_env$acc <- .tailr_env$.tailr_acc
                })
        }
    })
}

Maybe not the prettiest code, but you are not supposed to actually see it, of course.

There is not much gain in speed

Unit: milliseconds
                   expr      min       lq     mean   median       uq      max neval
    llength(test_llist) 70.74605 76.08734 87.78418 85.81193 94.66378 182.4894   100
 llength_tr(test_llist) 45.16946 51.56856 59.09306 57.00101 63.07044 166.6990   100

but you don't run out of stack space

> llength(make_llist(1000))
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: C stack usage 7990648 is too close to the limit
> llength_tr(make_llist(1000))
[1] 1000

I should be able to make the function go faster if I had a faster way of handling the variable assignments, but inside `with` I'm not sure how to do that.

Any suggestions?

Cheers

On 11 Feb 2018, 16.48 +0100, Thomas Mailund <thomas.mailund at gmail.com>, wrote:
> I am working on some code for automatically translating recursive functions into looping functions to implement tail-recursion optimisations. See https://github.com/mailund/tailr
> [...]
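A minimal sketch of one way the assignment-inside-`with` problem could be attacked, assuming `test_pattern` returns an ordinary environment whose parent can safely be replaced (the function name and the `.here` variable are invented for illustration): re-parent the match environment onto a frame that sees both `escape` and the function arguments, and then let `<<-` update the arguments directly instead of going through `.tailr_env$`.

llength_tr_reparented <- function(llist, acc = 0) {
    callCC(function(escape) {
        .here <- environment()  # sees escape here and llist/acc one level up
        repeat {
            if (!is.null(..match_env <- test_pattern(llist, NIL))) {
                parent.env(..match_env) <- .here
                eval(quote(escape(acc)), ..match_env)
            } else if (!is.null(..match_env <- test_pattern(llist, CONS(car, cdr)))) {
                parent.env(..match_env) <- .here
                # <<- skips the match environment and .here, so it updates
                # llist and acc in the function frame; cdr comes from the match
                # environment and is read before llist is overwritten
                eval(quote({
                    acc <<- acc + 1
                    llist <<- cdr
                }), ..match_env)
            }
        }
    })
}

Whether re-parenting pmatch's match environments is acceptable in general is an open question; it only works if nothing else relies on their original enclosure.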
Interestingly, the <<- operator is also a lot faster than assigning through the environment explicitly, and only slightly slower than using <- with local variables, see below. But surely both must at some point insert values into a given environment (either the local one, for <-, or an enclosing one, for <<-), so I guess I am asking if there is a more low-level assignment operation I can get my hands on without diving into C?

factorial <- function(n, acc = 1) {
    if (n == 1) acc
    else factorial(n - 1, n * acc)
}

factorial_tr_manual <- function(n, acc = 1) {
    repeat {
        if (n <= 1)
            return(acc)
        else {
            .tailr_n <- n - 1
            .tailr_acc <- acc * n
            n <- .tailr_n
            acc <- .tailr_acc
            next
        }
    }
}

factorial_tr_automatic_1 <- function(n, acc = 1) {
    .tailr_n <- n
    .tailr_acc <- acc
    callCC(function(escape) {
        repeat {
            n <- .tailr_n
            acc <- .tailr_acc
            if (n <= 1) {
                escape(acc)
            } else {
                .tailr_n <<- n - 1
                .tailr_acc <<- n * acc
            }
        }
    })
}

factorial_tr_automatic_2 <- function(n, acc = 1) {
    .tailr_env <- rlang::get_env()
    callCC(function(escape) {
        repeat {
            if (n <= 1) {
                escape(acc)
            } else {
                .tailr_env$.tailr_n <- n - 1
                .tailr_env$.tailr_acc <- n * acc
                .tailr_env$n <- .tailr_env$.tailr_n
                .tailr_env$acc <- .tailr_env$.tailr_acc
            }
        }
    })
}

microbenchmark::microbenchmark(factorial(1000),
                               factorial_tr_manual(1000),
                               factorial_tr_automatic_1(1000),
                               factorial_tr_automatic_2(1000))
Unit: microseconds
                           expr     min      lq      mean   median        uq      max neval
                factorial(1000) 884.137 942.060 1076.3949 977.6235 1042.5035 2889.779   100
      factorial_tr_manual(1000) 110.215 116.919  130.2337 118.7350  122.7495  255.062   100
 factorial_tr_automatic_1(1000) 179.897 183.437  212.8879 187.8250  195.7670  979.352   100
 factorial_tr_automatic_2(1000) 508.353 534.328  601.9643 560.7830  587.8350 1424.260   100

Cheers

On 26 Feb 2018, 21.12 +0100, Thomas Mailund <thomas.mailund at gmail.com>, wrote:
> Following up on this attempt at implementing the tail-recursion optimisation: I find that non-local return implemented with callCC doesn't actually incur much overhead once I do it more sensibly.
> [...]
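To see whether any explicit-environment assignment form is cheaper than another, a small benchmark skeleton along these lines could be used (no results implied; `e` and the function names are just throwaway test objects):

e <- new.env()

f_local  <- function() for (i in 1:1000) x <- i                     # plain local <-
f_super  <- function() { x <- 0; g <- function() for (i in 1:1000) x <<- i; g() }  # <<- into the enclosing frame
f_dollar <- function() for (i in 1:1000) e$x <- i                   # $<- on an environment
f_brack  <- function() for (i in 1:1000) e[["x"]] <- i              # [[<- on an environment
f_assign <- function() for (i in 1:1000) assign("x", i, envir = e)  # assign() with an explicit envir

microbenchmark::microbenchmark(f_local(), f_super(), f_dollar(), f_brack(), f_assign())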