thr3ads.net - R help - [R] Parallel assignments and goto [Feb 2018]

If this information is useful, please help other people find it:
Share via:

Thomas Mailund

2018-Feb-27 14:51 UTC

[R] Parallel assignments and goto

Interestingly, the <<- operator is also a lot faster than using a
namespace explicitly, and only slightly slower than using <- with local
variables, see below. But, surely, both must at some point insert values in a
given environment ? either the local one, for <-, or an enclosing one, for
<<- ? so I guess I am asking if there is a more low-level assignment
operation I can get my hands on without diving into C?


factorial <- function(n, acc = 1) {
? ? if (n == 1) acc
? ? else factorial(n - 1, n * acc)
}

factorial_tr_manual <- function (n, acc = 1)
{
? ? repeat {
? ? ? ? if (n <= 1)
? ? ? ? ? ? return(acc)
? ? ? ? else {
? ? ? ? ? ? .tailr_n <- n - 1
? ? ? ? ? ? .tailr_acc <- acc * n
? ? ? ? ? ? n <- .tailr_n
? ? ? ? ? ? acc <- .tailr_acc
? ? ? ? ? ? next
? ? ? ? }
? ? }
}

factorial_tr_automatic_1 <- function(n, acc = 1) {
? ? .tailr_n <- n
? ? .tailr_acc <- acc
? ? callCC(function(escape) {
? ? ? ? repeat {
? ? ? ? ? ? n <- .tailr_n
? ? ? ? ? ? acc <- .tailr_acc
? ? ? ? ? ? if (n <= 1) {
? ? ? ? ? ? ? ? escape(acc)
? ? ? ? ? ? } else {
? ? ? ? ? ? ? ? .tailr_n <<- n - 1
? ? ? ? ? ? ? ? .tailr_acc <<- n * acc
? ? ? ? ? ? }
? ? ? ? }
? ? })
}

factorial_tr_automatic_2 <- function(n, acc = 1) {
? ? .tailr_env <- rlang::get_env()
? ? callCC(function(escape) {
? ? ? ? repeat {
? ? ? ? ? ? if (n <= 1) {
? ? ? ? ? ? ? ? escape(acc)
? ? ? ? ? ? } else {
? ? ? ? ? ? ? ? .tailr_env$.tailr_n <- n - 1
? ? ? ? ? ? ? ? .tailr_env$.tailr_acc <- n * acc
? ? ? ? ? ? ? ? .tailr_env$n <- .tailr_env$.tailr_n
? ? ? ? ? ? ? ? .tailr_env$acc <- .tailr_env$.tailr_acc
? ? ? ? ? ? }
? ? ? ? }
? ? })
}

microbenchmark::microbenchmark(factorial(1000),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_manual(1000),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_automatic_1(1000),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_automatic_2(1000))
Unit: microseconds
? ? ? ? ? ? ? ? ? ? ? ? ? ?expr ? ? min ? ? ?lq ? ? ?mean ? median ? ? ? ?uq ? ?
?max neval
? ? ? ? ? ? ? ? factorial(1000) 884.137 942.060 1076.3949 977.6235 1042.5035
2889.779 ? 100
? ? ? factorial_tr_manual(1000) 110.215 116.919 ?130.2337 118.7350 ?122.7495
?255.062 ? 100
?factorial_tr_automatic_1(1000) 179.897 183.437 ?212.8879 187.8250 ?195.7670
?979.352 ? 100
?factorial_tr_automatic_2(1000) 508.353 534.328 ?601.9643 560.7830 ?587.8350
1424.260 ? 100

Cheers

On 26 Feb 2018, 21.12 +0100, Thomas Mailund <thomas.mailund at gmail.com>,
wrote:> Following up on this attempt of implementing the tail-recursion
optimisation ? now that I?ve finally had the chance to look at it again ? I find
that non-local return implemented with callCC doesn?t actually incur much
overhead once I do it more sensibly. I haven?t found a good way to handle
parallel assignments that isn?t vastly slower than simply introducing extra
variables, so I am going with that solution. However, I have now run into
another problem involving those local variables ? and assigning to local
variables in general.
>
> Consider again the factorial function and three different ways of
implementing it using the tail recursion optimisation:
>
> factorial <- function(n, acc = 1) {
> ? ? if (n == 1) acc
> ? ? else factorial(n - 1, n * acc)
> }
>
> factorial_tr_manual <- function (n, acc = 1)
> {
> ? ? repeat {
> ? ? ? ? if (n <= 1)
> ? ? ? ? ? ? return(acc)
> ? ? ? ? else {
> ? ? ? ? ? ? .tailr_n <- n - 1
> ? ? ? ? ? ? .tailr_acc <- acc * n
> ? ? ? ? ? ? n <- .tailr_n
> ? ? ? ? ? ? acc <- .tailr_acc
> ? ? ? ? ? ? next
> ? ? ? ? }
> ? ? }
> }
>
> factorial_tr_automatic_1 <- function(n, acc = 1) {
> ? ? callCC(function(escape) {
> ? ? ? ? repeat {
> ? ? ? ? ? ? if (n <= 1) {
> ? ? ? ? ? ? ? ? escape(acc)
> ? ? ? ? ? ? } else {
> ? ? ? ? ? ? ? ? .tailr_n <- n - 1
> ? ? ? ? ? ? ? ? .tailr_acc <- n * acc
> ? ? ? ? ? ? ? ? n <- .tailr_n
> ? ? ? ? ? ? ? ? acc <- .tailr_acc
> ? ? ? ? ? ? }
> ? ? ? ? }
> ? ? })
> }
>
> factorial_tr_automatic_2 <- function(n, acc = 1) {
> ? ? .tailr_env <- rlang::get_env()
> ? ? callCC(function(escape) {
> ? ? ? ? repeat {
> ? ? ? ? ? ? if (n <= 1) {
> ? ? ? ? ? ? ? ? escape(acc)
> ? ? ? ? ? ? } else {
> ? ? ? ? ? ? ? ? .tailr_env$.tailr_n <- n - 1
> ? ? ? ? ? ? ? ? .tailr_env$.tailr_acc <- n * acc
> ? ? ? ? ? ? ? ? .tailr_env$n <- .tailr_env$.tailr_n
> ? ? ? ? ? ? ? ? .tailr_env$acc <- .tailr_env$.tailr_acc
> ? ? ? ? ? ? }
> ? ? ? ? }
> ? ? })
> }
>
> The?factorial_tr_manual function is how I would implement the function
manually while?factorial_tr_automatic_1 is what my package used to come up with.
It handles non-local returns, because this is something I need in general.
Finally,?factorial_tr_automatic_2 accesses the local variables explicitly
through the environment, which is what my package currently produces.
>
> The difference between supporting non-local returns and not is tiny, but
explicitly accessing variables through their environment costs me about a factor
of five ? something that surprised me.
>
> > microbenchmark::microbenchmark(factorial(1000),
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_manual(1000),
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_automatic_1(1000),
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_automatic_2(1000))
> Unit: microseconds
> ? ? ? ? ? ? ? ? ? ? ? ? ? ?expr ? ? min ? ? ? lq ? ? mean ? median
> ? ? ? ? ? ? ? ? factorial(1000) 756.357 810.4135 963.1040 856.3315
> ? ? ? factorial_tr_manual(1000) 104.838 119.7595 198.7347 129.0870
> ?factorial_tr_automatic_1(1000) 112.354 125.5145 211.6148 135.5255
> ?factorial_tr_automatic_2(1000) 461.015 544.7035 688.5988 565.3240
> ? ? ? ?uq ? ? ?max neval
> ?945.3110 4149.099 ? 100
> ?136.8200 4190.331 ? 100
> ?152.9625 5944.312 ? 100
> ?600.5235 7798.622 ? 100
>
> The simple solution, of course, is to not do that, but then I can?t handle
expressions inside calls to ?with?. And I would really like to, because then I
can combine tail recursion with pattern matching.
>
> I can define linked lists and a length function on them like this:
>
> library(pmatch)
> llist := NIL | CONS(car, cdr : llist)
>
> llength <- function(llist, acc = 0) {
> ? ? cases(llist,
> ? ? ? ? ? NIL -> acc,
> ? ? ? ? ? CONS(car, cdr) -> llength(cdr, acc + 1))
> }
>
> The tail-recursion I get out of transforming this function looks like this:
>
> llength_tr <- function (llist, acc = 0) {
> ? ? .tailr_env <- rlang::get_env()
> ? ? callCC(function(escape) {
> ? ? ? ? repeat {
> ? ? ? ? ? ? if (!rlang::is_null(..match_env <- test_pattern(llist,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? NIL)))
> ? ? ? ? ? ? ? ? with(..match_env, escape(acc))
>
> ? ? ? ? ? ? else if (!rlang::is_null(..match_env <-
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?test_pattern(llist, CONS(car, cdr))))
> ? ? ? ? ? ? ? ? with(..match_env, {
> ? ? ? ? ? ? ? ? ? ? .tailr_env$.tailr_llist <- cdr
> ? ? ? ? ? ? ? ? ? ? .tailr_env$.tailr_acc <- acc + 1
> ? ? ? ? ? ? ? ? ? ? .tailr_env$llist <- .tailr_env$.tailr_llist
> ? ? ? ? ? ? ? ? ? ? .tailr_env$acc <- .tailr_env$.tailr_acc
> ? ? ? ? ? ? ? ? })
> ? ? ? ? }
> ? ? })
> }
>
> Maybe not the prettiest code, but you are not supposed to actually see it,
of course.
>
> There is not much gain in speed
>
> Unit: milliseconds
> ? ? ? ? ? ? ? ? ? ?expr ? ? ?min ? ? ? lq ? ? mean ? median ? ? ? uq
> ? ? llength(test_llist) 70.74605 76.08734 87.78418 85.81193 94.66378
> ?llength_tr(test_llist) 45.16946 51.56856 59.09306 57.00101 63.07044
> ? ? ? max neval
> ?182.4894 ? 100
> ?166.6990 ? 100
>
> but you don?t run out of stack space
>
> > llength(make_llist(1000))
> Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?
> Error during wrapup: C stack usage ?7990648 is too close to the limit
> > llength_tr(make_llist(1000))
> [1] 1000
>
> I should be able to make the function go faster if I had a faster way of
handling the variable assignments, but inside ?with?, I?m not sure how to do
that?
>
> Any suggestions?
>
> Cheers
>
> On 11 Feb 2018, 16.48 +0100, Thomas Mailund <thomas.mailund at
gmail.com>, wrote:
> > Hi guys,
> >
> > I am working on some code for automatically translating recursive
functions into looping functions to implemented tail-recursion optimisations.
See https://github.com/mailund/tailr
> >
> > As a toy-example, consider the factorial function
> >
> > factorial <- function(n, acc = 1) {
> > if (n <= 1) acc
> > else factorial(n - 1, acc * n)
> > }
> >
> > I can automatically translate this into the loop-version
> >
> > factorial_tr_1 <- function (n, acc = 1)
> > {
> > repeat {
> > if (n <= 1)
> > return(acc)
> > else {
> > .tailr_n <- n - 1
> > .tailr_acc <- acc * acc
> > n <- .tailr_n
> > acc <- .tailr_acc
> > next
> > }
> > }
> > }
> >
> > which will run faster and not have problems with recursion depths.
However, I?m not entirely happy with this version for two reasons: I am not
happy with introducing the temporary variables and this rewrite will not work if
I try to over-scope an evaluation context.
> >
> > I have two related questions, one related to parallel assignments ?
i.e. expressions to variables so the expression uses the old variable values and
not the new values until the assignments are all done ? and one related to
restarting a loop from nested loops or from nested expressions in `with`
expressions or similar.
> >
> > I can implement parallel assignment using something like
rlang::env_bind:
> >
> > factorial_tr_2 <- function (n, acc = 1)
> > {
> > .tailr_env <- rlang::get_env()
> > repeat {
> > if (n <= 1)
> > return(acc)
> > else {
> > rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
> > next
> > }
> > }
> > }
> >
> > This reduces the number of additional variables I need to one, but is
a couple of orders of magnitude slower than the first version.
> >
> > > microbenchmark::microbenchmark(factorial(100),
> > + factorial_tr_1(100),
> > + factorial_tr_2(100))
> > Unit: microseconds
> > expr min lq mean median uq max neval
> > factorial(100) 53.978 60.543 77.76203 71.0635 85.947 180.251 100
> > factorial_tr_1(100) 9.022 9.903 11.52563 11.0430 11.984 28.464 100
> > factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830 6756.463
8177.635 100
> >
> >
> > Is there another way to do parallel assignments that doesn?t cost this
much in running time?
> >
> > My other problem is the use of `next`. I would like to combine
tail-recursion optimisation with pattern matching as in
https://github.com/mailund/pmatch where I can, for example, define a linked list
like this:
> >
> > devtools::install_github("mailund/pmatch?)
> > library(pmatch)
> > llist := NIL | CONS(car, cdr : llist)
> >
> > and define a function for computing the length of a list like this:
> >
> > list_length <- function(lst, acc = 0) {
> > force(acc)
> > cases(lst,
> > NIL -> acc,
> > CONS(car, cdr) -> list_length(cdr, acc + 1))
> > }
> >
> > The `cases` function creates an environment that binds variables in a
pattern-description that over-scopes the expression to the right of `->`, so
the recursive call in this example have access to the variables `cdr` and `car`.
> >
> > I can transform a `cases` call to one that creates the environment
containing the bound variables and then evaluate this using `eval` or `with`,
but in either case, a call to `next` will not work in such a context. The
expression will be evaluated inside `bind` or `with`, and not in the
`list_lenght` function.
> >
> > A version that *will* work, is something like this
> >
> > factorial_tr_3 <- function (n, acc = 1)
> > {
> > .tailr_env <- rlang::get_env()
> > .tailr_frame <- rlang::current_frame()
> > repeat {
> > if (n <= 1)
> > rlang::return_from(.tailr_frame, acc)
> > else {
> > rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
> > rlang::return_to(.tailr_frame)
> > }
> > }
> > }
> >
> > Here, again, for the factorial function since this is easier to follow
than the list-length function.
> >
> > This solution will also work if you return values from inside loops,
where `next` wouldn?t work either.
> >
> > Using `rlang::return_from` and `rlang::return_to` implements the right
semantics, but costs me another order of magnitude in running time.
> >
> > microbenchmark::microbenchmark(factorial(100),
> > factorial_tr_1(100),
> > factorial_tr_2(100),
> > factorial_tr_3(100))
> > Unit: microseconds
> > expr min lq mean median uq max neval
> > factorial(100) 52.479 60.2640 93.43069 67.5130 83.925 2062.481 100
> > factorial_tr_1(100) 8.875 9.6525 49.19595 10.6945 11.217 3818.823 100
> > factorial_tr_2(100) 5296.350 5525.0745 5973.77664 5737.8730 6260.128
8471.301 100
> > factorial_tr_3(100) 77554.457 80757.0905 87307.28737 84004.0725
89859.169 171039.228 100
> >
> > I can live with the ?introducing extra variables? solution to parallel
assignment, and I could hack my way out of using `with` or `bind` in rewriting
`cases`, but restarting a `repeat` loop would really make for a nicer solution.
I know that `goto` is considered harmful, but really, in this case, it is what I
want.
> >
> > A `callCC` version also solves the problem
> >
> > factorial_tr_4 <- function(n, acc = 1) {
> > function_body <- function(continuation) {
> > if (n <= 1) {
> > continuation(acc)
> > } else {
> > continuation(list("continue", n = n - 1, acc = acc * n))
> > }
> > }
> > repeat {
> > result <- callCC(function_body)
> > if (is.list(result) && result[[1]] == "continue") {
> > n <- result$n
> > acc <- result$acc
> > next
> > } else {
> > return(result)
> > }
> > }
> > }
> >
> > But this requires that I know how to distinguish between a valid
return value and a tag for ?next? and is still a lot slower than the `next`
solution
> >
> > microbenchmark::microbenchmark(factorial(100),
> > factorial_tr_1(100),
> > factorial_tr_2(100),
> > factorial_tr_3(100),
> > factorial_tr_4(100))
> > Unit: microseconds
> > expr min lq mean median uq max neval
> > factorial(100) 54.109 61.8095 81.33167 81.8785 89.748 243.554 100
> > factorial_tr_1(100) 9.025 9.9035 11.38607 11.1990 12.008 22.375 100
> > factorial_tr_2(100) 5272.524 5798.3965 6302.40467 6077.7180 6492.959
9967.237 100
> > factorial_tr_3(100) 66186.080 72336.2810 76480.75172 73632.9665
75405.054 203785.673 100
> > factorial_tr_4(100) 270.978 302.7890 337.48763 313.9930 334.096
1425.702 100
> >
> > I don?t necessarily need the tail-recursion optimisation to be faster
than the recursive version; just getting out of the problem of too deep
recursions is a benefit, but I would rather not pay with an order of magnitude
for it. I could, of course, try to handle cases that works with `next` in one
way, and other cases using `callCC`, but I feel it should be possible with a
version that handles all cases the same way.
> >
> > Is there any way to achieve this?
> >
> > Cheers
> > Thomas
> >
> >
> >
> >
> >
> >
> >
> >
	[[alternative HTML version deleted]]

Bert Gunter

2018-Feb-27 16:16 UTC

head link

[R] Parallel assignments and goto

No clue, but see ?assign perhaps if you have not done so already.

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Feb 27, 2018 at 6:51 AM, Thomas Mailund <thomas.mailund at
gmail.com>
wrote:
> Interestingly, the <<- operator is also a lot faster than using a
> namespace explicitly, and only slightly slower than using <- with local
> variables, see below. But, surely, both must at some point insert values in
> a given environment ? either the local one, for <-, or an enclosing one,
> for <<- ? so I guess I am asking if there is a more low-level
assignment
> operation I can get my hands on without diving into C?
>
>
> factorial <- function(n, acc = 1) {
>     if (n == 1) acc
>     else factorial(n - 1, n * acc)
> }
>
> factorial_tr_manual <- function (n, acc = 1)
> {
>     repeat {
>         if (n <= 1)
>             return(acc)
>         else {
>             .tailr_n <- n - 1
>             .tailr_acc <- acc * n
>             n <- .tailr_n
>             acc <- .tailr_acc
>             next
>         }
>     }
> }
>
> factorial_tr_automatic_1 <- function(n, acc = 1) {
>     .tailr_n <- n
>     .tailr_acc <- acc
>     callCC(function(escape) {
>         repeat {
>             n <- .tailr_n
>             acc <- .tailr_acc
>             if (n <= 1) {
>                 escape(acc)
>             } else {
>                 .tailr_n <<- n - 1
>                 .tailr_acc <<- n * acc
>             }
>         }
>     })
> }
>
> factorial_tr_automatic_2 <- function(n, acc = 1) {
>     .tailr_env <- rlang::get_env()
>     callCC(function(escape) {
>         repeat {
>             if (n <= 1) {
>                 escape(acc)
>             } else {
>                 .tailr_env$.tailr_n <- n - 1
>                 .tailr_env$.tailr_acc <- n * acc
>                 .tailr_env$n <- .tailr_env$.tailr_n
>                 .tailr_env$acc <- .tailr_env$.tailr_acc
>             }
>         }
>     })
> }
>
> microbenchmark::microbenchmark(factorial(1000),
>                                factorial_tr_manual(1000),
>                                factorial_tr_automatic_1(1000),
>                                factorial_tr_automatic_2(1000))
> Unit: microseconds
>                            expr     min      lq      mean   median
>  uq      max neval
>                 factorial(1000) 884.137 942.060 1076.3949 977.6235
> 1042.5035 2889.779   100
>       factorial_tr_manual(1000) 110.215 116.919  130.2337 118.7350
>  122.7495  255.062   100
>  factorial_tr_automatic_1(1000) 179.897 183.437  212.8879 187.8250
>  195.7670  979.352   100
>  factorial_tr_automatic_2(1000) 508.353 534.328  601.9643 560.7830
>  587.8350 1424.260   100
>
> Cheers
>
> On 26 Feb 2018, 21.12 +0100, Thomas Mailund <thomas.mailund at
gmail.com>,
> wrote:
> > Following up on this attempt of implementing the tail-recursion
> optimisation ? now that I?ve finally had the chance to look at it again ? I
> find that non-local return implemented with callCC doesn?t actually incur
> much overhead once I do it more sensibly. I haven?t found a good way to
> handle parallel assignments that isn?t vastly slower than simply
> introducing extra variables, so I am going with that solution. However, I
> have now run into another problem involving those local variables ? and
> assigning to local variables in general.
> >
> > Consider again the factorial function and three different ways of
> implementing it using the tail recursion optimisation:
> >
> > factorial <- function(n, acc = 1) {
> >     if (n == 1) acc
> >     else factorial(n - 1, n * acc)
> > }
> >
> > factorial_tr_manual <- function (n, acc = 1)
> > {
> >     repeat {
> >         if (n <= 1)
> >             return(acc)
> >         else {
> >             .tailr_n <- n - 1
> >             .tailr_acc <- acc * n
> >             n <- .tailr_n
> >             acc <- .tailr_acc
> >             next
> >         }
> >     }
> > }
> >
> > factorial_tr_automatic_1 <- function(n, acc = 1) {
> >     callCC(function(escape) {
> >         repeat {
> >             if (n <= 1) {
> >                 escape(acc)
> >             } else {
> >                 .tailr_n <- n - 1
> >                 .tailr_acc <- n * acc
> >                 n <- .tailr_n
> >                 acc <- .tailr_acc
> >             }
> >         }
> >     })
> > }
> >
> > factorial_tr_automatic_2 <- function(n, acc = 1) {
> >     .tailr_env <- rlang::get_env()
> >     callCC(function(escape) {
> >         repeat {
> >             if (n <= 1) {
> >                 escape(acc)
> >             } else {
> >                 .tailr_env$.tailr_n <- n - 1
> >                 .tailr_env$.tailr_acc <- n * acc
> >                 .tailr_env$n <- .tailr_env$.tailr_n
> >                 .tailr_env$acc <- .tailr_env$.tailr_acc
> >             }
> >         }
> >     })
> > }
> >
> > The factorial_tr_manual function is how I would implement the function
> manually while factorial_tr_automatic_1 is what my package used to come up
> with. It handles non-local returns, because this is something I need in
> general. Finally, factorial_tr_automatic_2 accesses the local variables
> explicitly through the environment, which is what my package currently
> produces.
> >
> > The difference between supporting non-local returns and not is tiny,
but
> explicitly accessing variables through their environment costs me about a
> factor of five ? something that surprised me.
> >
> > > microbenchmark::microbenchmark(factorial(1000),
> > +                                factorial_tr_manual(1000),
> > +                                factorial_tr_automatic_1(1000),
> > +                                factorial_tr_automatic_2(1000))
> > Unit: microseconds
> >                            expr     min       lq     mean   median
> >                 factorial(1000) 756.357 810.4135 963.1040 856.3315
> >       factorial_tr_manual(1000) 104.838 119.7595 198.7347 129.0870
> >  factorial_tr_automatic_1(1000) 112.354 125.5145 211.6148 135.5255
> >  factorial_tr_automatic_2(1000) 461.015 544.7035 688.5988 565.3240
> >        uq      max neval
> >  945.3110 4149.099   100
> >  136.8200 4190.331   100
> >  152.9625 5944.312   100
> >  600.5235 7798.622   100
> >
> > The simple solution, of course, is to not do that, but then I can?t
> handle expressions inside calls to ?with?. And I would really like to,
> because then I can combine tail recursion with pattern matching.
> >
> > I can define linked lists and a length function on them like this:
> >
> > library(pmatch)
> > llist := NIL | CONS(car, cdr : llist)
> >
> > llength <- function(llist, acc = 0) {
> >     cases(llist,
> >           NIL -> acc,
> >           CONS(car, cdr) -> llength(cdr, acc + 1))
> > }
> >
> > The tail-recursion I get out of transforming this function looks like
> this:
> >
> > llength_tr <- function (llist, acc = 0) {
> >     .tailr_env <- rlang::get_env()
> >     callCC(function(escape) {
> >         repeat {
> >             if (!rlang::is_null(..match_env <- test_pattern(llist,
> >                                                             NIL)))
> >                 with(..match_env, escape(acc))
> >
> >             else if (!rlang::is_null(..match_env <-
> >                                      test_pattern(llist, CONS(car,
> cdr))))
> >                 with(..match_env, {
> >                     .tailr_env$.tailr_llist <- cdr
> >                     .tailr_env$.tailr_acc <- acc + 1
> >                     .tailr_env$llist <- .tailr_env$.tailr_llist
> >                     .tailr_env$acc <- .tailr_env$.tailr_acc
> >                 })
> >         }
> >     })
> > }
> >
> > Maybe not the prettiest code, but you are not supposed to actually see
> it, of course.
> >
> > There is not much gain in speed
> >
> > Unit: milliseconds
> >                    expr      min       lq     mean   median       uq
> >     llength(test_llist) 70.74605 76.08734 87.78418 85.81193 94.66378
> >  llength_tr(test_llist) 45.16946 51.56856 59.09306 57.00101 63.07044
> >       max neval
> >  182.4894   100
> >  166.6990   100
> >
> > but you don?t run out of stack space
> >
> > > llength(make_llist(1000))
> > Error: evaluation nested too deeply: infinite recursion /
> options(expressions=)?
> > Error during wrapup: C stack usage  7990648 is too close to the limit
> > > llength_tr(make_llist(1000))
> > [1] 1000
> >
> > I should be able to make the function go faster if I had a faster way
of
> handling the variable assignments, but inside ?with?, I?m not sure how to
> do that?
> >
> > Any suggestions?
> >
> > Cheers
> >
> > On 11 Feb 2018, 16.48 +0100, Thomas Mailund <thomas.mailund at
gmail.com>,
> wrote:
> > > Hi guys,
> > >
> > > I am working on some code for automatically translating recursive
> functions into looping functions to implemented tail-recursion
> optimisations. See https://github.com/mailund/tailr
> > >
> > > As a toy-example, consider the factorial function
> > >
> > > factorial <- function(n, acc = 1) {
> > > if (n <= 1) acc
> > > else factorial(n - 1, acc * n)
> > > }
> > >
> > > I can automatically translate this into the loop-version
> > >
> > > factorial_tr_1 <- function (n, acc = 1)
> > > {
> > > repeat {
> > > if (n <= 1)
> > > return(acc)
> > > else {
> > > .tailr_n <- n - 1
> > > .tailr_acc <- acc * acc
> > > n <- .tailr_n
> > > acc <- .tailr_acc
> > > next
> > > }
> > > }
> > > }
> > >
> > > which will run faster and not have problems with recursion
depths.
> However, I?m not entirely happy with this version for two reasons: I am not
> happy with introducing the temporary variables and this rewrite will not
> work if I try to over-scope an evaluation context.
> > >
> > > I have two related questions, one related to parallel assignments
?
> i.e. expressions to variables so the expression uses the old variable
> values and not the new values until the assignments are all done ? and one
> related to restarting a loop from nested loops or from nested expressions
> in `with` expressions or similar.
> > >
> > > I can implement parallel assignment using something like
> rlang::env_bind:
> > >
> > > factorial_tr_2 <- function (n, acc = 1)
> > > {
> > > .tailr_env <- rlang::get_env()
> > > repeat {
> > > if (n <= 1)
> > > return(acc)
> > > else {
> > > rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
> > > next
> > > }
> > > }
> > > }
> > >
> > > This reduces the number of additional variables I need to one,
but is
> a couple of orders of magnitude slower than the first version.
> > >
> > > > microbenchmark::microbenchmark(factorial(100),
> > > + factorial_tr_1(100),
> > > + factorial_tr_2(100))
> > > Unit: microseconds
> > > expr min lq mean median uq max neval
> > > factorial(100) 53.978 60.543 77.76203 71.0635 85.947 180.251 100
> > > factorial_tr_1(100) 9.022 9.903 11.52563 11.0430 11.984 28.464
100
> > > factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830
6756.463
> 8177.635 100
> > >
> > >
> > > Is there another way to do parallel assignments that doesn?t cost
this
> much in running time?
> > >
> > > My other problem is the use of `next`. I would like to combine
> tail-recursion optimisation with pattern matching as in
> https://github.com/mailund/pmatch where I can, for example, define a
> linked list like this:
> > >
> > > devtools::install_github("mailund/pmatch?)
> > > library(pmatch)
> > > llist := NIL | CONS(car, cdr : llist)
> > >
> > > and define a function for computing the length of a list like
this:
> > >
> > > list_length <- function(lst, acc = 0) {
> > > force(acc)
> > > cases(lst,
> > > NIL -> acc,
> > > CONS(car, cdr) -> list_length(cdr, acc + 1))
> > > }
> > >
> > > The `cases` function creates an environment that binds variables
in a
> pattern-description that over-scopes the expression to the right of
`->`,
> so the recursive call in this example have access to the variables `cdr`
> and `car`.
> > >
> > > I can transform a `cases` call to one that creates the
environment
> containing the bound variables and then evaluate this using `eval` or
> `with`, but in either case, a call to `next` will not work in such a
> context. The expression will be evaluated inside `bind` or `with`, and not
> in the `list_lenght` function.
> > >
> > > A version that *will* work, is something like this
> > >
> > > factorial_tr_3 <- function (n, acc = 1)
> > > {
> > > .tailr_env <- rlang::get_env()
> > > .tailr_frame <- rlang::current_frame()
> > > repeat {
> > > if (n <= 1)
> > > rlang::return_from(.tailr_frame, acc)
> > > else {
> > > rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
> > > rlang::return_to(.tailr_frame)
> > > }
> > > }
> > > }
> > >
> > > Here, again, for the factorial function since this is easier to
follow
> than the list-length function.
> > >
> > > This solution will also work if you return values from inside
loops,
> where `next` wouldn?t work either.
> > >
> > > Using `rlang::return_from` and `rlang::return_to` implements the
right
> semantics, but costs me another order of magnitude in running time.
> > >
> > > microbenchmark::microbenchmark(factorial(100),
> > > factorial_tr_1(100),
> > > factorial_tr_2(100),
> > > factorial_tr_3(100))
> > > Unit: microseconds
> > > expr min lq mean median uq max neval
> > > factorial(100) 52.479 60.2640 93.43069 67.5130 83.925 2062.481
100
> > > factorial_tr_1(100) 8.875 9.6525 49.19595 10.6945 11.217 3818.823
100
> > > factorial_tr_2(100) 5296.350 5525.0745 5973.77664 5737.8730
6260.128
> 8471.301 100
> > > factorial_tr_3(100) 77554.457 80757.0905 87307.28737 84004.0725
> 89859.169 171039.228 100
> > >
> > > I can live with the ?introducing extra variables? solution to
parallel
> assignment, and I could hack my way out of using `with` or `bind` in
> rewriting `cases`, but restarting a `repeat` loop would really make for a
> nicer solution. I know that `goto` is considered harmful, but really, in
> this case, it is what I want.
> > >
> > > A `callCC` version also solves the problem
> > >
> > > factorial_tr_4 <- function(n, acc = 1) {
> > > function_body <- function(continuation) {
> > > if (n <= 1) {
> > > continuation(acc)
> > > } else {
> > > continuation(list("continue", n = n - 1, acc = acc *
n))
> > > }
> > > }
> > > repeat {
> > > result <- callCC(function_body)
> > > if (is.list(result) && result[[1]] ==
"continue") {
> > > n <- result$n
> > > acc <- result$acc
> > > next
> > > } else {
> > > return(result)
> > > }
> > > }
> > > }
> > >
> > > But this requires that I know how to distinguish between a valid
> return value and a tag for ?next? and is still a lot slower than the `next`
> solution
> > >
> > > microbenchmark::microbenchmark(factorial(100),
> > > factorial_tr_1(100),
> > > factorial_tr_2(100),
> > > factorial_tr_3(100),
> > > factorial_tr_4(100))
> > > Unit: microseconds
> > > expr min lq mean median uq max neval
> > > factorial(100) 54.109 61.8095 81.33167 81.8785 89.748 243.554 100
> > > factorial_tr_1(100) 9.025 9.9035 11.38607 11.1990 12.008 22.375
100
> > > factorial_tr_2(100) 5272.524 5798.3965 6302.40467 6077.7180
6492.959
> 9967.237 100
> > > factorial_tr_3(100) 66186.080 72336.2810 76480.75172 73632.9665
> 75405.054 203785.673 100
> > > factorial_tr_4(100) 270.978 302.7890 337.48763 313.9930 334.096
> 1425.702 100
> > >
> > > I don?t necessarily need the tail-recursion optimisation to be
faster
> than the recursive version; just getting out of the problem of too deep
> recursions is a benefit, but I would rather not pay with an order of
> magnitude for it. I could, of course, try to handle cases that works with
> `next` in one way, and other cases using `callCC`, but I feel it should be
> possible with a version that handles all cases the same way.
> > >
> > > Is there any way to achieve this?
> > >
> > > Cheers
> > > Thomas
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Thomas Mailund

2018-Feb-27 16:46 UTC

head link

[R] Parallel assignments and goto

I did try assign. That was the slowest version from what my profiling could
tell, as far as I recall, which really surprised me. I had expected it to be the
fastest. The second slowest was using the [[ operator on environments. Or it
might be the reverse for those two. They were both slower than the other
versions I posted here.

Cheers

On 27 Feb 2018, 17.16 +0100, Bert Gunter <bgunter.4567 at gmail.com>,
wrote:> No clue, but see ?assign perhaps if you have not done so already.
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip
)
>
> > On Tue, Feb 27, 2018 at 6:51 AM, Thomas Mailund <thomas.mailund at
gmail.com> wrote:
> > > Interestingly, the <<- operator is also a lot faster than
using a namespace explicitly, and only slightly slower than using <- with
local variables, see below. But, surely, both must at some point insert values
in a given environment ? either the local one, for <-, or an enclosing one,
for <<- ? so I guess I am asking if there is a more low-level assignment
operation I can get my hands on without diving into C?
> > >
> > >
> > > factorial <- function(n, acc = 1) {
> > > ? ? if (n == 1) acc
> > > ? ? else factorial(n - 1, n * acc)
> > > }
> > >
> > > factorial_tr_manual <- function (n, acc = 1)
> > > {
> > > ? ? repeat {
> > > ? ? ? ? if (n <= 1)
> > > ? ? ? ? ? ? return(acc)
> > > ? ? ? ? else {
> > > ? ? ? ? ? ? .tailr_n <- n - 1
> > > ? ? ? ? ? ? .tailr_acc <- acc * n
> > > ? ? ? ? ? ? n <- .tailr_n
> > > ? ? ? ? ? ? acc <- .tailr_acc
> > > ? ? ? ? ? ? next
> > > ? ? ? ? }
> > > ? ? }
> > > }
> > >
> > > factorial_tr_automatic_1 <- function(n, acc = 1) {
> > > ? ? .tailr_n <- n
> > > ? ? .tailr_acc <- acc
> > > ? ? callCC(function(escape) {
> > > ? ? ? ? repeat {
> > > ? ? ? ? ? ? n <- .tailr_n
> > > ? ? ? ? ? ? acc <- .tailr_acc
> > > ? ? ? ? ? ? if (n <= 1) {
> > > ? ? ? ? ? ? ? ? escape(acc)
> > > ? ? ? ? ? ? } else {
> > > ? ? ? ? ? ? ? ? .tailr_n <<- n - 1
> > > ? ? ? ? ? ? ? ? .tailr_acc <<- n * acc
> > > ? ? ? ? ? ? }
> > > ? ? ? ? }
> > > ? ? })
> > > }
> > >
> > > factorial_tr_automatic_2 <- function(n, acc = 1) {
> > > ? ? .tailr_env <- rlang::get_env()
> > > ? ? callCC(function(escape) {
> > > ? ? ? ? repeat {
> > > ? ? ? ? ? ? if (n <= 1) {
> > > ? ? ? ? ? ? ? ? escape(acc)
> > > ? ? ? ? ? ? } else {
> > > ? ? ? ? ? ? ? ? .tailr_env$.tailr_n <- n - 1
> > > ? ? ? ? ? ? ? ? .tailr_env$.tailr_acc <- n * acc
> > > ? ? ? ? ? ? ? ? .tailr_env$n <- .tailr_env$.tailr_n
> > > ? ? ? ? ? ? ? ? .tailr_env$acc <- .tailr_env$.tailr_acc
> > > ? ? ? ? ? ? }
> > > ? ? ? ? }
> > > ? ? })
> > > }
> > >
> > > microbenchmark::microbenchmark(factorial(1000),
> > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_manual(1000),
> > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_automatic_1(1000),
> > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_automatic_2(1000))
> > > Unit: microseconds
> > > ? ? ? ? ? ? ? ? ? ? ? ? ? ?expr ? ? min ? ? ?lq ? ? ?mean ?
median ? ? ? ?uq ? ? ?max neval
> > > ? ? ? ? ? ? ? ? factorial(1000) 884.137 942.060 1076.3949
977.6235 1042.5035 2889.779 ? 100
> > > ? ? ? factorial_tr_manual(1000) 110.215 116.919 ?130.2337
118.7350 ?122.7495 ?255.062 ? 100
> > > ?factorial_tr_automatic_1(1000) 179.897 183.437 ?212.8879
187.8250 ?195.7670 ?979.352 ? 100
> > > ?factorial_tr_automatic_2(1000) 508.353 534.328 ?601.9643
560.7830 ?587.8350 1424.260 ? 100
> > >
> > > Cheers
> > >
> > > On 26 Feb 2018, 21.12 +0100, Thomas Mailund <thomas.mailund at
gmail.com>, wrote:
> > > > Following up on this attempt of implementing the
tail-recursion optimisation ? now that I?ve finally had the chance to look at it
again ? I find that non-local return implemented with callCC doesn?t actually
incur much overhead once I do it more sensibly. I haven?t found a good way to
handle parallel assignments that isn?t vastly slower than simply introducing
extra variables, so I am going with that solution. However, I have now run into
another problem involving those local variables ? and assigning to local
variables in general.
> > > >
> > > > Consider again the factorial function and three different
ways of implementing it using the tail recursion optimisation:
> > > >
> > > > factorial <- function(n, acc = 1) {
> > > > ? ? if (n == 1) acc
> > > > ? ? else factorial(n - 1, n * acc)
> > > > }
> > > >
> > > > factorial_tr_manual <- function (n, acc = 1)
> > > > {
> > > > ? ? repeat {
> > > > ? ? ? ? if (n <= 1)
> > > > ? ? ? ? ? ? return(acc)
> > > > ? ? ? ? else {
> > > > ? ? ? ? ? ? .tailr_n <- n - 1
> > > > ? ? ? ? ? ? .tailr_acc <- acc * n
> > > > ? ? ? ? ? ? n <- .tailr_n
> > > > ? ? ? ? ? ? acc <- .tailr_acc
> > > > ? ? ? ? ? ? next
> > > > ? ? ? ? }
> > > > ? ? }
> > > > }
> > > >
> > > > factorial_tr_automatic_1 <- function(n, acc = 1) {
> > > > ? ? callCC(function(escape) {
> > > > ? ? ? ? repeat {
> > > > ? ? ? ? ? ? if (n <= 1) {
> > > > ? ? ? ? ? ? ? ? escape(acc)
> > > > ? ? ? ? ? ? } else {
> > > > ? ? ? ? ? ? ? ? .tailr_n <- n - 1
> > > > ? ? ? ? ? ? ? ? .tailr_acc <- n * acc
> > > > ? ? ? ? ? ? ? ? n <- .tailr_n
> > > > ? ? ? ? ? ? ? ? acc <- .tailr_acc
> > > > ? ? ? ? ? ? }
> > > > ? ? ? ? }
> > > > ? ? })
> > > > }
> > > >
> > > > factorial_tr_automatic_2 <- function(n, acc = 1) {
> > > > ? ? .tailr_env <- rlang::get_env()
> > > > ? ? callCC(function(escape) {
> > > > ? ? ? ? repeat {
> > > > ? ? ? ? ? ? if (n <= 1) {
> > > > ? ? ? ? ? ? ? ? escape(acc)
> > > > ? ? ? ? ? ? } else {
> > > > ? ? ? ? ? ? ? ? .tailr_env$.tailr_n <- n - 1
> > > > ? ? ? ? ? ? ? ? .tailr_env$.tailr_acc <- n * acc
> > > > ? ? ? ? ? ? ? ? .tailr_env$n <- .tailr_env$.tailr_n
> > > > ? ? ? ? ? ? ? ? .tailr_env$acc <- .tailr_env$.tailr_acc
> > > > ? ? ? ? ? ? }
> > > > ? ? ? ? }
> > > > ? ? })
> > > > }
> > > >
> > > > The?factorial_tr_manual function is how I would implement
the function manually while?factorial_tr_automatic_1 is what my package used to
come up with. It handles non-local returns, because this is something I need in
general. Finally,?factorial_tr_automatic_2 accesses the local variables
explicitly through the environment, which is what my package currently produces.
> > > >
> > > > The difference between supporting non-local returns and not
is tiny, but explicitly accessing variables through their environment costs me
about a factor of five ? something that surprised me.
> > > >
> > > > > microbenchmark::microbenchmark(factorial(1000),
> > > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?factorial_tr_manual(1000),
> > > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
?factorial_tr_automatic_1(1000),
> > > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
?factorial_tr_automatic_2(1000))
> > > > Unit: microseconds
> > > > ? ? ? ? ? ? ? ? ? ? ? ? ? ?expr ? ? min ? ? ? lq ? ? mean ?
median
> > > > ? ? ? ? ? ? ? ? factorial(1000) 756.357 810.4135 963.1040
856.3315
> > > > ? ? ? factorial_tr_manual(1000) 104.838 119.7595 198.7347
129.0870
> > > > ?factorial_tr_automatic_1(1000) 112.354 125.5145 211.6148
135.5255
> > > > ?factorial_tr_automatic_2(1000) 461.015 544.7035 688.5988
565.3240
> > > > ? ? ? ?uq ? ? ?max neval
> > > > ?945.3110 4149.099 ? 100
> > > > ?136.8200 4190.331 ? 100
> > > > ?152.9625 5944.312 ? 100
> > > > ?600.5235 7798.622 ? 100
> > > >
> > > > The simple solution, of course, is to not do that, but then
I can?t handle expressions inside calls to ?with?. And I would really like to,
because then I can combine tail recursion with pattern matching.
> > > >
> > > > I can define linked lists and a length function on them like
this:
> > > >
> > > > library(pmatch)
> > > > llist := NIL | CONS(car, cdr : llist)
> > > >
> > > > llength <- function(llist, acc = 0) {
> > > > ? ? cases(llist,
> > > > ? ? ? ? ? NIL -> acc,
> > > > ? ? ? ? ? CONS(car, cdr) -> llength(cdr, acc + 1))
> > > > }
> > > >
> > > > The tail-recursion I get out of transforming this function
looks like this:
> > > >
> > > > llength_tr <- function (llist, acc = 0) {
> > > > ? ? .tailr_env <- rlang::get_env()
> > > > ? ? callCC(function(escape) {
> > > > ? ? ? ? repeat {
> > > > ? ? ? ? ? ? if (!rlang::is_null(..match_env <-
test_pattern(llist,
> > > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
NIL)))
> > > > ? ? ? ? ? ? ? ? with(..match_env, escape(acc))
> > > >
> > > > ? ? ? ? ? ? else if (!rlang::is_null(..match_env <-
> > > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?test_pattern(llist,
CONS(car, cdr))))
> > > > ? ? ? ? ? ? ? ? with(..match_env, {
> > > > ? ? ? ? ? ? ? ? ? ? .tailr_env$.tailr_llist <- cdr
> > > > ? ? ? ? ? ? ? ? ? ? .tailr_env$.tailr_acc <- acc + 1
> > > > ? ? ? ? ? ? ? ? ? ? .tailr_env$llist <-
.tailr_env$.tailr_llist
> > > > ? ? ? ? ? ? ? ? ? ? .tailr_env$acc <-
.tailr_env$.tailr_acc
> > > > ? ? ? ? ? ? ? ? })
> > > > ? ? ? ? }
> > > > ? ? })
> > > > }
> > > >
> > > > Maybe not the prettiest code, but you are not supposed to
actually see it, of course.
> > > >
> > > > There is not much gain in speed
> > > >
> > > > Unit: milliseconds
> > > > ? ? ? ? ? ? ? ? ? ?expr ? ? ?min ? ? ? lq ? ? mean ? median
? ? ? uq
> > > > ? ? llength(test_llist) 70.74605 76.08734 87.78418 85.81193
94.66378
> > > > ?llength_tr(test_llist) 45.16946 51.56856 59.09306 57.00101
63.07044
> > > > ? ? ? max neval
> > > > ?182.4894 ? 100
> > > > ?166.6990 ? 100
> > > >
> > > > but you don?t run out of stack space
> > > >
> > > > > llength(make_llist(1000))
> > > > Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?
> > > > Error during wrapup: C stack usage ?7990648 is too close to
the limit
> > > > > llength_tr(make_llist(1000))
> > > > [1] 1000
> > > >
> > > > I should be able to make the function go faster if I had a
faster way of handling the variable assignments, but inside ?with?, I?m not sure
how to do that?
> > > >
> > > > Any suggestions?
> > > >
> > > > Cheers
> > > >
> > > > On 11 Feb 2018, 16.48 +0100, Thomas Mailund
<thomas.mailund at gmail.com>, wrote:
> > > > > Hi guys,
> > > > >
> > > > > I am working on some code for automatically translating
recursive functions into looping functions to implemented tail-recursion
optimisations. See https://github.com/mailund/tailr
> > > > >
> > > > > As a toy-example, consider the factorial function
> > > > >
> > > > > factorial <- function(n, acc = 1) {
> > > > > if (n <= 1) acc
> > > > > else factorial(n - 1, acc * n)
> > > > > }
> > > > >
> > > > > I can automatically translate this into the
loop-version
> > > > >
> > > > > factorial_tr_1 <- function (n, acc = 1)
> > > > > {
> > > > > repeat {
> > > > > if (n <= 1)
> > > > > return(acc)
> > > > > else {
> > > > > .tailr_n <- n - 1
> > > > > .tailr_acc <- acc * acc
> > > > > n <- .tailr_n
> > > > > acc <- .tailr_acc
> > > > > next
> > > > > }
> > > > > }
> > > > > }
> > > > >
> > > > > which will run faster and not have problems with
recursion depths. However, I?m not entirely happy with this version for two
reasons: I am not happy with introducing the temporary variables and this
rewrite will not work if I try to over-scope an evaluation context.
> > > > >
> > > > > I have two related questions, one related to parallel
assignments ? i.e. expressions to variables so the expression uses the old
variable values and not the new values until the assignments are all done ? and
one related to restarting a loop from nested loops or from nested expressions in
`with` expressions or similar.
> > > > >
> > > > > I can implement parallel assignment using something
like rlang::env_bind:
> > > > >
> > > > > factorial_tr_2 <- function (n, acc = 1)
> > > > > {
> > > > > .tailr_env <- rlang::get_env()
> > > > > repeat {
> > > > > if (n <= 1)
> > > > > return(acc)
> > > > > else {
> > > > > rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
> > > > > next
> > > > > }
> > > > > }
> > > > > }
> > > > >
> > > > > This reduces the number of additional variables I need
to one, but is a couple of orders of magnitude slower than the first version.
> > > > >
> > > > > > microbenchmark::microbenchmark(factorial(100),
> > > > > + factorial_tr_1(100),
> > > > > + factorial_tr_2(100))
> > > > > Unit: microseconds
> > > > > expr min lq mean median uq max neval
> > > > > factorial(100) 53.978 60.543 77.76203 71.0635 85.947
180.251 100
> > > > > factorial_tr_1(100) 9.022 9.903 11.52563 11.0430 11.984
28.464 100
> > > > > factorial_tr_2(100) 5870.565 6109.905 6534.13607
6320.4830 6756.463 8177.635 100
> > > > >
> > > > >
> > > > > Is there another way to do parallel assignments that
doesn?t cost this much in running time?
> > > > >
> > > > > My other problem is the use of `next`. I would like to
combine tail-recursion optimisation with pattern matching as in
https://github.com/mailund/pmatch where I can, for example, define a linked list
like this:
> > > > >
> > > > > devtools::install_github("mailund/pmatch?)
> > > > > library(pmatch)
> > > > > llist := NIL | CONS(car, cdr : llist)
> > > > >
> > > > > and define a function for computing the length of a
list like this:
> > > > >
> > > > > list_length <- function(lst, acc = 0) {
> > > > > force(acc)
> > > > > cases(lst,
> > > > > NIL -> acc,
> > > > > CONS(car, cdr) -> list_length(cdr, acc + 1))
> > > > > }
> > > > >
> > > > > The `cases` function creates an environment that binds
variables in a pattern-description that over-scopes the expression to the right
of `->`, so the recursive call in this example have access to the variables
`cdr` and `car`.
> > > > >
> > > > > I can transform a `cases` call to one that creates the
environment containing the bound variables and then evaluate this using `eval`
or `with`, but in either case, a call to `next` will not work in such a context.
The expression will be evaluated inside `bind` or `with`, and not in the
`list_lenght` function.
> > > > >
> > > > > A version that *will* work, is something like this
> > > > >
> > > > > factorial_tr_3 <- function (n, acc = 1)
> > > > > {
> > > > > .tailr_env <- rlang::get_env()
> > > > > .tailr_frame <- rlang::current_frame()
> > > > > repeat {
> > > > > if (n <= 1)
> > > > > rlang::return_from(.tailr_frame, acc)
> > > > > else {
> > > > > rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
> > > > > rlang::return_to(.tailr_frame)
> > > > > }
> > > > > }
> > > > > }
> > > > >
> > > > > Here, again, for the factorial function since this is
easier to follow than the list-length function.
> > > > >
> > > > > This solution will also work if you return values from
inside loops, where `next` wouldn?t work either.
> > > > >
> > > > > Using `rlang::return_from` and `rlang::return_to`
implements the right semantics, but costs me another order of magnitude in
running time.
> > > > >
> > > > > microbenchmark::microbenchmark(factorial(100),
> > > > > factorial_tr_1(100),
> > > > > factorial_tr_2(100),
> > > > > factorial_tr_3(100))
> > > > > Unit: microseconds
> > > > > expr min lq mean median uq max neval
> > > > > factorial(100) 52.479 60.2640 93.43069 67.5130 83.925
2062.481 100
> > > > > factorial_tr_1(100) 8.875 9.6525 49.19595 10.6945
11.217 3818.823 100
> > > > > factorial_tr_2(100) 5296.350 5525.0745 5973.77664
5737.8730 6260.128 8471.301 100
> > > > > factorial_tr_3(100) 77554.457 80757.0905 87307.28737
84004.0725 89859.169 171039.228 100
> > > > >
> > > > > I can live with the ?introducing extra variables?
solution to parallel assignment, and I could hack my way out of using `with` or
`bind` in rewriting `cases`, but restarting a `repeat` loop would really make
for a nicer solution. I know that `goto` is considered harmful, but really, in
this case, it is what I want.
> > > > >
> > > > > A `callCC` version also solves the problem
> > > > >
> > > > > factorial_tr_4 <- function(n, acc = 1) {
> > > > > function_body <- function(continuation) {
> > > > > if (n <= 1) {
> > > > > continuation(acc)
> > > > > } else {
> > > > > continuation(list("continue", n = n - 1, acc
= acc * n))
> > > > > }
> > > > > }
> > > > > repeat {
> > > > > result <- callCC(function_body)
> > > > > if (is.list(result) && result[[1]] ==
"continue") {
> > > > > n <- result$n
> > > > > acc <- result$acc
> > > > > next
> > > > > } else {
> > > > > return(result)
> > > > > }
> > > > > }
> > > > > }
> > > > >
> > > > > But this requires that I know how to distinguish
between a valid return value and a tag for ?next? and is still a lot slower than
the `next` solution
> > > > >
> > > > > microbenchmark::microbenchmark(factorial(100),
> > > > > factorial_tr_1(100),
> > > > > factorial_tr_2(100),
> > > > > factorial_tr_3(100),
> > > > > factorial_tr_4(100))
> > > > > Unit: microseconds
> > > > > expr min lq mean median uq max neval
> > > > > factorial(100) 54.109 61.8095 81.33167 81.8785 89.748
243.554 100
> > > > > factorial_tr_1(100) 9.025 9.9035 11.38607 11.1990
12.008 22.375 100
> > > > > factorial_tr_2(100) 5272.524 5798.3965 6302.40467
6077.7180 6492.959 9967.237 100
> > > > > factorial_tr_3(100) 66186.080 72336.2810 76480.75172
73632.9665 75405.054 203785.673 100
> > > > > factorial_tr_4(100) 270.978 302.7890 337.48763 313.9930
334.096 1425.702 100
> > > > >
> > > > > I don?t necessarily need the tail-recursion
optimisation to be faster than the recursive version; just getting out of the
problem of too deep recursions is a benefit, but I would rather not pay with an
order of magnitude for it. I could, of course, try to handle cases that works
with `next` in one way, and other cases using `callCC`, but I feel it should be
possible with a version that handles all cases the same way.
> > > > >
> > > > > Is there any way to achieve this?
> > > > >
> > > > > Cheers
> > > > > Thomas
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > >
> > > ? ? ? ? [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible
code.
>
	[[alternative HTML version deleted]]

Maybe Matching Threads

Search for more apparently analagous threads

R help - Feb 2018 - Parallel assignments and goto

[R] Parallel assignments and goto

[R] Parallel assignments and goto

[R] Parallel assignments and goto

Maybe Matching Threads