Duncan Murdoch
2023-Mar-11 16:11 UTC
[Rd] Multiple Assignment built into the R Interpreter?
On 11/03/2023 9:54 a.m., Sebastian Martin Krantz wrote:> Thanks Duncan, > > I know about list2env(), in fact a previous version of collapse::`%=%` > was coded as > > "%=%" <- function(lhs, rhs) { > ?? if(!is.character(lhs)) stop("lhs needs to be character") > ?? if(!is.list(rhs)) rhs <- as.vector(rhs, "list") > ?? if(length(lhs) != length(rhs)) stop("length(lhs) not equal to > length(rhs)") > ?? list2env(`names<-`(rhs, lhs), envir = parent.frame()) > ?? invisible() > } > > but as you say, the input needs to be converted to a list, and it calls > several R functions, which led me to end up writing `%=%` in C: > https://github.com/SebKrantz/collapse/blob/master/src/small_helper.c#L162 <https://github.com/SebKrantz/collapse/blob/master/src/small_helper.c#L162>. > This implementation works in the way you describe, i.e. it has separate > methods for all the standard vector types, and coerces to list otherwise. > > That being said, all implementations in packages falls short of being > very useful, because R CMD Check it will still require global bindings > for variables, > unless this becomes a standard feature of the language. So I cannot use > this in packages, and there is still a performance cost to it, in my > case a call to > .Call() and parent.frame(), which is quite low, but still high compared > to the cost of `<-` or `=`.Another R way to do what you're doing would be to stay within a list the whole time, i.e. code it as mats <- init_matrices(X, Y, Z) with(mats, ... do things with A, C, Q, and R ... ) This won't give warnings about globals, and it makes very clear that those 4 matrices are all closely related, and it allows you to work with multiple 4-tuples of matrices, etc.> So what I am requesting is indeed nothing less than to consider making > this a permanent feature of the language itself.That's clear, but your proposal violates a very basic property of the language, i.e. that all statements are expressions and have a value. What's the value of 1 + (A, C = init_matrices()) ? I think you would disallow the above (though you didn't address it when I raised it the first time), which means there would now be two kinds of statements: ones that are expressions and therefore can be used as function arguments, and ones that aren't.> Given that the other 3 major scientific computing languages (Matlab, > Python and Julia) have implemented it very successfully, > I don't think the general practicality of it should be an issue. > Regarding implementation in other languages, Julia works as follows: > > function init_matrices() > ??? A = 1; C = 2; Q = 3; R = 4 > ??? return A, C, Q, R > end > > res = init_matrices()???????????? # gives a Julia Tuple (A, C, Q, R) > A, C = init_matrices()?????????? # Works, A is 1, C is 2, the others are > droppedThat's pretty ugly having a singular LHS handled so much differently from a plural LHS.> A, C, Q, R = init_matrices()? # Standard > > I think as far as R is concerned multiple return values are not really > necessary given that one can always, > return(list(A, C, Q, R)), although of course there is also a cost to > list(). I also wouldn't mind being strict about it and > not allowing A, C = init_matrices(), but others might disagree.Another ambiguity: suppose f() returns list(A = 1, B = 2) and I do B, A <- f() Should assignment be by position or by name? Honestly, given that this is simply syntactic sugar, I don't think I would support it. Duncan Murdoch> > Best regards, > > Sebastian > > > On Sat, 11 Mar 2023 at 15:37, Duncan Murdoch <murdoch.duncan at gmail.com > <mailto:murdoch.duncan at gmail.com>> wrote: > > I think the standard way to do this in R is given by list2env(), as > described in a couple of answers on the SO page you linked. > > The syntax you proposed would be likely to be confusing in complex > expressions, e.g. > > ? ?f(A, C, Q, R = init_matrices(X, Y, Z)) > > would obviously not work but wouldn't trigger a syntax error, and > > ? ?f((A, C, Q, R = init_matrices(X, Y, Z))) > > could work, but looks too much like the previous one.? So I think R > would want Javascript-like > > ? ?[A, C, Q, R] <- init_matrices(X, Y, Z) > > instead.? But then the question would come up about how to handle the > RHS.? Does the function have to return a list?? What if the length of > the list is not 4?? Or is it just guaranteed to be equivalent to > > ? ?temp <- init_matrices(X, Y, Z) > ? ?A <- temp[[1]] > ? ?C <- temp[[2]] > ? ?Q <- temp[[3]] > ? ?R <- temp[[4]] > > which would work for other vector types besides lists? > > BTW, here's a little hack that almost works: > > `vals<-` <- function(x, ..., value) { > ? ? others <- substitute(list(...)) > ? ? if (length(others) > 1) > ? ? ? for (i in seq_along(others)[-1]) > ? ? ? ? assign(as.character(others[[i]]), value[[i]], envir > parent.frame()) > ? ? value[[1]] > } > > You call it as > > ? vals(a, b, c) <- 1:3 > > and it assigns 1 to a, 2 to b, and 3 to c.? It doesn't quite do what > you > want because it requires that a exists already, but b and c don't > have to. > > Duncan Murdoch > > On 11/03/2023 4:04 a.m., Sebastian Martin Krantz wrote: > > Dear R Core, > > > > working on my dynamic factor modelling package, which requires > several > > subroutines to create and update several system matrices, I come > back to > > the issue of being annoyed by R not supporting multiple > assignment out of > > the box like Matlab, Python and julia. e.g. something like > > > > A, C, Q, R = init_matrices(X, Y, Z) > > > > would be a great addition to the language. I know there are several > > workarounds such as the %<-% operator in the zeallot package or > my own %=% > > operator in collapse, but these don't work well for package > development as > > R CMD Check warns about missing global bindings for the created > variables, > > e.g. I would have to use > > > > A <- C <- Q <- R <- NULL > > .c(A, C, Q, R) %=% init_matrices(X, Y, Z) > > > > in a package, which is simply annoying. Of course the standard way of > > > > init <- init_matrices(X, Y, Z) > >? ?A <- init$A; C <- init$C; Q <- init$Q; R <- init$R > > rm(init) > > > > is also super cumbersome compared to Python or Julia. Another > reason is of > > course performance, even my %=% operator written in C has a > non-negligible > > performance cost for very tight loops, compared to a solution at the > > interpretor level or in a primitive function such as `=`. > > > > So my conclusion at this point is that it is just significantly > easier to > > implement such codes in Julia, in addition to the greater > performance it > > offers. There are obvious reasons why I am still coding in R and > C, thanks > > to the robust API and great ecosystem of packages, but adding > this could be > > a presumably low-hanging fruit to make my life a bit easier. > Several issues > > for this have been filed on Stackoverflow, the most popular one ( > > > https://stackoverflow.com/questions/7519790/assign-multiple-new-variables-on-lhs-in-a-single-line <https://stackoverflow.com/questions/7519790/assign-multiple-new-variables-on-lhs-in-a-single-line>) > > has been viewed 77 thousand times. > > > > But maybe this has already been discussed here and already > decided against. > > In that case, a way to browse R-devel archives to find out would > be nice. > > > > Best regards, > > > > Sebastian > > > >? ? ? ?[[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org <mailto:R-devel at r-project.org> mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > <https://stat.ethz.ch/mailman/listinfo/r-devel> >
On Sat, 11 Mar 2023 11:11:06 -0500 Duncan Murdoch <murdoch.duncan at gmail.com> wrote:> That's clear, but your proposal violates a very basic property of the > language, i.e. that all statements are expressions and have a value.How about reframing this feature request from multiple assignment (which does go contrary to "everything has only one value, even if it's sometimes invisible(NULL)") to "structured binding" / "destructuring assignment" [*], which takes this single single value returned by the expression and subsets it subject to certain rules? It may be easier to make a decision on the semantics for destructuring assignment (e.g. languages which have this feature typically allow throwing unneeded parts of the return value away), and it doesn't seem to break as much of the rest of the language if implemented. I see you've already mentioned it ("JavaScript-like"). I think it would fulfil Sebastian's requirements too, as long as it is considered "true assignment" by the rest of the language. The hard part is to propose the actual grammar of the new feature (in terms of src/main/gram.y, preferably without introducing conflicts) and its semantics (including the corner cases, some of which you have already mentioned). I'm not sure I'm up to the task. -- Best regards, Ivan [*] https://en.cppreference.com/w/cpp/language/structured_binding https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment