Janko Thyson
2011-May-19 12:15 UTC
[Rd] Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'
Dear list,

I hope this is the right place to post a feature request. If there exists a more formal channel (e.g. as for bug reports), I'd appreciate a pointer.

I work a lot with named nested lists with arbitrary degrees of "nestedness". In order to retrieve the names and/or values of the "bottom layer/bottom tier", I love the functionality of 'unlist()', or 'names(unlist(x))', respectively, as it avoids traversing the nested lists via recursive loop constructs. I'm also aware that the general suggestion is probably to keep nestedness as low as possible when working with lists, but arbitrarily deeply nested lists came in quite handy for me as long as each element is named and as long as I can quickly add and retrieve element values via "name paths".

Here's a little example list:
lst <- list(a=list(a.1=list(a.1.1=NA, a.1.2=5), a.2=list()), b=NULL)

It would be awesome if 'unlist(x)' could be extended with the following functionality:

1) An argument such as 'delim' that controls how the respective layer names are pasted.
Right now, they are always separated by a dot:
> names(unlist(lst))
[1] "a.a.1.a.1.1" "a.a.1.a.1.2"
Desired:
> names(unlist(lst, delim="/"))
[1] "a/a.1/a.1.1" "a/a.1/a.1.2"
> names(unlist(lst, delim="_"))
[1] "a_a.1_a.1.1" "a_a.1_a.1.2"

2) An argument that allows elements of zero length or of value NULL to be *included* in the resulting output.
Right now, they are dropped (which makes perfect sense, as NULL values and zero-length values are dropped in vectors):
> c(1,2, NULL, numeric())
[1] 1 2
> unlist(lst)
a.a.1.a.1.1 a.a.1.a.1.2
         NA           5
Desired:
> unlist(lst, delim="/", keep.special=TRUE)
$a/a.1/a.1.1
[1] NA

$a/a.1/a.1.2
[1] 5

$a/a.2
list()

$b
NULL

Of course, this would not be a true 'unlist' anymore, but something like 'retrieveBottomLayer()'.

Thanks a lot for providing such fast stuff as 'unlist()'!
Unfortunately, I don't know my way around internal C routines and therefore I would greatly appreciate if core team developers would consider my two suggestions.

Best regards,
Janko

--
Janko Thyson
Catholic University of Eichstätt-Ingolstadt
Ingolstadt School of Management
Statistics and Quantitative Methods
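The two requested behaviours can be approximated in plain R, without changing 'unlist()' itself. What follows is only a minimal sketch under stated assumptions: the function name 'flatten_paths()' is hypothetical (it is not part of base R), its arguments 'delim' and 'keep.special' simply mirror the names proposed in the request, and unnamed elements are given their position as a path component. It recurses in R code, so it will not be as fast as 'unlist()'.

```r
# Hypothetical helper, not part of base R: collects the "bottom layer" of a
# nested list into a flat list whose names are the pasted name paths.
flatten_paths <- function(x, delim = "/", keep.special = TRUE) {
  stopifnot(is.list(x))
  if (length(x) == 0L) return(list())
  out <- list()
  walk <- function(node, path) {
    if (is.list(node) && length(node) > 0L) {
      # Non-empty list: descend into each child, extending the name path
      nms <- names(node)
      if (is.null(nms)) nms <- rep("", length(node))
      for (i in seq_along(node)) {
        # Assumption: unnamed elements are labelled by their position
        key <- if (nzchar(nms[i])) nms[i] else as.character(i)
        walk(node[[i]], c(path, key))
      }
    } else if (keep.special || length(node) > 0L) {
      # Leaf, empty list, or NULL. Assigning list(node) via single brackets
      # keeps a NULL entry; `out[[name]] <- NULL` would delete it instead.
      out[paste(path, collapse = delim)] <<- list(node)
    }
    invisible(NULL)
  }
  walk(x, character())
  out
}

lst <- list(a = list(a.1 = list(a.1.1 = NA, a.1.2 = 5), a.2 = list()), b = NULL)
res <- flatten_paths(lst, delim = "/", keep.special = TRUE)
names(res)
names(flatten_paths(lst, delim = "_", keep.special = FALSE))
```

With 'keep.special = TRUE' the result is a list rather than an atomic vector, since an atomic vector cannot hold 'NULL' or 'list()' entries. If two elements share the same name path, the later one overwrites the earlier one in this sketch.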
Duncan Murdoch
2011-May-19 12:58 UTC
[Rd] Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'
On 19/05/2011 8:15 AM, Janko Thyson wrote:
> Dear list,
>
> I hope this is the right place to post a feature request. If there's
> exists a more formal channel (e.g. as for bug reports), I'd appreciate a
> pointer.

This is a good place to post.

>
> I work a lot with named nested lists with arbitrary degrees of
> "nestedness". In order to retrieve the names and/or values of "bottom
> layer/bottom tier", I love the functionality of 'unlist()', or
> 'names(unlist(x))', respectively as it avoids traversing the nested
> lists via recursive loop constructs. I'm also aware that the general
> suggestion is probably to keep nestedness as low as possible when
> working with lists, but arbitrary deeply nested lists came in quite
> handy for me as long as each element is named and as long as I can
> quickly add and retrieve element values via "name paths".
>
> Here's a little example list:
> lst<- list(a=list(a.1=list(a.1.1=NA, a.1.2=5), a.2=list()), b=NULL)
>
> It would be awesome if 'unlist(x)' could be extended with the following
> functionality:
>
> 1) An argument such as 'delim' that controls how the respective layer
> names are pasted.
> Right now, they are always separated by a dot:
> > names(unlist(lst))
> [1] "a.a.1.a.1.1" "a.a.1.a.1.2"
> Desired:
> > names(unlist(lst, delim="/"))
> [1] "a/a.1/a.1.1" "a/a.1/a.1.2"
> > names(unlist(lst, delim="_"))
> [1] "a_a.1_a.1.1" "a_a.1_a.1.2"
>
> 2) An argument that allows to include either elements of zero length or
> of value NULL to be *included* in the resulting output.
> Right now, they are dropped (which makes perfect sense as NULL values
> and zero length values are dropped in vectors):
> > c(1,2, NULL, numeric())
> [1] 1 2
> > unlist(lst)
> a.a.1.a.1.1 a.a.1.a.1.2
>          NA           5
> Desired:
> > unlist(lst, delim="/", keep.special=TRUE)
> $a/a.1/a.1.1
> [1] NA
>
> $a/a.1/a.1.2
> [1] 5
>
> $a/a.2
> list()
>
> $b
> NULL
> Of course, this would not be a true 'unlist' anymore, but something like
> 'retrieveBottomLayer()'.
>
> Thanks a lot for providing such fast stuff as 'unlist()'! Unfortunately,
> I don't know my way around internal C routines and therefore I would
> greatly appreciate if core team developers would consider my two
> suggestions.

The suggestions seem reasonable, but are difficult to implement. The problem is that unlist() is a generic function, but there's no unlist.default() in R: the default and method dispatch are implemented at the C level. Normally adding arguments to the default method doesn't cause problems elsewhere, because methods only need to be compatible with the generic. But since there's no way to modify the argument list of the default method in this case, the generic function would need to be modified, and that means every unlist method would need to be modified too.

So I wouldn't want to take this on. In case someone else does, I'd suggest a different change than the "keep.special" argument. I think a "coerce=TRUE" argument would be better: if TRUE, you get the current behaviour, which coerces components according to the hierarchy listed on the help page. If FALSE, then no coercion is done, and unlist() just flattens the list into a new one, e.g.

    unlist( list(1, 2, NULL, list("A", "B")), coerce=FALSE)

would return

    list(1, 2, NULL, "A", "B")

instead of

    c("1", "2", "A", "B").

One workaround I thought of was to add an element to the list that couldn't be coerced, but this doesn't work, because the NULL is still dropped. For example:

    e <- environment() # can't be coerced
    x <- list(1, 2, NULL, list("A", "B"), e)
    unlist(x) # Returns list(1,2,"A","B",e)

I think it would be reasonable for the coerce=FALSE version to retain the NULL, since it is not doing any coercion.

Duncan Murdoch
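The flattening without coercion described above can be sketched in a few lines of R. This is an illustration only: 'flatten_list()' is a hypothetical name, there is no 'coerce' argument in base R's 'unlist()', and for simplicity the sketch drops all names.

```r
# Hypothetical helper, not part of base R: flattens nested lists into a single
# list without coercing the elements to a common type and without dropping NULL.
flatten_list <- function(x) {
  # A non-list element (including NULL) becomes a one-element list
  if (!is.list(x)) return(list(x))
  parts <- unname(lapply(x, flatten_list))
  # c() on lists concatenates their elements; the leading list() makes the
  # result an empty list, rather than NULL, when x has no elements
  do.call(c, c(list(list()), parts))
}

x <- list(1, 2, NULL, list("A", "B"))
flat <- flatten_list(x)   # a list of 5 elements: 1, 2, NULL, "A", "B"
coerced <- unlist(x)      # current behaviour: a character vector, NULL dropped
```

Because each element is wrapped in its own list before concatenation, a 'NULL' survives as a list entry, which is the behaviour suggested for 'coerce=FALSE'.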