Janko Thyson
2011-May-19 12:15 UTC
[Rd] Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'
Dear list,
I hope this is the right place to post a feature request. If there
exists a more formal channel (e.g. as for bug reports), I'd appreciate a
pointer.
I work a lot with named nested lists with arbitrary degrees of
"nestedness". To retrieve the names and/or values of the "bottom
layer/bottom tier", I love the functionality of 'unlist()' or
'names(unlist(x))', respectively, as it avoids traversing the nested
lists via recursive loop constructs. I'm also aware that the general
suggestion is probably to keep nestedness as low as possible when
working with lists, but arbitrarily deeply nested lists have come in
quite handy for me, as long as each element is named and I can
quickly add and retrieve element values via "name paths".
Here's a little example list:
lst <- list(a=list(a.1=list(a.1.1=NA, a.1.2=5), a.2=list()), b=NULL)
It would be awesome if 'unlist(x)' could be extended with the following
functionality:
1) An argument such as 'delim' that controls how the respective layer
names are pasted.
Right now, they are always separated by a dot:
> names(unlist(lst))
[1] "a.a.1.a.1.1" "a.a.1.a.1.2"
Desired:
> names(unlist(lst, delim="/"))
[1] "a/a.1/a.1.1" "a/a.1/a.1.2"
> names(unlist(lst, delim="_"))
[1] "a_a.1_a.1.1" "a_a.1_a.1.2"
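Until something like 'delim' exists, the desired names can be approximated in plain R. The following is only a minimal sketch: 'leaf_paths()' is a hypothetical helper, not part of base R, and it does use the recursion that 'unlist()' avoids, so it will be slower on large lists. It assumes every element is named and returns one path per non-empty leaf (unlike 'unlist()', it does not number the entries of leaves longer than one).

```r
# Hypothetical helper (not part of base R): collect the "name path" of every
# non-empty leaf, joining the layer names with 'delim'.
leaf_paths <- function(x, delim = "/", prefix = NULL) {
  paths <- character()
  for (i in seq_along(x)) {
    el <- x[[i]]
    nm <- names(x)[i]
    path <- if (is.null(prefix)) nm else paste(prefix, nm, sep = delim)
    if (is.list(el)) {
      # descend one layer; an empty list contributes nothing
      paths <- c(paths, leaf_paths(el, delim, path))
    } else if (length(el) > 0) {
      # NULL and zero-length leaves are skipped, as unlist() drops them too
      paths <- c(paths, path)
    }
  }
  paths
}

lst <- list(a = list(a.1 = list(a.1.1 = NA, a.1.2 = 5), a.2 = list()), b = NULL)
leaf_paths(lst, delim = "/")
leaf_paths(lst, delim = "_")
```

For the example list, the two calls give the names shown under "Desired" above.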
2) An argument that allows elements of zero length or of value NULL to
be *included* in the resulting output.
Right now, they are dropped (which makes perfect sense as NULL values
and zero length values are dropped in vectors):
> c(1,2, NULL, numeric())
[1] 1 2
> unlist(lst)
a.a.1.a.1.1 a.a.1.a.1.2
NA 5
Desired:
> unlist(lst, delim="/", keep.special=TRUE)
$a/a.1/a.1.1
[1] NA
$a/a.1/a.1.2
[1] 5
$a/a.2
list()
$b
NULL
Of course, this would not be a true 'unlist' anymore, but something like
'retrieveBottomLayer()'.
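A rough R-level approximation of this second request could look as follows. It is a sketch under assumptions, not a proposal for the actual implementation: 'flatten_keep()' is a made-up name, every element is assumed to be named, and the result is always a list (no coercion to an atomic vector takes place).

```r
# Hypothetical helper (not part of base R): flatten a named nested list into
# a one-level list, keeping NULL and zero-length elements as bottom-layer
# entries and joining the layer names with 'delim'.
flatten_keep <- function(x, delim = "/", prefix = NULL) {
  out <- list()
  for (i in seq_along(x)) {
    el <- x[[i]]
    nm <- names(x)[i]
    path <- if (is.null(prefix)) nm else paste(prefix, nm, sep = delim)
    if (is.list(el) && length(el) > 0) {
      # non-empty list: descend one layer
      out <- c(out, flatten_keep(el, delim, path))
    } else {
      # wrap in list() so that NULL and list() survive the concatenation
      out <- c(out, stats::setNames(list(el), path))
    }
  }
  out
}

lst <- list(a = list(a.1 = list(a.1.1 = NA, a.1.2 = 5), a.2 = list()), b = NULL)
res <- flatten_keep(lst, delim = "/")
names(res)
```

The wrapping in 'list()' is the important detail: 'c()' on lists keeps a NULL element, whereas assigning NULL via 'out[[path]] <- NULL' would delete it.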
Thanks a lot for providing such fast stuff as 'unlist()'! Unfortunately,
I don't know my way around internal C routines and therefore I would
greatly appreciate it if core team developers would consider my two
suggestions.
Best regards,
Janko
--
------------------------------------------------------------------------
*Janko Thyson*
janko.thyson@ku-eichstaett.de <mailto:janko.thyson@ku-eichstaett.de>
Catholic University of Eichstätt-Ingolstadt
Ingolstadt School of Management
Statistics and Quantitative Methods
Auf der Schanz 49
D-85049 Ingolstadt
www.wfi.edu/lsqm <http://www.wfi.edu/lsqm>
Fon: +49 841 937-1923
Fax: +49 841 937-1965
This e-mail and any attachment is for authorized use by the intended
recipient(s) only. It may contain proprietary material, confidential
information and/or be subject to legal privilege. It should not be
copied, disclosed to, retained or used by any other party.
If you are not an intended recipient then please promptly delete this
e-mail and any attachment and all copies and inform the sender.
--
------------------------------------------------------------------------
*Janko Thyson*
janko.thyson@googlemail.com <mailto:janko.thyson@googlemail.com>
Jesuitenstraße 3
D-85049 Ingolstadt
Mobile: +49 (0)176 83294257
Duncan Murdoch
2011-May-19 12:58 UTC
[Rd] Feature request: extend functionality of 'unlist()' by args 'delim=c("/", "_", etc.)' and 'keep.special=TRUE/FALSE'
On 19/05/2011 8:15 AM, Janko Thyson wrote:
> Dear list,
>
> I hope this is the right place to post a feature request. If there
> exists a more formal channel (e.g. as for bug reports), I'd appreciate a
> pointer.

This is a good place to post.

> I work a lot with named nested lists with arbitrary degrees of
> "nestedness". In order to retrieve the names and/or values of "bottom
> layer/bottom tier", I love the functionality of 'unlist()', or
> 'names(unlist(x))', respectively as it avoids traversing the nested
> lists via recursive loop constructs. I'm also aware that the general
> suggestion is probably to keep nestedness as low as possible when
> working with lists, but arbitrary deeply nested lists came in quite
> handy for me as long as each element is named and as long as I can
> quickly add and retrieve element values via "name paths".
>
> Here's a little example list:
> lst <- list(a=list(a.1=list(a.1.1=NA, a.1.2=5), a.2=list()), b=NULL)
>
> It would be awesome if 'unlist(x)' could be extended with the following
> functionality:
>
> 1) An argument such as 'delim' that controls how the respective layer
> names are pasted.
> Right now, they are always separated by a dot:
> > names(unlist(lst))
> [1] "a.a.1.a.1.1" "a.a.1.a.1.2"
> Desired:
> > names(unlist(lst, delim="/"))
> [1] "a/a.1/a.1.1" "a/a.1/a.1.2"
> > names(unlist(lst, delim="_"))
> [1] "a_a.1_a.1.1" "a_a.1_a.1.2"
>
> 2) An argument that allows to include either elements of zero length or
> of value NULL to be *included* in the resulting output.
> Right now, they are dropped (which makes perfect sense as NULL values
> and zero length values are dropped in vectors):
> > c(1,2, NULL, numeric())
> [1] 1 2
> > unlist(lst)
> a.a.1.a.1.1 a.a.1.a.1.2
> NA 5
> Desired:
> > unlist(lst, delim="/", keep.special=TRUE)
> $a/a.1/a.1.1
> [1] NA
>
> $a/a.1/a.1.2
> [1] 5
>
> $a/a.2
> list()
>
> $b
> NULL
> Of course, this would not be a true 'unlist' anymore, but something like
> 'retrieveBottomLayer()'.
>
> Thanks a lot for providing such fast stuff as 'unlist()'! Unfortunately,
> I don't know my way around internal C routines and therefore I would
> greatly appreciate if core team developers would consider my two
> suggestions.

The suggestions seem reasonable, but are difficult to implement. The
problem is that unlist() is a generic function, but there's no
unlist.default() in R: the default and method dispatch are implemented
at the C level. Normally adding arguments to the default method doesn't
cause problems elsewhere, because methods only need to be compatible
with the generic. But since there's no way to modify the argument list
of the default method in this case, the generic function would need to
be modified, and that means every unlist method would need to be
modified too.

So I wouldn't want to take this on. In case someone else does, I'd
suggest a different change than the "keep.special" argument. I think a
"coerce=TRUE" argument would be better: If TRUE, you get the current
behaviour, which coerces components according to the hierarchy listed on
the help page. If FALSE, then no coercion is done, and unlist() just
flattens the list into a new one, e.g.

unlist( list(1, 2, NULL, list("A", "B")), coerce=FALSE)

would return

list(1, 2, NULL, "A", "B")

instead of

c("1", "2", "A", "B").

One workaround I thought of was to add an element to the list that
couldn't be coerced, but this doesn't work: the NULL is still dropped.
For example:

e <- environment() # can't be coerced
x <- list(1, 2, NULL, list("A", "B"), e)
unlist(x) # Returns list(1,2,"A","B",e)

I think it would be reasonable for this version to retain the NULL,
since it is not doing any coercion.

Duncan Murdoch
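The behaviour described for "coerce=FALSE" can be imitated at the R level. This is a minimal sketch only: 'flatten_list()' is a hypothetical name, 'unlist()' has no 'coerce' argument, and the sketch ignores names and any unlist methods for classed objects. Empty sub-lists vanish here because they contain no elements to carry over.

```r
# Hypothetical helper (not part of base R): flatten nested lists into a
# single-level list without coercing the elements to a common type.
flatten_list <- function(x) {
  out <- list()
  for (el in x) {
    if (is.list(el)) {
      # splice the flattened sub-list into the result
      out <- c(out, flatten_list(el))
    } else {
      # wrap in list() so that a NULL element is retained
      out <- c(out, list(el))
    }
  }
  out
}

x <- list(1, 2, NULL, list("A", "B"))
flat <- flatten_list(x)   # a list of 5 elements, NULL included
coerced <- unlist(x)      # current behaviour: NULL dropped, all character
```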