davison at stats.ox.ac.uk
2008-Aug-20 13:25 UTC
[Rd] unlist on nested lists of factors (PR#12572)
Here is a description and a proposed solution for a bug in unlist().
I've used version 2.7.2 RC (2008-08-18 r46382) to look at this, under
Linux.

unlist(recursive=TRUE) incorrectly returns a factor with zero levels
when passed either a nested list of factors, or a data frame containing
only factor columns. You can't print() the result.

    x <- list(list(v=factor("a")))
    str(unlist(x))
    ## Factor w/ 0 levels: NA
    ## - attr(*, "names")= chr "v"
    ## Warning message:
    ## In str.default(unlist(x)) : 'object' does not have valid levels()

    y <- list(data.frame(v=factor("a")))
    str(unlist(y))
    ## Factor w/ 0 levels: NA
    ## - attr(*, "names")= chr "v"
    ## Warning message:
    ## In str.default(unlist(y)) : 'object' does not have valid levels()

unlist is defined as

    unlist <- function(x, recursive=TRUE, use.names=TRUE)
    {
        if(.Internal(islistfactor(x, recursive))) {
            lv <- unique(.Internal(unlist(lapply(x, levels), recursive, FALSE)))
            nm <- if(use.names) names(.Internal(unlist(x, recursive, use.names)))
            res <- .Internal(unlist(lapply(x, as.character), recursive, FALSE))
            res <- match(res, lv)
            ## we cannot make this ordered as level set may have been changed
            structure(res, levels=lv, names=nm, class="factor")
        } else .Internal(unlist(x, recursive, use.names))
    }

The error occurs because, in both cases, at the C level, islistfactor
recurses and finds that all elements are factors, and the if test
condition is TRUE. However, the two instances of lapply do not recurse,
and return inappropriate results.

A possible solution is to replace both instances of lapply with rapply.
This results in appropriate factor answers in this case:

    str(unlist(x))
    ## Factor w/ 1 level "a": 1
    ## - attr(*, "names")= chr "v"

    str(unlist(y))
    ## Factor w/ 1 level "a": 1
    ## - attr(*, "names")= chr "v"

An alternative is to not return a factor result, by altering the if
test condition so that nested lists of factors, and lists of
factor-only data frames, fail.

Dan

--
www.stats.ox.ac.uk/~davison
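
For illustration, the rapply-based approach described above can be sketched as a standalone function. This is only a sketch under stated assumptions: unlist_factors is a hypothetical name, it is not the patched base function, it uses only public functions rather than .Internal calls, and it assumes every leaf of x is a factor.

```r
## Hypothetical helper, not part of base R: a sketch of the rapply idea.
## Assumes every leaf of 'x' is a factor (no check is made here).
unlist_factors <- function(x, use.names = TRUE)
{
    ## rapply() descends into nested lists, and into data frames,
    ## which are lists underneath; lapply() only visits the top level.
    lv  <- unique(unlist(rapply(x, levels, how = "list"),
                         use.names = FALSE))
    chr <- unlist(rapply(x, as.character, how = "list"),
                  use.names = use.names)
    ## As in unlist(), the result is not ordered, since the level set
    ## may have been changed by merging.
    structure(match(chr, lv), levels = lv, names = names(chr),
              class = "factor")
}

x <- list(list(v = factor("a")))
y <- list(data.frame(v = factor("a")))
str(unlist_factors(x))
str(unlist_factors(y))
```

Applying levels and as.character to the leaves first, and flattening only afterwards, means the flattening step only ever sees character vectors, so the zero-level factor never arises.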