Hi all,

I know this issue has been discussed a few times in the past already, but Martin Maechler suggested in a bug report [1] that I raise it here.

Basically, there is no (easy) way of printing NAs for all variables when calling xtabs() on factors. Passing 'exclude=NULL, na.action=na.pass' works for character vectors, but not for factors.

> test <- data.frame(x=c("a",NA))
> xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
x
a 
1 

> test <- data.frame(x=factor(c("a",NA)))
> xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
x
a 
1 

Even if it's documented, this inconsistency is annoying. When checking data, it is often useful to print all NA values temporarily, without calling addNA() individually on all crossed variables.

Would it make sense to add a new argument similar to table()'s useNA which would behave the same for all input vector types?

Regards

[1] https://bugs.r-project.org/bugzilla/show_bug.cgi?id=14630
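As a rough sketch of the workaround mentioned above, addNA() can be applied to every categorical column in one step before calling xtabs(). The helper name add_na_levels is hypothetical (it is not part of base R or stats), and this is only one possible way to do it, not a substitute for a proper xtabs() argument.

```r
# Hypothetical helper (not part of base R): turn missing values into an
# explicit <NA> level in every factor or character column of a data frame.
add_na_levels <- function(df) {
  is_cat <- vapply(df,
                   function(col) is.factor(col) || is.character(col),
                   logical(1))
  # addNA() converts non-factors to factors first; ifany = TRUE adds the
  # NA level only to columns that actually contain missing values.
  df[is_cat] <- lapply(df[is_cat], addNA, ifany = TRUE)
  df
}

test <- data.frame(x = factor(c("a", NA)),
                   y = factor(c(NA, "b")))

# With NA as a regular level, the default na.action no longer drops the rows,
# so both observations are counted and <NA> appears on both margins.
tab <- xtabs(~ x + y, data = add_na_levels(test))
print(tab)
```

Because the NA level is added to a copy of the data, the original data frame is left unchanged, which suits the "temporary check" use case described above.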
>>>>> Milan Bouchet-Valat <nalimilan at club.fr>
>>>>>     on Thu, 19 Jan 2017 13:58:31 +0100 writes:

> Hi all,
> I know this issue has been discussed a few times in the past already,
> but Martin Maechler suggested in a bug report [1] that I raise it here.
>
> Basically, there is no (easy) way of printing NAs for all variables
> when calling xtabs() on factors. Passing 'exclude=NULL,
> na.action=na.pass' works for character vectors, but not for factors.

[ yes, but your example below is *not* showing that ... so may be
  a bit confusing !]  {Reason: stringsAsFactors etc}

> > test <- data.frame(x=c("a",NA))
> > xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
> x
> a 
> 1 
>
> > test <- data.frame(x=factor(c("a",NA)))
> > xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
> x
> a 
> 1 
>
> Even if it's documented, this inconsistency is annoying. When checking
> data, it is often useful to print all NA values temporarily, without
> calling addNA() individually on all crossed variables.

{Note this is not (just) about print()ing; the issue is
 about the resulting *object*.}

> Would it make sense to add a new argument similar to table()'s useNA
> which would behave the same for all input vector types?

You have to be aware that table() has been changed since R 3.3.2, i.e., is different in R-devel and hence will be different in R 3.4.0.

table()'s handling of NAs has become very involved / sophisticated(*), and currently I'd rather like to keep xtabs()'s behavior much simpler.

Interestingly, after starting to play with data containing NA's and
  xtabs(*, na.action=na.pass)
I have already detected bugs (for sparse=TRUE) and cases where the current xtabs() behavior seems dubious to me.
So, the issue is --- as so often --- more involved than assumed initially.

We (R core) will probably do something, but do need more time before we can promise anything more...

Thank you for raising the issue,
Martin Maechler, ETH Zurich

*) R-devel sources always current at
   https://svn.r-project.org/R/trunk/src/library/base/R/table.R

> Regards

> [1] https://bugs.r-project.org/bugzilla/show_bug.cgi?id=14630
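
For comparison, here is a minimal sketch of the useNA argument of table() that the proposal refers to. In this simple one-variable case it should give the same counts for a character vector and for a factor; the variable names are illustrative only. More involved cases (factors that already carry an NA level, or exclude combined with useNA) are where the table() changes mentioned above come into play, and they are not covered here.

```r
x_chr <- c("a", NA)           # character vector with a missing value
x_fac <- factor(c("a", NA))   # factor with a missing value (no NA level)

# useNA = "ifany" adds an <NA> category whenever missing values are present,
# regardless of whether the input is a character vector or a factor.
tab_chr <- table(x_chr, useNA = "ifany")
tab_fac <- table(x_fac, useNA = "ifany")

print(tab_chr)
print(tab_fac)
```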

On Friday 20 January 2017 at 18:59 +0100, Martin Maechler wrote:
> >>>>> Milan Bouchet-Valat <nalimilan at club.fr>
> >>>>>     on Thu, 19 Jan 2017 13:58:31 +0100 writes:
>
> > Hi all,
> > I know this issue has been discussed a few times in the past already,
> > but Martin Maechler suggested in a bug report [1] that I raise it here.
> >
> > Basically, there is no (easy) way of printing NAs for all variables
> > when calling xtabs() on factors. Passing 'exclude=NULL,
> > na.action=na.pass' works for character vectors, but not for factors.
>
> [ yes, but your example below is *not* showing that ... so may be
>   a bit confusing !]  {Reason: stringsAsFactors etc}

Yes, sorry, that illustrates why one should never try to make an example prettier at the last minute. For reference, here's the correct example:

> test <- data.frame(x=c("a",NA), stringsAsFactors=FALSE)
> xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
x
   a <NA> 
   1    1 

> test <- data.frame(x=factor(c("a",NA)))
> xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
x
a 
1 

> > > test <- data.frame(x=c("a",NA))
> > > xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
> > x
> > a 
> > 1 
> >
> > > test <- data.frame(x=factor(c("a",NA)))
> > > xtabs(~ x, exclude=NULL, na.action=na.pass, data=test)
> > x
> > a 
> > 1 
> >
> > Even if it's documented, this inconsistency is annoying. When checking
> > data, it is often useful to print all NA values temporarily, without
> > calling addNA() individually on all crossed variables.
>
> {Note this is not (just) about print()ing; the issue is
>  about the resulting *object*.}
>
> > Would it make sense to add a new argument similar to table()'s useNA
> > which would behave the same for all input vector types?
>
> You have to be aware that table() has been changed since R
> 3.3.2, i.e., is different in R-devel and hence will be different
> in R 3.4.0.
> table()'s handling of NAs has become very involved /
> sophisticated(*), and currently I'd rather like to keep
> xtabs()'s behavior much simpler.
>
> Interestingly, after starting to play with data containing NA's and
>   xtabs(*, na.action=na.pass)
> I have already detected bugs (for sparse=TRUE) and cases where
> the current xtabs() behavior seems dubious to me.
> So, the issue is --- as so often --- more involved than assumed initially.
>
> We (R core) will probably do something, but do need more time
> before we can promise anything more...

OK, thanks. Given how long this behavior has existed, there's certainly no hurry...

Regards

> Thank you for raising the issue,
> Martin Maechler, ETH Zurich
>
> *) R-devel sources always current at
>    https://svn.r-project.org/R/trunk/src/library/base/R/table.R
>
> > Regards
>
> > [1] https://bugs.r-project.org/bugzilla/show_bug.cgi?id=14630