bogdan romocea
2006-Oct-05 15:26 UTC
[R] unexpected behavior of boxplot(x, notch=TRUE, log="y")
A function I've been using for a while returned a surprising [to me,
given the data] error recently:

  Error in plot.window(xlim, ylim, log, asp, ...) :
          Logarithmic axis must have positive limits

After some digging I realized what was going on:

  x <- c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
         42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
         1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
         1, 46492.05, 22334.88, 1, 1)
  summary(x)
  boxplot(x,notch=TRUE,log="y")   #unexpected
  boxplot(x)                      #ok
  boxplot(x,log="y")              #ok
  boxplot(x,notch=TRUE)           #aha

I can get around this, but thought that maybe boxplot() should be
adjusted to deal with something like this on its own.

Thank you,
b.

  platform       i386-pc-mingw32
  arch           i386
  os             mingw32
  system         i386, mingw32
  status
  major          2
  minor          4.0
  year           2006
  month          10
  day            03
  svn rev        39566
  language       R
  version.string R version 2.4.0 (2006-10-03)
Ben Bolker
2006-Oct-07 23:40 UTC
[R] unexpected behavior of boxplot(x, notch=TRUE, log="y")
bogdan romocea <br44114 <at> gmail.com> writes:

> A function I've been using for a while returned a surprising [to me,
> given the data] error recently:
> Error in plot.window(xlim, ylim, log, asp, ...) :
>         Logarithmic axis must have positive limits
>
> After some digging I realized what was going on:
> x <- c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
> 42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
> 1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
> 1, 46492.05, 22334.88, 1, 1)
> summary(x)
> boxplot(x,notch=TRUE,log="y")   #unexpected
> boxplot(x)                      #ok
> boxplot(x,log="y")              #ok
> boxplot(x,notch=TRUE)           #aha
>

Mick Crawley (author of several books on ecological data analysis
in R) submitted a related issue as bug #7690, which I was mildly
surprised to see filed as "not reproducible". (I didn't have problems
reproducing it at the time, and I posted my then-patch to R-devel:
https://stat.ethz.ch/pipermail/r-devel/2006-January/036257.html )

The problem typically occurs for very small data sets, when the
notches can be bigger than the hinges. As I said then,

> I can imagine debate about what should be done in this case --
> you could just say "don't do that", since the notches are based
> on an asymptotic argument ... the diff below just truncates
> the notches to the hinges, but produces a warning saying that the
> notches have been truncated.

The interaction with log="y" is new to me, though, and my old
patch didn't catch it.
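To see why log="y" trips over this, the notch limits for the data in the original post can be inspected directly with boxplot.stats(). This is only a diagnostic sketch. It assumes that the y-axis limits for a notched plot are taken from the notch limits as well as the whiskers, which is consistent with the error reported but is not confirmed in this thread.

```r
# Data from the original post.
x <- c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
       42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
       1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
       1, 46492.05, 22334.88, 1, 1)

s <- boxplot.stats(x)

# stats = lower whisker, lower hinge, median, upper hinge, upper whisker
print(s$stats)

# conf = median -/+ 1.58 * IQR / sqrt(n); for these data the lower notch
# limit comes out negative (roughly -4981), well below the lower hinge of 1.
print(s$conf)

# A negative limit cannot be drawn on a logarithmic axis.
notch_below_hinge <- s$conf[1] < s$stats[2]
notch_nonpositive <- s$conf[1] <= 0
print(c(below_hinge = notch_below_hinge, nonpositive = notch_nonpositive))
```

So the data themselves are all positive; it is only the lower notch limit that falls below zero.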
Here's my reproducible version:

  set.seed(1001)
  npts <- 7
  X <- rnorm(2*npts,rep(c(3,4.5),each=npts),sd=1)
  f <- factor(rep(1:2,each=npts))
  par(mfrow=c(1,2))
  boxplot(X~f,notch=TRUE)

A possible fix is to truncate the notches (and issue a warning)
when this happens, in src/library/grDevices/R/calc.R:

*** calc.R      2006-10-07 17:44:49.000000000 -0400
--- newcalc.R   2006-10-07 19:25:38.000000000 -0400
***************
*** 16,21 ****
--- 16,26 ----
          if(any(out[nna])) stats[c(1, 5)] <- range(x[!out], na.rm = TRUE)
      }
      conf <- if(do.conf) stats[3] + c(-1.58, 1.58) * iqr / sqrt(n)
+     if (do.conf) {
+       if (conf[1]<stats[2] || conf[2]>stats[4]) warning("confidence limits > hinges: notches truncated")
+       conf[1] <- max(conf[1],stats[2])
+       conf[2] <- min(conf[2],stats[4])
+     }
      list(stats = stats, n = n, conf = conf,
           out = if(do.out) x[out & nna] else numeric(0))
  }
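For anyone who wants the same truncation without patching R, a user-level workaround along these lines may be enough. It is a sketch only: clamp_notches() is a hypothetical helper, not part of R or of the patch above. It takes the statistics returned by boxplot(plot = FALSE), truncates the notch limits to the hinges, and hands the result to bxp() for drawing.

```r
# Hypothetical helper (not part of R): truncate notch limits to the hinges
# in the list returned by boxplot(..., plot = FALSE), where stats is a
# 5 x ngroups matrix and conf is a 2 x ngroups matrix.
clamp_notches <- function(b) {
  lo <- pmax(b$conf[1, ], b$stats[2, ])  # lower notch no lower than lower hinge
  hi <- pmin(b$conf[2, ], b$stats[4, ])  # upper notch no higher than upper hinge
  if (any(lo != b$conf[1, ] | hi != b$conf[2, ]))
    warning("confidence limits > hinges: notches truncated")
  b$conf[1, ] <- lo
  b$conf[2, ] <- hi
  b
}

# Data from the original post.
x <- c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
       42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
       1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
       1, 46492.05, 22334.88, 1, 1)

b <- boxplot(x, plot = FALSE)               # compute the statistics only
b <- suppressWarnings(clamp_notches(b))     # truncation is expected here
bxp(b, notch = TRUE, log = "y")             # draw from the adjusted statistics
```

Truncated notches no longer represent the nominal interval around the median, so the warning is worth keeping in real use rather than suppressing it as done here.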