Hi everyone,

it looks like geom_ribbon removes missing values and plots a single ribbon over the whole interval of x values. However, I'd rather have it act like geom_line, that is, interrupt the ribbon for the interval of missing values and continue once there are new values. Here's an example:

library(ggplot2)
df <- data.frame(
  date = seq(from = as.Date("2010-05-15"),
             to = as.Date("2010-05-24"),
             by = "1 day"),
  low = c(4, 5, 4, 5, NA, NA, 4, 5, 4, 5),
  mid = c(8, 9, 8, 9, NA, NA, 8, 9, 8, 9),
  high = c(12, 13, 12, 13, NA, NA, 12, 13, 12, 13))
ggplot(df, aes(x = date, y = mid, ymin = low, ymax = high)) +
  geom_line() +
  geom_ribbon(fill = alpha("blue", 0.5))

When running this code, R tells me:

Warning message:
Removed 2 rows containing missing values (geom_ribbon).

When you look at the graph, you can see that the line stops at May 18 and starts again on May 21. But the ribbon reaches from May 15 to 24, even though there are no values on May 19 and 20.

Is there an option that I could set? Or a geom/stat that I should use instead? In my pre-ggplot2 times I used polygon(), but I figured there must be something better in ggplot2 (as there has always been so far).

Thanks,
--Karsten
Hi Karsten,

There's no easy way to do this because behind the scenes geom_ribbon uses grid.polygon.

Hadley

On Sun, May 30, 2010 at 7:26 AM, Karsten Loesing <karsten.loesing at gmx.net> wrote:
> Hi everyone,
>
> it looks like geom_ribbon removes missing values and plots a single
> ribbon over the whole interval of x values. However, I'd rather want it
> to act like geom_line, that is, interrupt the ribbon for the interval of
> missing values and continue once there are new values.
> [...]

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
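As background on the grid side, here is a minimal sketch of how a single polygon grob can hold several separate polygons through its id argument. The coordinates and the helper name `band` are made up for illustration and have nothing to do with ggplot2 internals; the sketch only shows that points sharing an id are closed into one polygon.

```r
library(grid)

# Two runs of x positions with a gap between them (hypothetical values,
# in the default "npc" units of polygonGrob).
x_run1 <- c(0.1, 0.2, 0.3, 0.4)
x_run2 <- c(0.7, 0.8, 0.9)

# Hypothetical helper: outline of a flat band, upper edge from left to
# right followed by the lower edge from right to left.
band <- function(x, lower, upper) {
  list(x = c(x, rev(x)),
       y = c(rep(upper, length(x)), rep(lower, length(x))))
}
b1 <- band(x_run1, 0.3, 0.7)
b2 <- band(x_run2, 0.3, 0.7)

# One grob, two polygons: points that share an id form one polygon.
g <- polygonGrob(
  x = c(b1$x, b2$x),
  y = c(b1$y, b2$y),
  id = rep(c(1, 2), times = c(length(b1$x), length(b2$x))),
  gp = gpar(fill = "grey70", col = NA)
)

# grid.newpage(); grid.draw(g)  # uncomment to draw on the current device
```

Without the id vector, all 14 points would be joined into one polygon that spans the gap, which is the behaviour described in the first message.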
Hi William,

On 6/10/10 2:07 AM, William Dunlap wrote:
> I'm not sure exactly what you want in poly_ids, but
> if x is a vector of numbers that might contain NA's
> and you want a vector of integers that identify each
> run of non-NA's and are NA for each NA then you can get
> it with
>   poly_id <- cumsum(is.na(x)) + 1 # bump count for each NA seen
>   poly_id[is.na(x)] <- NA
> E.g.,
>   > x <- c(1.5, 2.5, NA, 4.5, 5.5, 6.5, NA, 8.5, 9.5, NA, NA, 12.5)
>   > poly_ids <- cumsum(is.na(x)) + 1
>   > poly_ids[is.na(x)] <- NA
>   > rbind(x, poly_ids) # to line up input and output
>            [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
>   x         1.5  2.5   NA  4.5  5.5  6.5   NA  8.5  9.5    NA    NA  12.5
>   poly_ids  1.0  1.0   NA  2.0  2.0  2.0   NA  3.0  3.0    NA    NA   5.0

Great! That's exactly what I want in poly_ids. Thanks!

Please find the new patch below. I also put a new branch on GitHub that is based on ggplot2 master and that has this patch. Note that I still don't know how to run ggplot2 from sources, so you'll have to trust in my copy-and-paste fu:

http://github.com/kloesing/ggplot2/commit/177e69ae654da074

--- ggplot2-orig 2010-06-06 14:02:25.000000000 +0200
+++ ggplot2 2010-06-10 08:31:02.000000000 +0200
@@ -5044,9 +5044,16 @@
   draw <- function(., data, scales, coordinates, na.rm = FALSE, ...) {
-    data <- remove_missing(data, na.rm,
-      c("x","ymin","ymax"), name = "geom_ribbon")
     data <- data[order(data$group, data$x), ]
+
+    # Instead of removing NA values from the data and plotting a single
+    # polygon, we want to "stop" plotting the polygon whenever we're missing
+    # values and "start" a new polygon as soon as we have new values. We do
+    # this by creating an id vector for polygonGrob that has distinct
+    # polygon numbers for sequences of non-NA values and NA for NA values in
+    # the original data. Example: c(NA, 2, 2, 2, NA, NA, 4, 4, 4, NA)
+    poly_ids <- cumsum(is.na(data$ymin) | is.na(data$ymax)) +1
+    poly_ids[is.na(data$ymin) | is.na(data$ymax)] <- NA
     tb <- with(data,
       coordinates$munch(data.frame(x=c(x, rev(x)), y=c(ymax, rev(ymin))), scales)
@@ -5054,12 +5061,12 @@
     with(data, ggname(.$my_name(), gTree(children=gList(
       ggname("fill", polygonGrob(
-        tb$x, tb$y,
+        tb$x, tb$y, id=c(poly_ids, rev(poly_ids)),
         default.units="native",
         gp=gpar(fill=alpha(fill, alpha), col=NA)
       )),
       ggname("outline", polygonGrob(
-        tb$x, tb$y,
+        tb$x, tb$y, id=c(poly_ids, rev(poly_ids)),
         default.units="native",
         gp=gpar(fill=NA, col=colour, lwd=size * .pt, lty=linetype)
       ))

Best,
--Karsten
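For anyone who wants to check the id logic without building ggplot2, here is a small base-R sketch. The function name `run_ids` is hypothetical and not part of ggplot2 or of the patch; it only wraps the two `poly_ids` lines from the diff and applies them to the low/high columns of the example from the first message.

```r
# Hypothetical helper (not part of ggplot2): give each run of complete
# rows its own id and mark rows with a missing bound as NA.
run_ids <- function(ymin, ymax) {
  missing <- is.na(ymin) | is.na(ymax)
  ids <- cumsum(missing) + 1  # the counter goes up once per NA row
  ids[missing] <- NA          # NA rows belong to no polygon
  ids
}

# Bounds from the example data frame in the first message.
low  <- c(4, 5, 4, 5, NA, NA, 4, 5, 4, 5)
high <- c(12, 13, 12, 13, NA, NA, 12, 13, 12, 13)

ids <- run_ids(low, high)
print(ids)
# ids: 1 1 1 1 NA NA 3 3 3 3

# The patch passes the ids twice, once for the upper edge and once,
# reversed, for the lower edge of the ribbon outline.
outline_ids <- c(ids, rev(ids))
```

The ids are distinct per run but not consecutive, because every NA row bumps the counter (two NA rows in a row skip from 1 to 3). That should be harmless as long as the ids are only used to tell polygons apart.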