michael westphal
2010-May-09 17:33 UTC
[R] changing parameters of the box and whisker plot
Hello:

I am plotting some data using the box and whisker plot. However, I only want to plot the median, max and min, as I only have these values and not the quartile values. It seems R arbitrarily constructs the box margins to be halfway between the median and the max/min. How do I make the box and whisker plots without the box values?

Thanks,
Michael
On May 9, 2010, at 1:33 PM, michael westphal wrote:

> Hello:
>
> I am plotting some data using the box and whisker plot.

What function?

> However, I only want to plot the median, max and min, as I only
> have these values and not the quartile values.

You could just read the help page for bxp and send it values that yield what you want to see.

?boxplot.stats
?bxp

Or, using lattice::bwplot, you can alter the stats function in panel.bwplot.

?panel.bwplot

> It seems R arbitrarily constructs the box margins to be halfway
> between the median and the max/min.

Not exactly, and not at all arbitrary. Those are at the Q1 and Q3 values.

> How do I make the box and whisker plots without the box values?
>
> Thanks,
>
> Michael

David Winsemius, MD
West Hartford, CT
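The bxp suggestion above can be sketched roughly as follows. This is only an illustration under assumptions: the group names and the min/median/max values are made up, and collapsing the box by setting both hinges equal to the median is one possible choice, not necessarily what was intended in the reply.

```r
# Hypothetical summary values: only min, median and max are known per group.
lo  <- c(A = 2.0, B = 3.5, C = 1.2)
med <- c(A = 5.1, B = 6.3, C = 4.8)
hi  <- c(A = 9.4, B = 8.8, C = 7.9)

# bxp() expects a 5-row stats matrix with one column per group:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
# Setting both hinges to the median collapses the box to a single line,
# so only the median and the min-to-max whiskers are drawn.
z <- list(
  stats = rbind(lo, med, med, med, hi),
  n     = rep(1, length(med)),  # placeholder counts (the true n is unknown here)
  out   = numeric(0),           # no outlying points to draw
  group = numeric(0),
  names = names(med)
)

bxp(z, ylab = "y")
```

If a visible box is still wanted, another option is to set the hinges to the min and max instead, so that the box spans the full range.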
Hi:

Here's one way to get what you want, using the plyr and ggplot2 packages.

# Fake some data
dd <- data.frame(g = factor(rep(LETTERS[1:10], each = 30)), y = rnorm(300))

# Summarize to get min, median and max per group
# Uses function ddply() in the plyr package...
library(plyr)      # attach plyr explicitly; ggplot2 is not guaranteed to load it
library(ggplot2)

# Produce the min, max and median of the response y for each group
dd2 <- ddply(dd, .(g), summarise, m = median(y), ymin = min(y), ymax = max(y))

# Create a pointrange plot:
p <- ggplot(dd2, aes(x = g, y = m, ymin = ymin, ymax = ymax))
p + geom_pointrange()

This plot only requires the min, max and central value (in this case, the median), so it should work for you. Bells and whistles can be added rather easily. In the aes() clause, x refers to the grouping variable, y to the central value, and ymin and ymax to the variables in the data frame (here, dd2) that contain the min and max, respectively. If you already have these, the ddply() statement is unnecessary.

On Sun, May 9, 2010 at 10:33 AM, michael westphal <mi_westphal@yahoo.com> wrote:

> Hello:
>
> I am plotting some data using the box and whisker plot. However, I only
> want to plot the median, max and min, as I only have these values and not
> the quartile values. It seems R arbitrarily constructs the box margins to
> be halfway between the median and the max/min. How do I make the box and
> whisker plots without the box values?
>

R (or any other useful statistical software) does not 'arbitrarily' construct the box margins in a box plot. They are clearly defined to be the first and third quartiles of the distribution. Does the placement of the upper part of the box in the three boxplots coming from the code below look to be 'halfway' between the median and the maximum? (The maximum is not where the upper whisker is positioned, but rather where the highest point is plotted; the exception is where the two coincide.) There are plenty of references concerning the construction of a box plot, in books and online.
boxplot(cbind(rexp(50), rlnorm(50), rgamma(50, 5)), xaxt = 'n')
axis(1, at = 1:3, labels = c('Exponential', 'Lognormal', 'Gamma'))

HTH,
Dennis
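As a numeric companion to the plot above, here is a minimal sketch that checks where the box edges come from. It assumes base R only; the seed and the variable names are arbitrary choices for illustration. It compares the values boxplot.stats() reports with Tukey's five-number summary and with the midpoint between the median and the maximum.

```r
set.seed(42)            # arbitrary seed, only for reproducibility
x <- rexp(50)           # a right-skewed sample, like the first box in the plot above

# stats: lower whisker, lower hinge, median, upper hinge, upper whisker
s    <- boxplot.stats(x)$stats
five <- fivenum(x)      # min, lower hinge, median, upper hinge, max

# The box edges (hinges) and the median are taken from fivenum()
hinges_match <- isTRUE(all.equal(s[2:4], five[2:4]))

# The "halfway" position the original question assumed for the upper box edge
halfway <- (s[3] + max(x)) / 2

print(c(upper_hinge = s[4], halfway = halfway, upper_whisker = s[5], maximum = max(x)))
print(hinges_match)
```

Comparing upper_hinge with halfway, and upper_whisker with maximum, shows directly whether the box edge sits at the midpoint and whether the whisker reaches the largest observation for this particular sample.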