I have the task of producing some boxplot graphics with the requirement that these have the same general appearance as a set of such graphics that were produced last year. I do not have access to the code that was used to produce the "last year" graphics.

There are multiple boxplots per figure and these boxplots appear in groups (with two boxplots in each group in the simplest instance; there are four or more per group in other instances, but I figure that if I can work out how to handle two, then ....).

After a bit of Googling I found that ggplot() does basically what I want. However my mindset seems to be substantially incompatible with that of ggplot() and I cannot figure out how to make some adjustments which are needed in order to make my plots look like last year's.

In last year's graphics the boxes were unfilled and were distinguished (within groups) by their boundary colours, which were "red" and "black" in the simple two-per-group instance. I achieved the "unfilled" effect by setting alpha=0 inside the call to geom_boxplot(). (Is this the Right Thing to Do?) However I cannot get the boundary colours of the boxes to be "red" and "black".

I have attached a sourceable script ("demo.txt") showing what I have tried so far. I don't really understand the code; I simply copied and adjusted code that I saw on stackoverflow. (Fairly mindlessly I'm afraid.)

Problems:

(1) The borders of the boxes are distinct, but they are sort-of-pink and sort-of-blue, and I cannot for the life of me figure out how to make them red and black.

(2) Putting in "color=Type" seemed to have the effect of creating two legends, one with the desired legend title but all in black, and one with legend title equal to "Type" but using the colours that actually appear. How can I get just one "appropriate" legend?

(3) Last year's graphics have the x-axis starting at 0 (rather than at c. 3.5). I tried using + xlim(0,8.5) but got told "Error: Discrete value supplied to continuous scale". How can I make the appropriate adjustment?

(4) Last year's graphics have y-axis tick marks, labels and grid lines at 700, 800, 900, ..., 2000, 2100. How can I reproduce this?

I actually had several additional questions, but thought I'd better scrounge around a bit more before posting this, and thereby managed (mirabile dictu!) to answer them myself.

Can anyone help me out with questions (1) --- (4)? Please keep it simple and very explicit, for I am a bear of very little brain and long words bother me!

Thanks.

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Attachment: demo.txt
URL: <stat.ethz.ch/pipermail/r-help/attachments/20180728/693e4dbe/attachment-0002.txt>
Once you understand how strongly the data control what ggplot does, using it gets much easier. I still have to google details sometimes though. Note that it can be very difficult to make a weird plot (e.g. multiple parallel axes) in ggplot because it is very internally consistent... a blessing and a curse.

1) Colour is assigned in the scale according to the order of the levels of the factor. Note that while they are both discrete, the so-called "discrete" scales auto-colour, but "manual" scales require you to specify the exact colour sequence.

2) Assigning constants to properties is done outside the mapping (aes). Note that "colour" is for lines and shape outlines, while "fill" is the colour meant to fill in shapes. When the names of these two scales are the same and the values are the same, the legends will merge. If not, they will be shown separately.

3) Discrete scales are controlled by the levels in the data. To prevent ggplot from removing missing levels, use the drop=FALSE argument.

4) Breaks are a property of the scale.

My changes were:

    Year <- factor( rep( 4:8, each = 50, times = 2 ), levels = 0:8 )
    DemoDat <- data.frame(Year = Year, Score = c( X0 , X1 ), Type = Type )

    ggplot( data = DemoDat
          , aes( x = Year, y = Score, color = Type )
          , fill = NULL
          ) +
        geom_boxplot( position = position_dodge(1) ) +
        theme_minimal() +
        scale_colour_manual( name = "National v. Local"
                           , values = c( "red", "black" ) ) +
        scale_x_discrete( drop = FALSE ) +
        scale_y_continuous( breaks = seq( 700, 2100, 100 ) )

Good luck with your graphics grammar!

---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
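As a rough illustration of points 1, 3 and 4 above, here is a self-contained sketch. The script demo.txt is not reproduced in this thread, so the data below are simulated stand-ins, and the Type levels "National" and "Local" are an assumption based on the legend title. The order of the factor levels decides which group gets "red" and which gets "black".

```r
library(ggplot2)

# Hypothetical stand-in for the data built in demo.txt (not available here).
set.seed(1)
DemoDat <- data.frame(
  # Levels 0:8 are declared even though only years 4:8 occur in the data.
  Year  = factor(rep(4:8, each = 50, times = 2), levels = 0:8),
  Score = c(rnorm(250, mean = 1300, sd = 150),
            rnorm(250, mean = 1200, sd = 150)),
  # Assumed level names; the first level gets the first colour in `values`.
  Type  = factor(rep(c("National", "Local"), each = 250),
                 levels = c("National", "Local"))
)

p <- ggplot(DemoDat, aes(x = Year, y = Score, colour = Type)) +
  geom_boxplot(position = position_dodge(1)) +
  # Manual scale: colours are matched to factor levels in order.
  scale_colour_manual(name = "National v. Local",
                      values = c("red", "black")) +
  # Keep the unused levels 0:3 on the x-axis.
  scale_x_discrete(drop = FALSE) +
  # Tick marks, labels and grid lines every 100 units.
  scale_y_continuous(breaks = seq(700, 2100, 100))

print(p)
```

Swapping the order of the levels of Type (or of the colours in `values`) swaps which boxes are drawn in red.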
Dear Jeff,

Thanks very much for this cogent advice and for taking the trouble to steer me in the right direction. However I am not quite out of the woods yet.

(1) I'm still getting two legends. How do I stop this from happening?

(2) The boxes are "filled" (with pinkish and blueish colours --- which are referenced in the second of the two legends that I get). How can I get "unfilled" boxes?

(3) The y-axis scale runs only from 800 to 1800, rather than from 700 to 2100. How can I force it to run from 700 to 2100?

(4) With the modified code we now get some "outliers" (points beyond the whisker tips) plotted --- which I didn't get before (and don't want, because "last year's" graphics did not include outliers). How can I suppress the plotting of outliers?

I have attached a pdf containing the results of running the code that you provided, so that you can readily see what is happening.

May I prevail upon your good graces to enlighten me about questions (1) --- (4) above?

Ever so humbly grateful.

cheers,

Rolf

Attachment: demoPlot.pdf (application/pdf, 5772 bytes)
URL: <stat.ethz.ch/pipermail/r-help/attachments/20180728/ea156d7a/attachment-0002.pdf>
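One possible way to address these four follow-up questions is sketched below. It is not a reply from the thread, and it rests on assumptions: the data are simulated stand-ins for demo.txt, the Type levels "National" and "Local" are guessed from the legend title, and the second legend is assumed to come from a fill mapping left over in the script, which cannot be checked here. With only colour mapped in aes(), ggplot2 draws a single legend.

```r
library(ggplot2)

# Hypothetical stand-in for the data built in demo.txt (not available here).
set.seed(1)
DemoDat <- data.frame(
  Year  = factor(rep(4:8, each = 50, times = 2), levels = 0:8),
  Score = c(rnorm(250, mean = 1300, sd = 150),
            rnorm(250, mean = 1200, sd = 150)),
  Type  = factor(rep(c("National", "Local"), each = 250),
                 levels = c("National", "Local"))
)

# Only `colour` is mapped to Type, so there is one legend (question 1).
p <- ggplot(DemoDat, aes(x = Year, y = Score, colour = Type)) +
  geom_boxplot(
    fill = NA,            # constant set outside aes(): unfilled boxes (question 2)
    outlier.shape = NA,   # do not draw points beyond the whiskers (question 4)
    position = position_dodge(1)
  ) +
  theme_minimal() +
  scale_colour_manual(name = "National v. Local",
                      values = c("red", "black")) +
  scale_x_discrete(drop = FALSE) +
  scale_y_continuous(breaks = seq(700, 2100, 100)) +
  # Force the visible y-range without discarding any data (question 3).
  coord_cartesian(ylim = c(700, 2100))

print(p)
```

coord_cartesian() is used for the y-range because it only zooms the view. Setting limits on the scale itself would discard observations outside the range before the boxplot statistics are computed, which makes no difference when all scores lie inside 700 to 2100 but can change the boxes otherwise.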