Joris Meijerink
2008-Aug-21 07:37 UTC
[R] Boxplot 5% and 95% quantile instead of 25% and 75%
Hi,

I'm new to the whole R thing as a replacement for Matlab, and not disappointed so far ;)

I found out how to make nice-looking boxplots, but I would also like to make a boxplot with the 5% and 95% quantiles instead of the standard 25% and 75% quantiles.

My csv input looks something like:

    LOCATION     FILTER NR  DATE       VALUE  MONTH
    Peelhorst01  1          14-Jan-94  23.07  1
    Peelhorst01  1          28-Jan-94  23.68  1
    Peelhorst01  1          14-Feb-94  23.38  2
    Peelhorst01  1          28-Feb-94  23.27  2
    Peelhorst01  1          14-Mar-94  23.25  3
    Peelhorst01  1          28-Mar-94  23.69  3
    Peelhorst01  1          14-Apr-94  23.63  4
    Peelhorst01  1          28-Apr-94  23.3   4
    Peelhorst01  1          14-May-94  23.14  5
    Peelhorst01  1          28-May-94  23.09  5
    Peelhorst01  1          14-Jun-94  23.06  6
    Peelhorst01  1          28-Jun-94  22.86  6
    Peelhorst01  1          14-Jul-94  22.63  7
    Peelhorst01  1          28-Jul-94  22.48  7
    Peelhorst01  1          14-Aug-94  22.35  8
    Peelhorst01  1          28-Aug-94  22.27  8
    Peelhorst01  1          14-Sep-94  22.21  9
    Peelhorst01  1          28-Sep-94  22.27  9
    Peelhorst01  1          14-Oct-94  22.33  10
    Peelhorst01  1          28-Oct-94  22.28  10
    Peelhorst01  1          14-Nov-94  22.37  11
    Peelhorst01  1          28-Nov-94  22.49  11
    Peelhorst01  1          14-Dec-94  22.56  12
    Peelhorst01  1          28-Dec-94  22.62  12

going on for 13 more years.

I used the following to produce a boxplot:

    z <- boxplot(VALUE ~ MONTH, data = reeks, plot = FALSE)

Then I replace the month numbers with jan, feb etc. using

    z$names <- c('jan','feb','mrt','apr','mei','jun','jul','aug','sep','okt','nov','dec')

and make the boxplot with the bxp function.

Now I was thinking of using the same solution by replacing rows 2 and 4 in z$stats with the results of the quantile function for 5% and 95%, but to be able to calculate that I need the vector of values for only one month, without the other months. How can I do that, or is there an even better/easier solution to my problem?

Kind regards,
Joris
Joris,

I found this (http://ceae.colorado.edu/~balajir/r-session-files/) on the web. It will do exactly what you want. Get the files:

    myboxplot-stats.r
    myboxplot.r
    Leesferry-mon-data.txt   <= example data

The usage is:

    # Boxplots
    # Source the 'myboxplot' codes from Balaji's directory.
    source("myboxplot-stats.r")
    source("myboxplot.r")

    # Define variable flow3
    flow3 <- as.data.frame(flow2)

    # Only one graph per page:
    par(mfrow = c(1, 1))

    # For 12 months all on one graph:
    xs <- 1:12
    zz <- myboxplot(split(t(flow3), xs), plot = FALSE, cex = 1.0)
    zz$names <- rep(" ", length(zz$names))
    z1 <- bxp(zz, ylim = range(flow3, zmean), xlab = "Month",
              ylab = "Monthly Streamflow (cms)", axes = FALSE)
    box()
    axis(1, at = z1, labels = months)
    axis(2)
    points(z1, zmean, lty = 1, lwd = 2, col = "red")
    title(main = "Monthly Boxplots of Streamflow")

I hope this helps. I have a complete example for data I am using if you need something more complete.

Regards,
Tom

--
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177
EMAIL: thomas.adams at noaa.gov
VOICE: 937-383-0528
FAX: 937-383-0033
Frank E Harrell Jr
2008-Aug-21 11:17 UTC
[R] Boxplot 5% and 95% quantile instead of 25% and 75%
There is no reason why a box plot cannot show multiple quantiles. By default, the extended box plot obtained by using the lattice package's bwplot function with the Hmisc package's panel.bpplot function will provide the 0.05 and 0.95 quantiles and several more, plus the mean. Type ?panel.bpplot and example(panel.bpplot) for more information.

--
Frank E Harrell Jr
Professor and Chair
School of Medicine, Department of Biostatistics
Vanderbilt University
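
For readers who want to try the bwplot/panel.bpplot route, here is a minimal sketch. It assumes the lattice and Hmisc packages are installed, and it uses a made-up data frame (named reeks only to match the thread, with random values) because the real data is not shown in full. The panel function is left at its defaults, which according to the message above include the 0.05 and 0.95 quantiles.

```r
library(lattice)
library(Hmisc)

# Synthetic stand-in for the poster's data: 28 values per month (assumption).
set.seed(1)
reeks <- data.frame(
  MONTH = rep(1:12, times = 28),
  VALUE = 23 + rnorm(12 * 28, sd = 0.5)
)

# Label the months as in the original post.
reeks$MONTH <- factor(reeks$MONTH, levels = 1:12,
                      labels = c("jan", "feb", "mrt", "apr", "mei", "jun",
                                 "jul", "aug", "sep", "okt", "nov", "dec"))

# Extended box plot: the grouping factor goes on the left-hand side,
# which gives one horizontal box per month.
p <- bwplot(MONTH ~ VALUE, data = reeks, panel = panel.bpplot)
print(p)
```

The displayed quantiles can be changed through the probs argument documented in ?panel.bpplot.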
Joris Meijerink
2008-Aug-21 11:59 UTC
[R] Betr.: Boxplot 5% and 95% quantile instead of 25% and 75%
Thanks for the nice solutions. I'll look into them because they look much prettier than my current solution. Quick and dirty, I replaced some rows in z$stats with the following:

    # Replace the box and whisker statistics, month by month.
    # Rows of z$stats run from low to high:
    # 1 = lower whisker, 2 = lower hinge, 3 = median, 4 = upper hinge, 5 = upper whisker.
    for (i in 1:12) {
      v <- reeks$VALUE[reeks$MONTH == i]   # values of month i only
      z$stats[1, i] <- min(v, na.rm = TRUE)
      z$stats[2, i] <- quantile(v, probs = 0.05, na.rm = TRUE, names = FALSE)
      z$stats[4, i] <- quantile(v, probs = 0.95, na.rm = TRUE, names = FALSE)
      z$stats[5, i] <- max(v, na.rm = TRUE)
    }

    # Remove outliers (the whiskers now span the full range)
    z$out <- numeric(0)
    z$group <- numeric(0)

This did it for me.

Regards,
Joris
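
The same idea can be written without an explicit loop. The sketch below is one possible variant, not the poster's code: it splits VALUE by MONTH, computes all five statistics per month in a single quantile call, and passes the result to bxp. The data frame is synthetic (random values, 28 per month) and only stands in for the real reeks data.

```r
# Synthetic stand-in for the poster's data (assumption, random values).
set.seed(1)
reeks <- data.frame(
  MONTH = rep(1:12, times = 28),
  VALUE = 23 + rnorm(12 * 28, sd = 0.5)
)

# One vector of values per month.
per_month <- split(reeks$VALUE, reeks$MONTH)

# Five statistics per month, in the low-to-high row order bxp() expects:
# minimum, 5% quantile, median, 95% quantile, maximum.
stats <- sapply(per_month, quantile,
                probs = c(0, 0.05, 0.5, 0.95, 1),
                na.rm = TRUE, names = FALSE)

# Reuse the structure returned by boxplot() and overwrite its statistics.
z <- boxplot(VALUE ~ MONTH, data = reeks, plot = FALSE)
z$stats <- unname(stats)
z$out   <- numeric(0)   # no separate outliers: whiskers reach min and max
z$group <- numeric(0)
z$names <- c("jan", "feb", "mrt", "apr", "mei", "jun",
             "jul", "aug", "sep", "okt", "nov", "dec")

bxp(z, xlab = "Month", ylab = "VALUE")
```

Because split() already yields one vector per month, per_month[["1"]] is also a direct answer to the original question of how to get the values of a single month.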