Joris Meijerink
2008-Aug-21 07:37 UTC
[R] Boxplot 5% and 95% quantile instead of 25% and 75%
Hi,
I'm new to the whole R thing as a replacement for Matlab, and not disappointed
so far ;)
I found out how to make nice-looking boxplots, but I would also like to make a
boxplot with the 5% and 95% quantiles instead of the standard 25% and 75% quantiles.
My csv input looks something like:
LOCATION FILTER NR DATE VALUE MONTH
Peelhorst01 1 14-Jan-94 23.07 1
Peelhorst01 1 28-Jan-94 23.68 1
Peelhorst01 1 14-Feb-94 23.38 2
Peelhorst01 1 28-Feb-94 23.27 2
Peelhorst01 1 14-Mar-94 23.25 3
Peelhorst01 1 28-Mar-94 23.69 3
Peelhorst01 1 14-Apr-94 23.63 4
Peelhorst01 1 28-Apr-94 23.3 4
Peelhorst01 1 14-May-94 23.14 5
Peelhorst01 1 28-May-94 23.09 5
Peelhorst01 1 14-Jun-94 23.06 6
Peelhorst01 1 28-Jun-94 22.86 6
Peelhorst01 1 14-Jul-94 22.63 7
Peelhorst01 1 28-Jul-94 22.48 7
Peelhorst01 1 14-Aug-94 22.35 8
Peelhorst01 1 28-Aug-94 22.27 8
Peelhorst01 1 14-Sep-94 22.21 9
Peelhorst01 1 28-Sep-94 22.27 9
Peelhorst01 1 14-Oct-94 22.33 10
Peelhorst01 1 28-Oct-94 22.28 10
Peelhorst01 1 14-Nov-94 22.37 11
Peelhorst01 1 28-Nov-94 22.49 11
Peelhorst01 1 14-Dec-94 22.56 12
Peelhorst01 1 28-Dec-94 22.62 12
going on for 13 more years
I used the following to produce a boxplot:
z <- boxplot(VALUE ~ MONTH, data = reeks,
plot = FALSE
)
Then I replace the month numbers with jan, feb, etc. using
z$names <-
c('jan','feb','mrt','apr','mei','jun','jul','aug','sep','okt','nov','dec')
and draw the boxplot with the bxp function.
Now I was thinking of using the same approach, replacing rows 2 and 4 of
z$stats with the results of the quantile function for 5% and 95%. To
calculate those, though, I need the values of one month at a time, without the
other months.
How can I do that, or is there a better/easier solution to my problem?
kind regards
Joris
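One way to get at the values of a single month, sketched here on made-up data: split() breaks VALUE into one vector per MONTH, and sapply() then applies quantile() to each vector. The data frame `reeks` below is a synthetic stand-in that only assumes the VALUE and MONTH columns from the post; the random numbers are purely illustrative.

```r
# Synthetic stand-in for the poster's data frame (assumption: two columns,
# VALUE and MONTH, 14 years with two readings per month).
set.seed(1)
reeks <- data.frame(
  MONTH = rep(rep(1:12, each = 2), times = 14),
  VALUE = round(rnorm(12 * 2 * 14, mean = 23, sd = 0.5), 2)
)

# One numeric vector per month; by_month[["3"]] holds the March values only.
by_month <- split(reeks$VALUE, reeks$MONTH)

# 5% and 95% quantiles per month: a 2 x 12 matrix, one column per month.
q <- sapply(by_month, quantile, probs = c(0.05, 0.95), na.rm = TRUE)
print(round(q, 2))
```

The same single-month vector can also be pulled out directly with reeks$VALUE[reeks$MONTH == 3].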
Joris,
I found this (http://ceae.colorado.edu/~balajir/r-session-files/) on the web. It
will do exactly what you want. Get the files:
myboxplot-stats.r
myboxplot.r
Leesferry-mon-data.txt <= example data
The usage is:
# Boxplots
# Source the "myboxplot" code from Balaji's directory.
# Note: flow2, zmean and months are not defined in this excerpt.
source("myboxplot-stats.r")
source("myboxplot.r")
# Define variable flow3
flow3 <- as.data.frame(flow2)
# Only one graph per page:
par(mfrow = c(1, 1))
# For 12 months all on one graph:
xs <- 1:12
zz <- myboxplot(split(t(flow3), xs), plot = FALSE, cex = 1.0)
zz$names <- rep(" ", length(zz$names))
z1 <- bxp(zz, ylim = range(flow3, zmean), xlab = "Month",
          ylab = "Monthly Streamflow (cms)", axes = FALSE)
box()
axis(1, at = z1, labels = months)
axis(2)
points(z1, zmean, lty = 1, lwd = 2, col = "red")
title(main = "Monthly Boxplots of Streamflow")
I hope this helps. I have a complete example for data I am using if you
need something more complete.
Regards,
Tom
--
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177
EMAIL: thomas.adams at noaa.gov
VOICE: 937-383-0528
FAX: 937-383-0033
Frank E Harrell Jr
2008-Aug-21 11:17 UTC
[R] Boxplot 5% and 95% quantile instead of 25% and 75%
There is no reason why a box plot cannot show multiple quantiles.
By default, the extended box plot obtained by using the lattice package's
bwplot function with the Hmisc package's panel.bpplot function will provide
the 0.05 and 0.95 quantiles and several more, plus the mean. Type
?panel.bpplot and example(panel.bpplot) for more information.
--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
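A minimal sketch of that suggestion, assuming the lattice and Hmisc packages are installed. The data frame `reeks` here is synthetic and only mimics the VALUE and MONTH columns from the original post; the call follows the pattern documented in ?panel.bpplot, with the grouping factor on the left of the formula.

```r
# Assumes lattice and Hmisc are installed; nothing is drawn otherwise.
if (requireNamespace("lattice", quietly = TRUE) &&
    requireNamespace("Hmisc", quietly = TRUE)) {
  library(lattice)
  library(Hmisc)

  # Synthetic stand-in for the poster's data (illustrative values only).
  set.seed(7)
  reeks <- data.frame(
    MONTH = factor(rep(month.abb, times = 28), levels = month.abb),
    VALUE = rnorm(12 * 28, mean = 23, sd = 0.5)
  )

  # With the grouping factor on the left, panel.bpplot draws one horizontal
  # extended box plot per month, using its default set of quantiles.
  p <- bwplot(MONTH ~ VALUE, data = reeks, panel = panel.bpplot)
  print(p)
}
```

The probs argument of panel.bpplot can be used to choose a different set of quantiles; see ?panel.bpplot for its exact meaning.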
Joris Meijerink
2008-Aug-21 11:59 UTC
[R] Re: Boxplot 5% and 95% quantile instead of 25% and 75%
Thanks for the nice solutions. I'll look into them, because they look much
prettier than my current solution. Quick and dirty, I replaced some rows in
z$stats with the following:
# Replace the hinges with the 5% and 95% quantiles and the whiskers with the
# minimum and maximum. Rows of z$stats run from low to high: 1 = lower whisker,
# 2 = lower hinge, 3 = median, 4 = upper hinge, 5 = upper whisker.
for (i in 1:12) {
  v <- reeks$VALUE[reeks$MONTH == i]  # values of month i only
  z$stats[1, i] <- min(v, na.rm = TRUE)
  z$stats[2, i] <- quantile(v, probs = 0.05, na.rm = TRUE, names = FALSE)
  z$stats[4, i] <- quantile(v, probs = 0.95, na.rm = TRUE, names = FALSE)
  z$stats[5, i] <- max(v, na.rm = TRUE)
}
# Remove outliers
z$out <- numeric(0)
z$group <- numeric(0)
This did it for me.
regards,
Joris
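For reference, the whole approach can be sketched end to end without a loop. This is only an illustration under assumptions: `reeks` is a synthetic stand-in with the VALUE and MONTH columns from the post, and the English month labels from month.abb replace the Dutch ones used earlier in the thread.

```r
# Synthetic stand-in for the poster's data (assumption: VALUE and MONTH only).
set.seed(42)
reeks <- data.frame(
  MONTH = rep(1:12, times = 28),
  VALUE = rnorm(12 * 28, mean = 23, sd = 0.5)
)

# Let boxplot() build the list structure that bxp() expects, without drawing.
z <- boxplot(VALUE ~ MONTH, data = reeks, plot = FALSE)

# Overwrite all five rows at once: min, 5%, median, 95%, max per month.
z$stats <- unname(sapply(split(reeks$VALUE, reeks$MONTH), quantile,
                         probs = c(0, 0.05, 0.5, 0.95, 1),
                         na.rm = TRUE, names = FALSE))

# Whiskers now reach the extremes, so no points are left to draw as outliers.
z$out <- numeric(0)
z$group <- numeric(0)
z$names <- month.abb

bxp(z, xlab = "Month", ylab = "VALUE")
```

Because the whiskers no longer follow the usual 1.5 x IQR rule, it is worth stating in the caption or axis label that the box spans the 5% to 95% quantiles.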