Hi all,

I have a question regarding the boxplot function. The data I am working on has one grouping variable (G) and many numerical variables (V1, V2, V3, V4, ..., Vx). What I would like to do is create a boxplot where the Y-axis represents the numerical values of variables V1...Vx (all the variables have the same range). The X-axis needs to represent the G-V combination. So suppose the possible values for G are a, b and c. Then along the x-axis there would be a boxplot for each of the combinations:

    V1Ga, V1Gb, V1Gc, V2Ga, V2Gb, V2Gc, V3Ga, V3Gb, V3Gc, ..., VxGa, VxGb, VxGc

i.e. all values of V1 where G is a, all values of V1 where G is b, etc. In addition, if possible, it would be nice if each G value had a different colour on the plot so that they could be seen more clearly.

I'm not sure whether such a function already exists within R or whether it would have to be written. Either way, I would appreciate it very much if somebody could help and give me some advice as to how I can achieve this.

Many thanks,
Rishabh
On Fri, 14 Mar 2003, Rishabh Gupta wrote:

> I have a question regarding the boxplot function. [...]

I'm going to work with a data frame that has two variables and a binary grouping factor:

    df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), g = rep(0:1, 50))

There are at least two ways to do this.

boxplot() will take a list of vectors and draw a boxplot of each, so we can split() each of the vectors by group

    lapply(df[, 1:2], function(v) split(v, df$g))

and then combine them into a single list with do.call("c", ) and boxplot() them. That is:

    boxplot(do.call("c", lapply(df[, 1:2], function(v) split(v, df$g))))

This labels the x-axis "x1.0", "x1.1", "x2.0", "x2.1".

We can also do the opposite: combine the vectors into a single variable, add a new factor indicating which vector each observation came from, and use boxplot() with a formula. Here reshape() stacks x1 and x2 into one column (named x1 by default) and adds a 'time' column recording which variable each value came from:

    ddf <- reshape(df, varying = list(x = c("x1", "x2")), direction = "long")
    boxplot(x1 ~ interaction(time, g), data = ddf)

This labels the x-axis "1.0", "2.0", "1.1", "2.1".

-thomas
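As a rough sketch of how the first approach above might extend to the three-level case in the question, with one colour per G value: the data frame `df3`, its column names and the colour choices are all made up for illustration, and the only thing relied on beyond the code above is that boxplot() recycles 'col' across the boxes.

```r
# Hypothetical data: two numeric variables and a three-level grouping factor.
set.seed(1)
df3 <- data.frame(V1 = rnorm(90), V2 = rnorm(90),
                  G = factor(rep(c("a", "b", "c"), 30)))

# Split every numeric column by G, then flatten into one named list,
# giving one list element per variable/group combination.
boxes <- do.call("c", lapply(df3[c("V1", "V2")], function(v) split(v, df3$G)))

# The three colours are recycled, so each G level keeps the same colour
# within every variable's block of boxes.
boxplot(boxes, col = c("yellow", "orange", "lightblue"))
```

The boxes come out in the order V1.a, V1.b, V1.c, V2.a, ..., which is the ordering asked for in the original question.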
>-----Original Message-----
>From: r-help-bounces at stat.math.ethz.ch
>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Rishabh Gupta
>Sent: Friday, March 14, 2003 4:10 PM
>To: r-help at stat.math.ethz.ch
>Subject: [R] boxplots with multiple numerical variables
>
>I have a question regarding the boxplot function. [...]

If I understand correctly, you basically want to generate a grouped boxplot, where each numeric variable is shown as a group of three boxes (a, b and c) across the x-axis.

The final example in ?boxplot does this, though for groups of two, and it uses the formula method. You would just need to modify the code in the example by adding a third call to boxplot() with the 'add = TRUE' argument set, as in the second call.

Note the use of the 'at' argument, which enables you to pick mid-points for the groups and then draw the boxplots at offsets from the mid-points. In the case of groups of three, you would draw one set of boxes at the mid-points and the other two at the mid-points minus and plus an offset. You then also use the 'boxwex' argument to adjust the width of the boxplots to fit the groupings.

Finally, for the colors, you can use a different 'col' argument in each call to boxplot(), which would yield three colors, one for each of the three boxplots in each grouping.

HTH,

Marc Schwartz
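The steps described above could be sketched roughly as follows. This is only an illustration under assumptions: the data frame `dat`, the offset of 0.25, the 'boxwex' value and the colours are invented here, and a loop over the levels of G stands in for writing out the three boxplot() calls by hand.

```r
# Hypothetical data: three numeric variables and a three-level group G.
set.seed(1)
dat <- data.frame(V1 = rnorm(90), V2 = rnorm(90), V3 = rnorm(90),
                  G = factor(rep(c("a", "b", "c"), 30)))

vars   <- c("V1", "V2", "V3")
groups <- levels(dat$G)
mids   <- seq_along(vars)        # one mid-point per numeric variable
offs   <- c(-0.25, 0, 0.25)      # offset from the mid-point for a, b, c
cols   <- c("yellow", "orange", "lightblue")

# One boxplot() call per G level; calls after the first use add = TRUE.
for (i in seq_along(groups)) {
  boxplot(dat[dat$G == groups[i], vars],
          at = mids + offs[i], boxwex = 0.2, col = cols[i],
          add = i > 1, xaxt = "n",
          xlim = c(0.5, length(vars) + 0.5), ylim = range(dat[vars]))
}

# Label each group of three boxes once, at its mid-point.
axis(1, at = mids, labels = vars)
legend("topright", legend = groups, fill = cols, title = "G")
```

Suppressing the x-axis in each call (xaxt = "n") and drawing it once at the mid-points avoids three overlapping sets of tick labels.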