Bryan Hanson
2009-Oct-06 15:50 UTC
[R] ggplot2: mapping categorical variable to color aesthetic with faceting
Hello Again...

I'm making a faceted plot of a response on two categorical variables using ggplot2 and having trouble with the coloring. Here is a sample that produces the desired plot:

    compareCats <- function(data, res, fac1, fac2, colors) {
        require(ggplot2)
        p <- ggplot(data, aes(fac1, res)) + facet_grid(. ~ fac2)
        jit <- position_jitter(width = 0.1)
        p <- p + layer(geom = "jitter", position = jit, color = colors)
        print(p)
    }

    test <- data.frame(res = rnorm(100), fac1 = as.factor(rep(c("A", "B"), 50)),
        fac2 = as.factor(rep(c("lrg", "lrg", "sm", "sm"), 25)))

    compareCats(data = test, res = res, fac1 = fac1, fac2 = fac2,
        colors = c("red", "blue"))

Now, if I get away from idealized data where there are the same number of data points per group (25 in this case), I run into problems. So, if you do:

    rem <- runif(5, 1, 100) # randomly remove a few points here and there
    test <- test[-rem,]
    compareCats(data = test, res = res, fac1 = fac1, fac2 = fac2,
        colors = c("red", "blue"))

R throws an error due to a mismatch between the recycling of colors and the actual number of data points:

    Error in `[<-.data.frame`(`*tmp*`, gp, value = list(colour = c("red", :
      replacement element 1 has 2 rows, need 47

I'm new to ggplot2, but have been through the book and the web site enough to know that my problem is "mapping the variable to the aesthetic"; I also know I can either "map" or "set" the colors.

The question, finally: is there a simple/elegant way to map a list of two colors corresponding to A and B onto any random sample size of A and B with faceting? If not, and I must "set" the colors: do I compute the length of all possible combos of A, B with lrg, sm, and then create one long vector of colors for the entire plot? I tried something like this and was not successful, but perhaps could be with more work.
All advice appreciated, Bryan (session info below)

*************
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

> sessionInfo()
R version 2.9.2 (2009-08-24)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      datasets  tools     utils     stats     graphics  grDevices methods
[9] base

other attached packages:
 [1] ggplot2_0.8.3      reshape_0.8.3      proto_0.3-8        mvbutils_2.2.0
 [5] ChemoSpec_1.1      lattice_0.17-25    mvoutlier_1.4      plyr_0.1.8
 [9] RColorBrewer_1.0-2 chemometrics_0.4   som_0.3-4          robustbase_0.4-5
[13] rpart_3.1-45       pls_2.1-0          pcaPP_1.7          mvtnorm_0.9-7
[17] nnet_7.2-48        mclust_3.2         MASS_7.2-48        lars_0.9-7
[21] e1071_1.5-19       class_7.2-48
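
On the "set" route asked about above, one way to avoid counting group sizes by hand is to look the colour up from the factor itself, so the colour vector always has one entry per remaining row. The following is only a base R sketch under assumptions: the names `level_colors` and `point_colors` are hypothetical, and `sample()` is swapped in for the `runif()` indexing in the post so that exactly five distinct rows are dropped.

```r
# Same test data as in the post.
set.seed(1)
test <- data.frame(
  res  = rnorm(100),
  fac1 = factor(rep(c("A", "B"), 50)),
  fac2 = factor(rep(c("lrg", "lrg", "sm", "sm"), 25))
)

# Drop five distinct rows so the groups are no longer balanced
# (assumption: sample() replaces the runif() indexing used in the post).
test <- test[-sample(nrow(test), 5), ]

# Hypothetical lookup table: one colour per level of fac1.
level_colors <- c(A = "red", B = "blue")

# Index the lookup by level name to get one colour per row,
# whatever the group sizes happen to be.
point_colors <- unname(level_colors[as.character(test$fac1)])

# Cross-tabulate levels against colours as a quick sanity check.
print(table(test$fac1, point_colors))
```

Because the lookup is driven by `fac1`, the result stays aligned with the rows however many points are removed. Storing it as a column of `test` would keep it attached to the right rows when the data are split into facets; whether a layer in a given ggplot2 version accepts such a per-row vector as a "set" colour is not something this sketch settles.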
baptiste auguie
2009-Oct-06 16:36 UTC
[R] ggplot2: mapping categorical variable to color aesthetic with faceting
Hi,

I may be missing an important design decision, but could you not have only a single data.frame as an argument of your function? From your example, it seems that the colour can be mapped to the fac1 variable of "data":

    compareCats <- function(data) {
        require(ggplot2)
        p <- ggplot(data, aes(fac1, res, color=fac1)) + facet_grid(. ~ fac2)
        jit <- position_jitter(width = 0.1)
        p <- p + layer(geom = "jitter", position = jit) +
            scale_colour_manual(values=c("red", "blue"))
        print(p)
    }

    test <- data.frame(res = rnorm(100), fac1 = as.factor(rep(c("A", "B"), 50)),
        fac2 = as.factor(rep(c("lrg", "lrg", "sm", "sm"), 25)))

    compareCats(data = test)

    rem <- runif(5, 1, 100) # randomly remove a few points here and there
    last_plot() %+% test[-rem,] # replot with new dataset

HTH,

baptiste
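
The code in this thread targets ggplot2 0.8.3, where `layer(geom = "jitter", ...)` was accepted. For readers on a current ggplot2 release, the same "map, then choose the colours with a manual scale" idea might look roughly like the sketch below. It is an assumption-laden illustration rather than the code from the thread: the function name `compare_cats`, its `colors` argument, and the use of `sample()` to drop rows are all hypothetical, and `geom_jitter()` stands in for the explicit `layer()` call.

```r
library(ggplot2)

# Hypothetical helper: colour is mapped to fac1, and the named vector
# ties each level to a colour regardless of how many rows each group has.
compare_cats <- function(data, colors = c(A = "red", B = "blue")) {
  ggplot(data, aes(x = fac1, y = res, colour = fac1)) +
    geom_jitter(width = 0.1) +
    facet_grid(. ~ fac2) +
    scale_colour_manual(values = colors)
}

# Same test data as in the thread.
set.seed(1)
test <- data.frame(
  res  = rnorm(100),
  fac1 = factor(rep(c("A", "B"), 50)),
  fac2 = factor(rep(c("lrg", "lrg", "sm", "sm"), 25))
)

# Remove five distinct rows (assumption: sample() instead of runif()),
# then build and draw the plot from the unbalanced data.
p <- compare_cats(test[-sample(nrow(test), 5), ])
print(p)
```

Naming the elements of `colors` after the factor levels keeps the A/red and B/blue pairing explicit even if a level is absent from a subset of the data.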