Hi all,

I am trying to plot a weighted density plot for two different types and want to show the data points on the x axis. The code is as follows. The data points are very concentrated. Is there a better way to present it (should I set the alpha value or something else)?

Thanks!

YL

library(ggplot2)
x <- rnorm(10000)
a <- rnorm(5000)
b <- rnorm(5000)
weights.x <- abs(a/sum(a))
weights.y <- abs(b/sum(b))
weight <- c(weights.x, weights.y)
ze <- rep(0,10000)
type <- c(rep("a",5000), rep("b",5000))
d <- data.frame(expo = x, weight = weight, type = type, ze = ze)
m <- ggplot(d, aes(x = expo, group = type, col = type, weight = weight))
m+geom_density()+geom_point(aes(x = expo, y = ze, shape = type))
Hi Yang,

Strategies for dealing with overplotting include transparency, size, and jittering. In your example you'll probably need all three.

m + geom_point(aes(x = expo, y = ze, shape = type),
               size = 1, alpha = .2,
               position = position_jitter(width = 0, height = 5)) +
  geom_density()

seems to work OK.

Best,
Ista

On Tue, Jul 5, 2011 at 8:46 PM, Yang Lu <Yang.Lu at williams.edu> wrote:
> Hi all,
>
> I am trying to plot a weighted density plot for two different types and want to show the data points on the x axis.
>
> The code is as follows. The data points are very concentrated. Is there a better way to present it( should I set the alpha value or something else)?
>
> Thanks!
>
> YL
>
> library(ggplot2)
> x <- rnorm(10000)
> a <- rnorm(5000)
> b <- rnorm(5000)
> weights.x <- abs(a/sum(a))
> weights.y <- abs(b/sum(b))
> weight <- c(weights.x, weights.y)
> ze <- rep(0,10000)
> type <- c(rep("a",5000), rep("b",5000))
> d <- data.frame(expo = x, weight = weight, type = type, ze = ze)
> m <- ggplot(d, aes(x = expo, group = type, col = type, weight = weight))
> m+geom_density()+geom_point(aes(x = expo, y = ze, shape = type))
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
Joshua Wiley
2011-Jul-06 04:10 UTC
[R] how to best present concentrated data points/ ggplot2
Hi Yang,

I would take a slightly different approach and use what Wilkinson calls stripe density plots. The idea is that if you are trying to show a univariate density on dimension 1 with many overlapping or extremely close observations, space on dimension 1 is precious, whereas space on dimension 2 is abundant. Rather than use things like circles or squares, which take up equal space on dimensions 1 and 2, use something that takes up little space on dimension 1. For human perception you still want the marks to be visible, so extend the space they use on dimension 2. What I just described (in probably the most obfuscated possible way) are lines. Also, colour is sufficient to distinguish the types, so I did not bother with different line types. Here is an example:

library(ggplot2)
set.seed(10)
x <- rnorm(10000)
a <- rnorm(5000)
b <- rnorm(5000)
weights.x <- abs(a/sum(a))
weights.y <- abs(b/sum(b))
weight <- c(weights.x, weights.y)
type <- c(rep("a", 5000), rep("b", 5000))
## make it so different types of points do not overlap
ze <- c(rep(0, 5000), rep(-.5, 5000))
d <- data.frame(expo = x, weight = weight, type = type, ze = ze)
m <- ggplot(d, aes(x = expo, group = type, col = type, weight = weight))
## note, with this many observations and alpha, the plot may be slow
m + geom_density() +
  geom_linerange(aes(x = expo, ymin = ze - .1, ymax = ze + .1), alpha = .25)

HTH,
Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/
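For readers who want the same thin-vertical-line idea without hand-picking ymin and ymax, ggplot2's geom_rug() draws short tick marks along an axis. The following is only a sketch of that alternative, not something proposed in the thread: it reuses the simulated data from the examples above, and the alpha value and the choice to facet by type (so the two sets of ticks do not overlap) are assumptions made for illustration.

```r
library(ggplot2)

set.seed(10)

# Same simulated data as in the thread
x <- rnorm(10000)
a <- rnorm(5000)
b <- rnorm(5000)
weight <- c(abs(a / sum(a)), abs(b / sum(b)))
type <- rep(c("a", "b"), each = 5000)
d <- data.frame(expo = x, weight = weight, type = type)

# Density curve per type, plus a rug of semi-transparent ticks on the
# bottom axis; alpha = 0.1 is an arbitrary starting point to tune by eye.
p <- ggplot(d, aes(x = expo, colour = type)) +
  geom_density(aes(weight = weight)) +
  geom_rug(sides = "b", alpha = 0.1) +
  facet_wrap(~ type, ncol = 1)  # one panel per type so the rugs do not overlap

print(p)
```

As with the linerange version, drawing 10,000 semi-transparent marks can be slow on some graphics devices.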