Hi all,

I have already spent a few hours on the following without success.

I want to draw a scatter plot. But given the high dispersion of the points, I would like to bin the x-axis and then, for each bin, plot the quantiles of the y-values of the data points in that bin. I would like:

1. Uniform bin size on the x-axis;
2. Equal number of observations in each bin.

How do I do that in R? I guess that, for the sake of prettiness, I had better do it in ggplot2?

Thank you!
Hi Michael,

Although I do think ggplot2 does a superb job of elegant data visualization, I am not sure any graphics package will do what you want out of the box. I suspect you will first have to do some work binning your data, and then plot in your package of choice.

In the situation you have described, I do not believe both of your criteria can be met. When the x values are highly dispersed, bins of equal width will generally not contain equal numbers of points. Just as a heads up, for a task this specific, I would expect you to spend a few hours more than you already have.

If you are willing to be a bit more flexible in your requirements, there are various binning algorithms in ggplot2 and other packages that you could use to bin the x values, and then plot those against the y quantiles.

You are more likely to get a clear answer from the list if you can provide some sample data and perhaps a few example graphs showing what you hope to achieve. An easy way to provide sample data is to use the dput() function and paste the output into your (plain-text, please) email.

Cheers,

Josh

On Fri, Mar 9, 2012 at 4:37 PM, Michael <comtech.usa at gmail.com> wrote:
> [...]

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
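To make the tension between the two requirements concrete, here is a minimal sketch in base R. It is not from the thread: the simulated data, the seed, the choice of four bins, and the names `dat`, `width_bins` and `count_bins` are all assumptions made for illustration.

```r
# Hypothetical sample data (not from the original post): skewed x, noisy y
set.seed(1)
dat <- data.frame(x = rexp(200), y = rnorm(200))

# Requirement 1: four bins of equal width on x
width_bins <- cut(dat$x, breaks = 4)

# Requirement 2: four bins holding equal numbers of points,
# using the quartiles of x as break points
count_bins <- cut(dat$x,
                  breaks = quantile(dat$x, probs = seq(0, 1, 0.25)),
                  include.lowest = TRUE)

table(width_bins)  # equal widths, but very unequal counts
table(count_bins)  # equal counts, but unequal widths

# A small, pasteable sample for a plain-text email
dput(head(dat, 5))
```

With skewed x values like these, the two tables differ markedly, which is why one of the two requirements usually has to give.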
R. Michael Weylandt
2012-Mar-10 01:07 UTC
[R] How do I do a pretty scatter plot using ggplot2?
That doesn't really make sense to me as a graphical representation (it transforms adjacent y values differently), but if you really want to do so, here's what I'd do, assuming I understand your goal. The preprocessing is independent of the graphics engine:

    DAT <- data.frame(x = runif(1000, 0, 20), y = rcauchy(1000)^2)  # Nice and volatile!

    # split y based on some x binning and assign empirical quantiles of each group
    DAT$yquant <- with(DAT, ave(y, cut(x, seq(0, 20, 5)), FUN = function(x) ecdf(x)(x)))

    # BASE
    plot(yquant ~ x, data = DAT)

    # ggplot2
    library(ggplot2)
    p <- ggplot(DAT, aes(x = x, y = yquant)) + geom_point()
    print(p)

Michael Weylandt

PS: I see Josh Wiley just responded, pointing out that your requirements #1 and #2 are incompatible. I've used #1 here.

On Fri, Mar 9, 2012 at 7:37 PM, Michael <comtech.usa at gmail.com> wrote:
> [...]
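If the aim is instead to show a few y quantiles per x bin, rather than to transform every point, one possible sketch follows. It is not from the thread: the seed, the choice of the 25%, 50% and 75% quantiles, and the names `bin`, `SUMM`, `lower`, `mid` and `upper` are assumptions made for illustration. The ggplot2 step runs only if that package is installed.

```r
# Simulated data of the same kind; the seed is an added assumption
set.seed(42)
DAT <- data.frame(x = runif(1000, 0, 20), y = rcauchy(1000)^2)

# Equal-width bins on x (requirement 1)
DAT$bin <- cut(DAT$x, breaks = seq(0, 20, 5), include.lowest = TRUE)

# One row per bin holding the 25%, 50% and 75% quantiles of y
q <- t(sapply(split(DAT$y, DAT$bin), quantile, probs = c(0.25, 0.5, 0.75)))
SUMM <- data.frame(bin   = factor(rownames(q), levels = levels(DAT$bin)),
                   lower = q[, 1],
                   mid   = q[, 2],
                   upper = q[, 3])
print(SUMM)

# Plot the median of each bin with a bar spanning the 25%-75% range
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(SUMM, aes(x = bin, y = mid)) +
    geom_pointrange(aes(ymin = lower, ymax = upper)) +
    labs(x = "x bin", y = "y (median with 25%-75% range)")
  print(p)
}
```

Summarising first and plotting second keeps the binning logic independent of the graphics engine, so the same `SUMM` table could be drawn with base graphics as well.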
On 03/10/2012 11:37 AM, Michael wrote:
> [...]

Hi Michael,

While it is not in ggplot2, a variation on the count.overplot function might do what you want. This function displays counts of closely spaced points rather than the points themselves, but it applies the same area of aggregation across the whole plot. Getting equal x bins is easy, and I assume that you mean equal observations within each bin, not across all bins. If you are stuck, I can probably hack up something from count.overplot.

Jim
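For the other reading of the request, bins that each hold the same number of observations, a rough base-graphics sketch is given below. It does not use count.overplot and is not Jim's code: the simulated data, the seed, the choice of eight bins, and the names `dat`, `n_bins`, `x_mid` and `y_q` are hypothetical.

```r
# Hypothetical data: skewed x so that equal-count bins have unequal widths
set.seed(7)
dat <- data.frame(x = rexp(400), y = rnorm(400))

n_bins <- 8  # assumed number of bins; 400 / 8 = 50 points per bin

# Assign bins by rank of x so every bin gets the same number of points
dat$bin <- ceiling(rank(dat$x, ties.method = "first") / (nrow(dat) / n_bins))

# Position of each bin on the x-axis: the median x inside the bin
x_mid <- as.numeric(tapply(dat$x, dat$bin, median))

# 25%, 50% and 75% quantiles of y per bin (a 3 x n_bins matrix)
y_q <- sapply(split(dat$y, dat$bin), quantile, probs = c(0.25, 0.5, 0.75))

# One line per quantile, plotted against the bin positions
matplot(x_mid, t(y_q), type = "b", pch = 1, lty = c(2, 1, 2), col = 1,
        xlab = "median x within bin", ylab = "y quantiles (25%, 50%, 75%)")
```

Placing each bin at its median x keeps the unequal bin widths visible on the axis, which a plot against the bin index would hide.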