I'm working with a lot of data right now, but I'm new to R, and not very good with it, hence my request for help. What type of graph could I use to straighten out things like...

http://r.789695.n4.nabble.com/file/n3711389/Untitled.png

...this?

I want to see general frequencies. Should I use something like a 3D histogram, or is there an easier way like, say, shading? I'm sure these are both possible, but I don't know which is easiest or how to implement either of them.

Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Clean-up-a-scatterplot-with-too-much-data-tp3711389p3711389.html
Sent from the R help mailing list archive at Nabble.com.
Hi,

One solution could be to subsample the data, or jitter the data (give it some random noise). A more elegant solution, imho, is to use a 2D histogram. (A 3D histogram is not a good alternative; I think it is much better to use color instead of a third dimension.) I don't think this is easy to make using the standard plot system in R, but ggplot2 handles it nicely. This would mean learning ggplot2, but I would highly recommend that anyway :).

An example of the plot I have in mind can be seen at:

http://had.co.nz/ggplot2/stat_bin2d.html

Just scroll down a bit for some examples.

cheers,
Paul

On 08/02/2011 05:26 AM, DimmestLemming wrote:
> I'm working with a lot of data right now, but I'm new to R, and not very good
> with it, hence my request for help. What type of graph could I use to
> straighten out things like...
>
> http://r.789695.n4.nabble.com/file/n3711389/Untitled.png
>
> ...this?

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
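For reference, a minimal sketch of the 2D-histogram approach Paul describes. The poster's data are not available, so the data frame below is simulated, and the names `games`, `minutes` and `kills_per_min` are assumptions made up for illustration. It uses `geom_bin2d()`, the geom counterpart of the `stat_bin2d` page linked above; the bin count and colors are arbitrary choices.

```r
library(ggplot2)

set.seed(1)                                        # reproducible fake data
n <- 50000
games <- data.frame(
  minutes       = rexp(n, rate = 1 / 300),         # hypothetical: minutes played
  kills_per_min = rgamma(n, shape = 4, rate = 8)   # hypothetical: kills per minute
)

# Count the points falling in each cell of a 60 x 60 grid and
# map the count to fill color instead of drawing every point.
p <- ggplot(games, aes(x = minutes, y = kills_per_min)) +
  geom_bin2d(bins = 60) +
  scale_fill_gradient(low = "grey90", high = "darkblue")

print(p)
```

Fewer bins give a smoother but coarser picture; more bins show more detail but leave many cells nearly empty.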
DimmestLemming wrote:
> I'm working with a lot of data right now, but I'm new to R, and not very
> good with it, hence my request for help. What type of graph could I use to
> straighten out things like...
>
> http://r.789695.n4.nabble.com/file/n3711389/Untitled.png

Three nice alternatives:

  example(smoothScatter)
  example(sunflowerplot)
  library(hexbin)
  example(hexbinplot)

(And do remove the outliers before plotting.)

--
Karl Ove Hufthammer
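Beyond running the built-in examples, here is a rough sketch of how the three functions Karl lists might be called on your own vectors. The data are simulated and the names `x`, `y` and `d` are placeholders, not the poster's variables. The hexbin part is skipped if that package is not installed.

```r
set.seed(1)                       # reproducible fake data
x <- rnorm(20000)
y <- x + rnorm(20000)
d <- data.frame(x = x, y = y)

# 1. Smoothed density shading (graphics package; needs KernSmooth,
#    which ships with standard R installations).
smoothScatter(d$x, d$y, nbin = 128)

# 2. Sunflower plot: petals count coincident points, so the
#    continuous values are rounded first to create coincidences.
sunflowerplot(round(d$x, 1), round(d$y, 1))

# 3. Hexagonal binning, only if the hexbin package is available.
if (requireNamespace("hexbin", quietly = TRUE)) {
  library(hexbin)
  print(hexbinplot(y ~ x, data = d, xbins = 40))
}
```

With tens of thousands of points, the sunflower plot tends to get crowded itself, so smoothScatter or hexbin is probably the better starting point for data of this size.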
In addition to the other responses (all of which I liked), a couple of other alternatives to consider are 2D density plots (see ?kde2d in the MASS package, for example) or geom_tile() in the ggplot2 package, which you can think of as a 3D histogram projected to 2D with color corresponding to (relative) frequency, as suggested by Paul Hiemstra. geom_tile() gives a rectangular-grid counterpart of a hexbin plot, but I would start with the hexbin myself.

I echo Karl Ove Hufthammer's comment: make sure you remove the outliers first, especially that one in the upper left corner :)

After looking at your plot, here's my question: why would you plot kills/minute vs. minutes played? Doesn't the first variable render the second one moot? Wouldn't kills vs. minutes played be a more relevant (scatter)plot? If you have information on the skill level of the players, you could incorporate that into the plot as well. There are several nice ways to go if this is the case. If kills/minute is the more appropriate measure, a univariate density plot or a histogram would make sense.

HTH,
Dennis
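A small sketch of the two ideas in Dennis's reply: a 2D density estimate with kde2d() for kills vs. minutes played, and a univariate look at kills/minute. Everything about the data is assumed: the vectors `minutes`, `kills` and `kpm` are simulated stand-ins, and the 99.9% quantile cut used to drop outliers is an arbitrary illustration, not a recommended rule.

```r
library(MASS)                                   # for kde2d()

set.seed(1)                                     # reproducible fake data
n <- 20000
minutes <- rexp(n, rate = 1 / 300)              # hypothetical: minutes played
kills   <- rpois(n, lambda = 0.5 * minutes)     # hypothetical: total kills
kpm     <- kills / minutes                      # kills per minute

# 2D kernel density estimate on a 100 x 100 grid, drawn as a
# shaded image with contour lines on top.
dens <- kde2d(minutes, kills, n = 100)
image(dens, xlab = "minutes played", ylab = "kills")
contour(dens, add = TRUE)

# Univariate view of kills/minute, after dropping the most extreme
# values (arbitrary cutoff, for illustration only).
keep <- kpm <= quantile(kpm, 0.999)
hist(kpm[keep], breaks = 50, main = "kills per minute", xlab = "kills/minute")
plot(density(kpm[keep]), main = "kills per minute (density)")
```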