Dear listers,

A quick question about breaks in hist().

The histogram is highly skewed to the right: say the range of the vector is [0, 2], but 95% of the values are squeezed into the interval (0.01, 0.2). My question is: how can I set the breaks so that the histogram looks even?

Thanks in advance,

Leaf
Berton Gunter
2005-Nov-02 18:09 UTC
[R] Visualizing a Data Distribution -- Was: breaks in hist()
Leaf:

An interesting question concerning graphical perception. As you have noted, the choice of bin boundaries in a histogram can have a big effect on how a distribution is perceived.

My $.02 (U.S.): Histograms are a relic of manual data plotting. We have much better alternatives these days that should be used instead, e.g.:

1. (My preference, but probably not consumer-friendly.) Plot the cdf instead (?ecdf).
2. Plot a density estimator (?density ; ?densityplot, the latter in the lattice package).
3. See David Scott's ash package, and perhaps the KernSmooth package also (though density() probably already has anything that you'd need from it).

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box
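As a rough illustration of the first two suggestions above, here is a minimal sketch in base R. The simulated vector x is an assumption made up to mimic the data described in the question (about 95% of values in (0.01, 0.2), the rest up to 2); it is not the poster's data, and the default bandwidth of density() may well need tuning for real data.

```r
# Hypothetical data mimicking the question: range roughly [0, 2],
# with about 95% of the values inside (0.01, 0.2).
set.seed(1)
x <- c(runif(95, 0.01, 0.2), runif(5, 0.2, 2))

# 1. Empirical CDF: there are no bin boundaries to choose at all.
Fn <- ecdf(x)
plot(Fn, main = "Empirical CDF of x")

# 2. Kernel density estimate with the default bandwidth,
#    plus a rug showing where the individual observations fall.
d <- density(x)
plot(d, main = "Kernel density estimate of x")
rug(x)
```

Fn is itself a function, so Fn(0.2) returns the proportion of observations at or below 0.2, which is one way to check how concentrated the data are without drawing anything.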
Hi Leaf,

The word "even" can be interpreted in several ways, but I will give it a shot. If you want to specify the breakpoints to reflect how your data are aggregated, you can use the breaks argument of hist(), i.e.

    x <- c(runif(95, 0, 0.2), runif(5, 0.21, 2))
    hist(x, breaks = seq(0, 2, 0.1), freq = FALSE)  # breakpoints at 0, 0.1, 0.2, ..., 2

or you can suggest a number of cells, i.e.

    hist(x, breaks = 7, freq = FALSE)

Note that a single number is treated only as a suggestion; hist() chooses "pretty" breakpoints close to that count.

You can also add a density line over the histogram using lines():

    lines(density(x, bw = 0.1))

Alternatively, you can use a more basic display such as a stem-and-leaf plot:

    stem(x)

I hope this helps,

Francisco
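If "even" is taken to mean that each bar should hold roughly the same number of observations, one possible approach (an assumption about what was intended, not something stated in the thread) is to pass unequal-width breaks computed from the sample quantiles, or to draw the histogram on a log scale. A sketch with simulated data standing in for the real vector:

```r
# Hypothetical data mimicking the question; replace with the real vector.
set.seed(1)
x <- c(runif(95, 0.01, 0.2), runif(5, 0.2, 2))

# Option A: unequal-width bins whose edges are the sample deciles,
# so every bin receives about the same number of observations.
brk <- quantile(x, probs = seq(0, 1, by = 0.1))
h <- hist(x, breaks = brk, freq = FALSE,
          main = "Decile-based breaks (bar area is proportional to count)")

# Option B: equal-width bins on the log10 scale (requires all x > 0).
h_log <- hist(log10(x), breaks = 10, freq = FALSE,
              main = "Histogram of log10(x)", xlab = "log10(x)")
```

With unequal bin widths, the bars must be drawn as densities (freq = FALSE) so that area, not height, reflects the count; hist() warns if counts are requested with unequal breaks. The counts per bin are available in h$counts for checking.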