Hi,
This is a very basic question, but I would like to understand hist(). I thought that hist( , freq=FALSE) should provide the relative frequencies (probabilities), and so they should sum to 1, however:

set.seed(2)
ah <- hist(rnorm(100), freq=F)
sum(ah$intensities)
[1] 2

set.seed(2)
bh <- hist(rlnorm(100), freq=F)
sum(bh$intensities)
[1] 0.4999996

I'm getting similar figures with truehist() in MASS. So I suppose I'm misunderstanding hist(). Any help?

Thanks
Juli
A histogram has area one, not sum one. From ?truehist
Details:
This plots a true histogram, a density estimate of total area 1.
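
The "area one, not sum one" point can be checked directly. The following is only a minimal sketch, assuming a current R in which the histogram object exposes the bar heights as `density` (the `intensities` name used in the question is an older alias for the same quantity); the variable names `x`, `h` and `widths` are illustrative and not from the thread.

```r
# Sketch: bar heights do not sum to 1, but bar areas do.
set.seed(2)
x <- rnorm(100)

# plot = FALSE returns the histogram object without drawing it;
# the density component is computed regardless of how it is plotted.
h <- hist(x, plot = FALSE)

# Width of each bin, taken from consecutive breakpoints.
widths <- diff(h$breaks)

# Sum of heights: depends on the bin width, so generally not 1.
sum(h$density)

# Sum of height * width: the total area under the histogram.
total_area <- sum(h$density * widths)
isTRUE(all.equal(total_area, 1))
```

With equal-width bins the sum of the heights is simply 1 divided by the bin width, which would explain why the two sums in the question differ so much.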
On Sat, 8 Mar 2003, juli g. pausas wrote:
> Hi,
> This is a very basic question, but I would like to understand hist(). I
> thought that the hist( , freq=FALSE) should provide the relative
> frequencies (probabilities), and so they should sum 1, however:
>
> set.seed(2)
> ah <- hist(rnorm(100), freq=F)
> sum(ah$intensities)
> [1] 2
>
> set.seed(2)
> bh <- hist(rlnorm(100), freq=F)
> sum(bh$intensities)
> [1] 0.4999996
>
> I'm getting similar figures with truehist() in MASS.
> So I suppose I'm misunderstanding hist(). Any help?
>
> Thanks
>
> Juli
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
On Sat, 8 Mar 2003, juli g. pausas wrote:
> Hi,
> This is a very basic question, but I would like to understand hist(). I
> thought that the hist( , freq=FALSE) should provide the relative
> frequencies (probabilities), and so they should sum 1, however:

No, it provides probability *densities*, which *integrate* to 1. That is, the height of the bar is the relative frequency divided by the width of the interval.

This is important because
- it means histograms with different cutpoints are comparable
- it means histograms are comparable with mathematical densities such as a Normal, and with kernel density estimates
- it means that the bars don't have to have the same width.

If histograms plotted relative frequencies there would be no need to distinguish them from barplots.

-thomas
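
The relation described above (bar height = relative frequency / interval width) can be illustrated with unequal-width bins. This is a rough sketch rather than anything from the original posts: the cutpoints in `breaks` are hypothetical, chosen only so that the bins differ in width and cover the data, and the `density` component name assumes a current R.

```r
# Sketch: density = relative frequency / bin width, with unequal bins.
set.seed(2)
x <- rnorm(100)

# Hypothetical unequal-width cutpoints; the outer ones are widened
# if needed so that every observation falls inside some bin.
breaks <- c(min(-4, min(x)), -1, -0.5, 0, 0.25, 1, max(4, max(x)))

h <- hist(x, breaks = breaks, plot = FALSE)

# Relative frequencies (proportions of observations per bin) sum to 1.
rel_freq <- h$counts / sum(h$counts)
widths <- diff(h$breaks)

# The heights reported by hist() match relative frequency / width.
heights_match <- isTRUE(all.equal(h$density, rel_freq / widths))

sum(rel_freq)             # proportions add up to 1
sum(h$density * widths)   # areas add up to 1
```

Recovering relative frequencies from a histogram object is therefore a matter of multiplying each height by its bin width, or of using `counts / sum(counts)` directly.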