Alex Davies
2006-Apr-05 19:30 UTC
[R] hist function: freq=FALSE for standardised histograms
Dear All,

I am an undergraduate using R for the first time. It seems like an excellent program and one that I look forward to using a lot over the next few years, but I have hit a very basic problem that I can't solve.

I want to produce a standardised histogram, i.e. one where the area under the graph is equal to 1. I look at the manual for the histogram function and find this:

    freq: logical; if 'TRUE', the histogram graphic is a representation
          of frequencies, the 'counts' component of the result; if
          'FALSE', probability densities, component 'density', are
          plotted (so that the histogram has a total area of one).
          Defaults to 'TRUE' _iff_ 'breaks' are equidistant (and
          'probability' is not specified).

I therefore expect that the following command:

> h <- hist(StockReturns, freq=FALSE)

where StockReturns has the following data in it:

> sourcedata$StockReturns
 [1] -0.006983  0.111565  0.053782  0.027966  0.068956  0.165424 -0.022133
 [8] -0.001910  0.052174  0.072589 -0.023002  0.000521 -0.015688  0.148459
[15]  0.054111  0.141044  0.096686 -0.012256 -0.030397  0.039365  0.021407
[22] -0.175750  0.053901 -0.095730  0.129717  0.333333  0.061563  0.085052
[29]  0.072295 -0.008500  0.100000  0.020000 -0.199763  0.081856  0.013636
[36]  0.007812  0.038647 -0.026945  0.037965 -0.079889  0.056234 -0.083333
[43] -0.012792  0.131711  0.015996  0.008149  0.104568  0.004046 -0.027750
[50]  0.050802  0.045714  0.092327 -0.017857  0.022574  0.083333  0.051366
[57]  0.004215  0.083228  0.046803  0.021335  0.023797  0.094891  0.036541
[64]  0.016423 -0.126365  0.034219  0.098330  0.079292 -0.009901  0.021559
[71] -0.039414  0.114286  0.101856 -0.010452  0.111111  0.097274  0.104843
[78]  0.144439  0.021868  0.106667  0.081250  0.002097  0.073302  0.087889
[85] -0.145165  0.014592  0.035000  0.131711 -0.126937  0.133989

would result in a graph that has an area equal to 1.000. However, it does not - it produces frequency densities, not standardized frequency densities.
Can someone point me in the right direction here - I know I am being fantastically thick but can't find out how to do such a simple operation!

My complete set of commands looks like this:

> sourcedata <- read.table("c:/data.dat",header=T)
> attach(sourcedata)
> h <- hist(StockReturns, col='red', labels=TRUE, ylab="Frequency Density",probability=TRUE)

where c:\data.dat is a file with the numbers above in it, one per line, and the first line containing the string "StockReturns".

Many thanks,

Alex Davies
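A minimal sketch of how the total area could be checked for the posted data. It assumes the 90 values are typed in directly instead of being read from c:/data.dat, and the names `returns`, `widths` and `area` are illustrative, not from the original post. The bar heights of a density histogram are not probabilities themselves; each bar's area (bin width times density) is, so heights can exceed 1 when the bins are narrow.

```r
# The 90 values from the post, entered directly (hypothetical name: returns)
returns <- c(
  -0.006983,  0.111565,  0.053782,  0.027966,  0.068956,  0.165424, -0.022133,
  -0.001910,  0.052174,  0.072589, -0.023002,  0.000521, -0.015688,  0.148459,
   0.054111,  0.141044,  0.096686, -0.012256, -0.030397,  0.039365,  0.021407,
  -0.175750,  0.053901, -0.095730,  0.129717,  0.333333,  0.061563,  0.085052,
   0.072295, -0.008500,  0.100000,  0.020000, -0.199763,  0.081856,  0.013636,
   0.007812,  0.038647, -0.026945,  0.037965, -0.079889,  0.056234, -0.083333,
  -0.012792,  0.131711,  0.015996,  0.008149,  0.104568,  0.004046, -0.027750,
   0.050802,  0.045714,  0.092327, -0.017857,  0.022574,  0.083333,  0.051366,
   0.004215,  0.083228,  0.046803,  0.021335,  0.023797,  0.094891,  0.036541,
   0.016423, -0.126365,  0.034219,  0.098330,  0.079292, -0.009901,  0.021559,
  -0.039414,  0.114286,  0.101856, -0.010452,  0.111111,  0.097274,  0.104843,
   0.144439,  0.021868,  0.106667,  0.081250,  0.002097,  0.073302,  0.087889,
  -0.145165,  0.014592,  0.035000,  0.131711, -0.126937,  0.133989
)

# plot = FALSE computes the bins and densities without drawing anything
h <- hist(returns, plot = FALSE)

# Area of each bar is bin width times density; the areas should sum to 1
widths <- diff(h$breaks)
area <- sum(widths * h$density)
print(all.equal(area, 1))

# The tallest bar height, for comparison with the y-axis of the plot
print(max(h$density))
```

Comparing `max(h$density)` with `widths` may help explain a y-axis that looks unlike a probability scale even though the total area is 1.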
Marco Geraci
2006-Apr-05 20:16 UTC
[R] hist function: freq=FALSE for standardised histograms
Hi,

how did you evaluate the total area? Here is a simple example

###
set.seed(100)
x <- rnorm(100)
x.h <- hist(x, freq=F, plot=F)

> x.h
$breaks
 [1] -2.5 -2.0 -1.5 -1.0 -0.5  0.0  0.5  1.0  1.5  2.0  2.5  3.0

$counts
 [1]  3  4  9 14 22 20 13  7  5  2  1

$intensities
 [1] 0.05999999 0.08000000 0.18000000 0.28000000 0.44000000 0.40000000
 [7] 0.26000000 0.14000000 0.10000000 0.04000000 0.02000000

$density
 [1] 0.05999999 0.08000000 0.18000000 0.28000000 0.44000000 0.40000000
 [7] 0.26000000 0.14000000 0.10000000 0.04000000 0.02000000

$mids
 [1] -2.25 -1.75 -1.25 -0.75 -0.25  0.25  0.75  1.25  1.75  2.25  2.75

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

> sum(diff(x.h$breaks)*x.h$density)
[1] 1

# Also, you can verify
> diff(x.h$breaks)*x.h$density*100
 [1]  2.999999  4.000000  9.000000 14.000000 22.000000 20.000000 13.000000
 [8]  7.000000  5.000000  2.000000  1.000000

HTH

Marco
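The last check in Marco's reply relies on the relationship density = counts / (n * bin width). A minimal sketch of that relationship, reusing the same seed and sample size; the names `n`, `widths` and `manual_density` are illustrative, and the loose tolerance is an assumption to allow for small rounding differences between R versions.

```r
set.seed(100)
x <- rnorm(100)

# Compute the histogram object without plotting it
x.h <- hist(x, plot = FALSE)

n <- length(x)                # number of observations
widths <- diff(x.h$breaks)    # width of each bin

# Rebuild the densities by hand from the counts
manual_density <- x.h$counts / (n * widths)

# Should agree with the 'density' component up to rounding
print(all.equal(manual_density, x.h$density, tolerance = 1e-6))

# Every observation falls in exactly one bin, so the counts sum to n
print(sum(x.h$counts) == n)
```

Multiplying each density by its bin width and by n recovers the counts, which is why the areas of the bars sum to 1.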