Gregor GORJANC <gregor.gorjanc at bfro.uni-lj.si> writes:

> Hello!
>
> Up to now I have been using hist() to display the distributions.
> However, I noticed strange numbers on the y (vertical) axis when I used
> the probability = TRUE or freq = FALSE option. I thought it was a bug, so
> I searched the R bug-tracking system and found some posts on the matter.
> Brian Ripley responded to one of them, saying that one should look at
> truehist() for that. OK, I can use truehist() if I want to see the ratios
> or probabilities, but what then is the "density or probability" in hist()?
...
> truehist(mydata) # looks OK

And truehist(mydata, h=.5)?

It is a density estimate. The sum of the bar _areas_ should be 1.

> sum(x$intensities * .5)
[1] 0.9999998

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
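The point about bar areas can be checked without hard-coding the bin width. The following is only a sketch, assuming the same data as in the thread; the names widths, bar_areas and total_area are made up for illustration. It uses only the density component of the histogram object.

```r
# Data from the thread: 22 ones followed by 2, 3, 4, 5
mydata <- c(rep(1, 22), 2, 3, 4, 5)

# Compute the histogram without drawing it
x <- hist(mydata, plot = FALSE)

# Width of each bar, taken from the break points
widths <- diff(x$breaks)

# Area of each bar = height (density) times width
bar_areas <- x$density * widths

# The areas, not the heights, add up to 1
total_area <- sum(bar_areas)
print(total_area)
```

Because the widths come from diff(x$breaks), the same check should also hold when the bins are not all the same width.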
Hello!
Up to now I have been using hist() to display the distributions.
However, I noticed strange numbers on the y (vertical) axis when I used
the probability = TRUE or freq = FALSE option. I thought it was a bug, so
I searched the R bug-tracking system and found some posts on the matter.
Brian Ripley responded to one of them, saying that one should look at
truehist() for that. OK, I can use truehist() if I want to see the ratios
or probabilities, but what then is the "density or probability" in hist()?
For example:
# some data
mydata <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,3,4,5)
# histogram with frequencies
hist(mydata)
# histogram with ratios or probabilities
hist(mydata, freq = FALSE) # what are those values on the vertical axis?
# let's take a look at the values behind it
x <- hist(mydata, freq = FALSE, plot = FALSE); x
$breaks
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
$counts
[1] 22 1 0 1 0 1 0 1
$intensities
[1] 1.69230735 0.07692308 0.00000000 0.07692308 0.00000000 0.07692308 0.00000000
[8] 0.07692308
$density
[1] 1.69230735 0.07692308 0.00000000 0.07692308 0.00000000 0.07692308 0.00000000
[8] 0.07692308
$mids
[1] 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75
$xname
[1] "mydata"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
# How are these intensities and density values calculated? What do they
# actually represent?
# MASS package
library(MASS)
# again a histogram, with prob = TRUE by default
truehist(mydata) # looks OK
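As far as I understand hist(), each density value is the bin count divided by (total number of observations times bin width). The following is a sketch under that assumption; manual_density is a name made up for illustration, and the breaks are given explicitly so that they match the output shown above.

```r
# Data from the message above
mydata <- c(rep(1, 22), 2, 3, 4, 5)

# Same breaks as in the printed output: 1.0, 1.5, ..., 5.0
x <- hist(mydata, breaks = seq(1, 5, by = 0.5), plot = FALSE)

n <- sum(x$counts)        # total number of observations (26)
widths <- diff(x$breaks)  # bin widths (all 0.5 here)

# Density by hand: count / (n * width)
# First bar: 22 / (26 * 0.5), about 1.692
manual_density <- x$counts / (n * widths)
print(manual_density)

# Compare with what hist() stored
print(all.equal(manual_density, x$density))
```

So a density above 1 is not an error here: the first bar holds 22 of the 26 values in a bin only 0.5 wide.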
--
Lep pozdrav / With regards / Con respeto,
Gregor GORJANC
---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty URI: http://www.bfro.uni-lj.si
Zootechnical Department mail: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3 tel: +386 (0)1 72 17 861
SI-1230 Domzale fax: +386 (0)1 72 41 005
Slovenia
Peter, thanks for the response.
So the results from hist(mydata) and truehist(mydata, h = .5) are the
same. OK, but the densities or intensities for the case that I gave
don't sum to 1 but to 2. See below. I have an example where these
density values go up to 4 and sum to 5 (I have attached the PDF of
that plot).
This is really frustrating for me. What actually are these intensities
and densities, and how are they calculated? Why are they the same?
mydata <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,3,4,5)
# histogram with frequencies
hist(mydata)
# histogram with ratios or probabilities
hist(mydata, freq = FALSE) # what are those values on the vertical axis?
# let's take a look at the values behind it
x <- hist(mydata, freq = FALSE, plot = FALSE); x
# Sum the values
R > sum(x$intensities)
[1] 2.000000
R > sum(x$density)
[1] 2.000000
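With equal-width bins, the plain sum of the density values should come out as 1 divided by the bin width, which would give 2 for a width of 0.5. The following is a sketch, assuming the data above; density_sum and n_bins are names made up for illustration. A sum of 5 would be consistent with bins of width 0.2, although the data behind the attached plot are not shown, so that part is a guess.

```r
# Data from the message above
mydata <- c(rep(1, 22), 2, 3, 4, 5)

# Hypothetical helper: sum of the density values for n_bins
# equal-width bins spanning the range of the data
density_sum <- function(data, n_bins) {
  breaks <- seq(min(data), max(data), length.out = n_bins + 1)
  h <- hist(data, breaks = breaks, plot = FALSE)
  sum(h$density)
}

# The range is 4, so 4, 8 and 20 bins give widths 1, 0.5 and 0.2
sums <- sapply(c(4, 8, 20), function(k) density_sum(mydata, k))
print(sums)
```

Only with bins of width 1 do the heights themselves add up to 1.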
When I think of a histogram (one not showing frequencies), I think of
gathering records into "some classes" and then dividing the number of
records in each "class" by the total number of records. This is not
the case in hist().
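The proportions described here can still be recovered from the histogram object. The following is a sketch, assuming the data above; rel_freq is a name made up for illustration.

```r
# Data from the message above
mydata <- c(rep(1, 22), 2, 3, 4, 5)

# Compute the histogram without drawing it
h <- hist(mydata, plot = FALSE)

# Share of records in each class: count / total number of records
rel_freq <- h$counts / sum(h$counts)
print(rel_freq)

# These shares add up to 1
print(sum(rel_freq))

# The same shares are density * bin width
print(all.equal(rel_freq, h$density * diff(h$breaks)))
```

One way to draw them would be to copy h, put rel_freq in place of its counts component, and plot the copy with freq = TRUE, so that the vertical axis shows shares instead of densities.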
Sorry for being a pain in the ..., but this is really weird. Apart from
that, the R team is doing a really great job. Thanks for such a good tool!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplots.pdf
Type: application/pdf
Size: 4018 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-help/attachments/20041129/affffeb4/Rplots.pdf