Gregor GORJANC <gregor.gorjanc at bfro.uni-lj.si> writes:

> Hello!
>
> Up to now I have been using hist() to display distributions.
> However, I noticed strange numbers on the y (vertical) axis when I used
> the probability = T or freq = F option. I thought it was a bug, searched
> the R bug-tracking system, and found some posts on the matter. Brian Ripley
> replied to one of them that one should look at truehist() for that. OK, I
> can use truehist() if I want to see the ratios or probabilities, but
> what then is the "density or probability" in hist()?
...
> truehist(mydata) # looks OK

And truehist(mydata, h=.5)?

It is a density estimate. The sum of the bar _areas_ should be 1.

> sum(x$intensities * .5)
[1] 0.9999998

-- 
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918
FAX: (+45) 35327907
(p.dalgaard at biostat.ku.dk)
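The point about bar areas can be checked by hand. The following is a minimal sketch, assuming the data from the original post and breaks fixed at steps of 0.5 so that the bins match the output shown in the thread; `manual_density`, `n` and `widths` are names introduced here for illustration only. It uses `$density` rather than `$intensities`, on the assumption that the latter was only an older name for the same values.

```r
# Data from the original post: 22 ones plus 2, 3, 4, 5
mydata <- c(rep(1, 22), 2, 3, 4, 5)

# Fix the breaks explicitly so the bins match those shown in the thread
h <- hist(mydata, breaks = seq(1, 5, by = 0.5), plot = FALSE)

n      <- sum(h$counts)   # 26 observations
widths <- diff(h$breaks)  # every bin is 0.5 wide

# Density = proportion of observations in the bin, per unit of x
manual_density <- h$counts / (n * widths)

all.equal(manual_density, h$density)  # should be TRUE
sum(h$density)                        # 2, because each bin is only 0.5 wide
sum(h$density * widths)               # 1: the bar areas sum to one
```

So a bar height above 1 is not an error: the first bin holds 22/26 of the data in an interval only 0.5 wide, which gives a height of about 1.69.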
Hello!

Up to now I have been using hist() to display distributions. However, I noticed strange numbers on the y (vertical) axis when I used the probability = T or freq = F option. I thought it was a bug, searched the R bug-tracking system, and found some posts on the matter. Brian Ripley replied to one of them that one should look at truehist() for that. OK, I can use truehist() if I want to see the ratios or probabilities, but what then is the "density or probability" in hist()?

For example:

# some data
mydata <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,3,4,5)

# histogram with frequencies
hist(mydata)

# histogram with ratios or probabilities
hist(mydata, freq = F)
# what are the values on the vertical axis?

# let's take a look at the values behind the plot
x <- hist(mydata, freq = F, plot = F); x
$breaks
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

$counts
[1] 22  1  0  1  0  1  0  1

$intensities
[1] 1.69230735 0.07692308 0.00000000 0.07692308 0.00000000 0.07692308 0.00000000
[8] 0.07692308

$density
[1] 1.69230735 0.07692308 0.00000000 0.07692308 0.00000000 0.07692308 0.00000000
[8] 0.07692308

$mids
[1] 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75

$xname
[1] "mydata"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

# HOW are these intensities and density values calculated?
# What do they actually represent?

# MASS package
library(MASS)

# again a histogram, with prob = T by default
truehist(mydata) # looks OK

-- 
Lep pozdrav / With regards / Con respeto,
    Gregor GORJANC

---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty       URI: http://www.bfro.uni-lj.si
Zootechnical Department    mail: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3                  tel: +386 (0)1 72 17 861
SI-1230 Domzale            fax: +386 (0)1 72 41 005
Slovenia
---------------------------------------------------------------
Peter, thanks for the response.

So the results from hist(mydata) and truehist(mydata, h = .5) are the same. OK, but the densities or intensities, for the case that I gave, do not sum to 1 but to 2. Look below. I have an example where these density values go up to 4 and sum to 5 (I have attached the PDF of that plot). This is really frustrating for me. What actually are these intensities and densities, and how are they calculated? Why are they the same?

mydata <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,3,4,5)

# histogram with frequencies
hist(mydata)

# histogram with ratios or probabilities
hist(mydata, freq = F)
# what are the values on the vertical axis?

# let's take a look at the values behind the plot
x <- hist(mydata, freq = F, plot = F); x

# Sum the values
R> sum(x$intensities)
[1] 2.000000
R> sum(x$density)
[1] 2.000000

When I think of a histogram (the kind without frequencies), I think of gathering the records into "some classes" and then dividing the number of records in each "class" by the total number of records. This is not what hist() does.

Sorry for being a pain in the ..., but this is really weird. Apart from that, the R team is really doing a great job. Thanks for such a good tool!

-- 
Lep pozdrav / With regards / Con respeto,
    Gregor GORJANC

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplots.pdf
Type: application/pdf
Size: 4018 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-help/attachments/20041129/affffeb4/Rplots.pdf
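The histogram described in the follow-up, where each class count is divided by the total number of records, can be obtained from the object that hist() returns. This is a sketch under stated assumptions rather than an official recipe: `rel_freq` and `h_rel` are illustrative names, the breaks are fixed at steps of 0.5 to match the thread, and overwriting `$counts` in a copy is just one simple way to get plot() to draw proportions.

```r
mydata <- c(rep(1, 22), 2, 3, 4, 5)
h <- hist(mydata, breaks = seq(1, 5, by = 0.5), plot = FALSE)

# Relative frequency of each class: class count divided by total count
rel_freq <- h$counts / sum(h$counts)
round(rel_freq, 3)
sum(rel_freq)  # 1: proportions sum to one regardless of bin width

# The same numbers are the bar areas of the density histogram
all.equal(rel_freq, h$density * diff(h$breaks))  # should be TRUE

# To draw them, store the proportions in a copy of the histogram object
# and plot it as a "frequency" histogram (assumption: equal-width bins)
h_rel <- h
h_rel$counts <- rel_freq
plot(h_rel, freq = TRUE, ylab = "Relative frequency", main = "mydata")
```

With unequal bin widths the two views differ in shape, and bar heights that ignore the width can mislead, which is presumably why hist() reports a density instead.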