hp wan
2013-Jan-21 22:18 UTC
[R] Why using hist when setting the parameter probability=TRUE does not create probability plot?
Hi All,

When carrying out hist(samples,breaks=50,probability=TRUE), the column values are considerably greater than 1, which seems very unreasonable. The plot is attached.

I think the column values of the hist plot should correspond to x$counts/sum(x$counts), where x=hist(samples,breaks=50,probability=TRUE). The data set is fairly large, so uploading it failed. If you need the data, I can email it to you.

Can anyone help me?

Thanks!

Best regards,
Huaping Wan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hist.png
Type: image/png
Size: 3592 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20130122/9a829983/attachment.png>
Duncan Murdoch
2013-Jan-21 23:28 UTC
[R] Why using hist when setting the parameter probability=TRUE does not create probability plot?
On 13-01-21 5:18 PM, hp wan wrote:
> Hi All,
>
> When carrying out hist(samples,breaks=50,probability=TRUE), the column
> values are considerably greater than 1, which seems very unreasonable. The
> plot is attached.
>
> I think the column values of the hist plot should correspond to
> x$counts/sum(x$counts)
> (x=hist(samples,breaks=50,probability=TRUE)). The data set is fairly
> large, so uploading it failed. If you need the data, I can email it to you.
>
> Can anyone help me?

I think you need to reread the documentation. It is a "probability density" plot, not a "probability plot." You need to integrate the values to get probabilities. Probability density functions can be bigger than 1 as long as they integrate to 1.

Duncan Murdoch
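
The point about integrating can be checked numerically. The following is only a minimal sketch: the poster's `samples` were never posted, so simulated data stands in for them, and the names `h`, `widths`, `areas` and `props` are illustrative. It shows that the bar heights on the density scale can exceed 1, while the bar areas (height times bin width) sum to 1 and match x$counts/sum(x$counts).

```r
# Simulated stand-in for the poster's data (an assumption; the real samples were not posted).
set.seed(1)
samples <- rnorm(10000, mean = 0, sd = 0.05)

# Compute the histogram without drawing it.
h <- hist(samples, breaks = 50, plot = FALSE)

# On the density scale each bar height is count / (n * bin width),
# so narrow bins give heights well above 1.
widths <- diff(h$breaks)
max(h$density)

# Bar areas (height * width) are the per-bin probabilities; they sum to 1.
areas <- h$density * widths
sum(areas)

# The areas are the proportions the original post expected to see.
props <- h$counts / sum(h$counts)
all.equal(areas, props)

# The same holds for a theoretical density: a normal density with a small
# standard deviation is greater than 1 at its mode (about 3.99 here).
dnorm(0, mean = 0, sd = 0.1)
```

With wider bins (or data on a larger scale) the same heights would fall below 1, which is why the density scale cannot be read as a probability without multiplying by the bin width.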
Mark Leeds
2013-Jan-22 00:39 UTC
[R] Why using hist when setting the parameter probability=TRUE does not create probability plot?
Let me look at it, but it's probably best to send this to the whole list because there are many people on it far more knowledgeable than I am. I'm cc'ing the list and hope you don't mind. My fault for replying privately initially.

On Mon, Jan 21, 2013 at 7:36 PM, hp wan <huaping.wan@gmail.com> wrote:
> Thanks for your reply!
>
> breaks=c(-1.55,-1.50,-1.45,-1.40,-1.35,-1.30,-1.25,-1.20,-1.15,-1.10,-1.05,-1.00,-0.95,-0.90,-0.85,-0.80,-0.75,-0.70,-0.65,-0.60,-0.55,-0.50,-0.45,-0.40,-0.35,-0.30,-0.25,-0.20,-0.15,-0.10,-0.05,0.00,0.05,0.10,0.15,0.20,0.25,0.30,0.35,0.40,0.45,0.50,0.55)
> counts=c(287,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,212,2624,2918,0,0,0,75,36317,4963,0,0,2462,0,0,0,0,0,142)
> percentage=counts/sum(counts)
> barplot(percentage,xlab=breaks)
>
> The horizontal axis (that is, xlab) looks very ugly. I would like it to look
> like the x axis of hist, that is, with the x axis corresponding to the breaks.
>
> Even after reading ?barplot, I have no idea how to implement this.
>
> 2013/1/22 Mark Leeds <markleeds2@gmail.com>
>
>> I'm not sure that I understand, but can't you just take the data and
>> divide it by the sum of the data and plot that?
>>
>> On Mon, Jan 21, 2013 at 6:36 PM, hp wan <huaping.wan@gmail.com> wrote:
>>
>>> Thanks for your reply.
>>>
>>> If I set probability = FALSE, the column values correspond to the
>>> frequencies (the number of values falling in each interval). I want the
>>> column values to be proportions, that is x$counts/sum(x$counts).
>>>
>>> 2013/1/22 Mark Leeds <markleeds2@gmail.com>
>>>
>>>> Hi: the density integrates to 1, but the actual height of the density at
>>>> each point is not necessarily less than 1. For what you want, you should
>>>> be using probability = FALSE.
>>>>
>>>> You can do dnorm(x=0, mean=0, sd=0.1) to see this.
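
One possible way to get what is asked for above (bars whose heights are x$counts/sum(x$counts), drawn with the usual hist x axis) is sketched below. It is not a method proposed by anyone in the thread. It assumes simulated data in place of the unposted `samples`, and it relies on the fact that plotting a histogram object with freq = TRUE draws whatever is stored in its `counts` component; the y-axis label and title are illustrative choices.

```r
# Simulated stand-in for the poster's data (an assumption; the real samples were not posted).
set.seed(1)
samples <- rnorm(10000, mean = 0, sd = 0.05)

# Compute the histogram object without drawing it.
h <- hist(samples, breaks = 50, plot = FALSE)

# Overwrite the counts with proportions, so each bar height is the share
# of observations in that bin and all heights sum to 1.
h$counts <- h$counts / sum(h$counts)

# Plot the modified object; freq = TRUE makes plot() use h$counts as heights,
# and the x axis is laid out from h$breaks as in an ordinary histogram.
plot(h, freq = TRUE, ylab = "Proportion",
     main = "Histogram of samples (proportions)")
```

If only pre-binned breaks and counts are available, as in the message above, barplot(percentage, names.arg = ..., space = 0) with bin midpoints as labels is another option, although its x axis is categorical rather than numeric.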