ierickson at starmine.com
2008-May-30 16:15 UTC
[Rd] Quartile summary generated by density() is misleading (PR#11541)
Full_Name: Ian Erickson
Version: 2.5.1 (2007-06-27)
OS: x86_64-redhat-linux-gnu
Submission from: (NULL) (204.16.153.138)

The quartiles printed in the summary of a density() result should, intuitively, be the quartiles of the distribution being estimated, i.e. the points at which the cumulative density reaches 25% and 75%. What is calculated instead is simply the quartiles of the x coordinates used to plot the estimated curve.

Example: running density(rnorm(100000)) reports a 1st quartile of about -2.2 and a 3rd quartile of about +2.2. However, graphing the density with plot(density(rnorm(100000))) shows what would be expected: at -2.2 the cumulative density is only a few percent, not 25%.

The quartile numbers currently reported are trivially just the quarter points of the range of the data, because the plotting points are equally spaced across that range.

Let me know if you have questions.

Thanks.
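A minimal sketch of the difference, for anyone who wants to reproduce it. It assumes the default density() settings; the names grid_q and dens_q, the fixed seed, and the rectangle-rule accumulation of the curve are choices made here for illustration and are not part of density() itself.

```r
set.seed(1)                      # fixed seed so the run is repeatable
x <- rnorm(100000)
d <- density(x)                  # d$x: equally spaced grid, d$y: estimated density

# What the printed summary reports for x: quartiles of the grid coordinates.
# These depend only on the range of the grid, not on where the mass lies.
grid_q <- quantile(d$x, c(0.25, 0.75), names = FALSE)

# Quartiles of the estimated distribution: accumulate the density over the
# grid (rectangle rule), normalise to 1, and take the first grid point at
# which the cumulative value reaches each target probability.
cdf <- cumsum(d$y) * (d$x[2] - d$x[1])
cdf <- cdf / cdf[length(cdf)]
dens_q <- sapply(c(0.25, 0.75), function(p) d$x[which(cdf >= p)[1]])

print(grid_q)                    # quarter points of the grid range
print(dens_q)                    # near qnorm(c(0.25, 0.75))
print(quantile(x, c(0.25, 0.75), names = FALSE))  # sample quartiles, for reference
```

The second pair of numbers is accurate only to about one grid step; a finer grid (a larger n in density()) would tighten it.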