Dear list,

I am calculating the 95th percentile of a set of values with R and with SPSS.

In R:

> normal200<-rnorm(200,0,1)
> qnorm(0.95,mean=mean(normal200),sd=sd(normal200),lower.tail =TRUE)
[1] 1.84191

In SPSS, if I use the same 200 values and select Analyze -> Descriptive Statistics -> Frequencies and, under "Statistics", type in '95' under Percentiles, then the output is

Percentile 95 1.9720

I think the main difference is that SPSS only calculates critical values within the range of values in the data, while R fits a normal and calculates the critical value using the fitted distribution. This is more obvious if the size of the data is much lower:

> normal20
 [1]  0.27549020  0.87994304 -0.23737370  0.04565484 -1.10207183 -0.68035949  0.01698773 -2.15812038  0.26296513  0.21873981  0.03266598 -0.01318572
[13]  0.83492830  0.54652613  0.73993948 -0.31937556 -0.03060194 -0.96028421  0.27745331 -1.01292410
> max(normal20)
[1] 0.879943
> qnorm(0.95,mean=mean(normal20),sd=sd(normal20),lower.tail =TRUE)
[1] 1.118065

And in SPSS

Percentile 95 0.8777

Can anyone comment on my statement? And thus, is R more exact? The differences are quite large and this is important for setting thresholds.

Cheers,

Dave
On Thu, Nov 8, 2012 at 12:17 PM, David A. <dasolexa at hotmail.com> wrote:

> In R:
>
>> normal200<-rnorm(200,0,1)

You forgot set.seed(310366) so we can reproduce your random numbers exactly.

> I think the main difference is that SPSS only calculates critical values within the range of values in the data, while R fits a normal and calculates the critical value using the fitted distribution. This is more obvious if the size of the data is much lower:

Is SPSS just estimating the 95th percentile from your data? Regardless of any distribution? Like R's quantile(normal20,0.95)?

I get much closer answers to your SPSS using R for that, and I suspect one of the 9 quantile algorithms will give an exact answer (unless SPSS uses something else entirely).

Whereas qnorm in R is giving you the 95th percentile of a Normal distribution with a given mean and sd.

Barry
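Barry's suggestion can be checked directly on the 20 values from the original post. The sketch below is an illustration added for this purpose, not code from the thread: it evaluates all nine `type` settings of quantile() and compares each with the SPSS figure of 0.8777. The names `spss_p95`, `by_type` and `matches` are made up for the example. On these data, type 6 appears to be the only setting that reproduces the SPSS value to four decimals; the default (type 7) gives a somewhat smaller number.

```r
# The 20 values posted in the original message.
normal20 <- c(0.27549020, 0.87994304, -0.23737370, 0.04565484, -1.10207183,
              -0.68035949, 0.01698773, -2.15812038, 0.26296513, 0.21873981,
              0.03266598, -0.01318572, 0.83492830, 0.54652613, 0.73993948,
              -0.31937556, -0.03060194, -0.96028421, 0.27745331, -1.01292410)

spss_p95 <- 0.8777  # value reported by SPSS in the original post

# Empirical 95th percentile under each of quantile()'s nine algorithms.
by_type <- sapply(1:9, function(k) {
  quantile(normal20, 0.95, type = k, names = FALSE)
})
names(by_type) <- paste0("type", 1:9)
print(round(by_type, 4))

# Which algorithms agree with SPSS to the four decimals it printed?
matches <- names(by_type)[abs(by_type - spss_p95) < 5e-5]
print(matches)
```

Whether the same setting matches on other data, or in other SPSS procedures, is not something this check can establish.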
On 12-11-08 7:17 AM, David A. wrote:

> [...]
> I think the main difference is that SPSS only calculates critical values within the range of values in the data, while R fits a normal and calculates the critical value using the fitted distribution. This is more obvious if the size of the data is much lower:
> [...]
> Can anyone comment on my statement? and thus, is R more exact? The differences are quite large and this is important for setting thresholds.

The part of your statement where you say "R fits a normal and calculates the critical value using the fitted distribution" is false. *You* did that (in your call to qnorm, rather than using the quantile function); it's not something R would normally do.

Is R "more exact"? It's possible, but I doubt it. I imagine SPSS could do what R did, and R could do what SPSS did.

Duncan Murdoch
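To make Duncan's point concrete: the qnorm() call in the original post never looks at the individual data values, only at their mean and standard deviation. A minimal sketch on the posted 20 values, with `m`, `s`, `z95`, `fitted_p95` and `by_hand` as illustrative names, suggests it is the same as computing mean + z * sd by hand:

```r
# The 20 values posted in the original message.
normal20 <- c(0.27549020, 0.87994304, -0.23737370, 0.04565484, -1.10207183,
              -0.68035949, 0.01698773, -2.15812038, 0.26296513, 0.21873981,
              0.03266598, -0.01318572, 0.83492830, 0.54652613, 0.73993948,
              -0.31937556, -0.03060194, -0.96028421, 0.27745331, -1.01292410)

m   <- mean(normal20)
s   <- sd(normal20)
z95 <- qnorm(0.95)  # 95th percentile of the standard normal, about 1.645

fitted_p95 <- qnorm(0.95, mean = m, sd = s)  # the call from the original post
by_hand    <- m + z95 * s                    # same quantity, spelled out

print(c(fitted = fitted_p95, by_hand = by_hand))
print(isTRUE(all.equal(fitted_p95, by_hand)))
```

Any data set with the same mean and sd would give the same answer here, which is why the result describes a fitted distribution rather than the sample itself.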
Hi, David,

I think you're confusing the q-th percentile of your data, i.e., the empirical q-th percentile, which is -- roughly -- the value x_q for which q * 100 % of the data are less than or equal to x_q, with the q-th percentile of a distribution (here the normal distribution) that has as population mean the arithmetic mean of the data and as population standard deviation the standard deviation of the data. Those are different things.

Your SPSS code seems to compute the empirical quantile, but your R code produces the other quantile. To get empirical quantiles of your data in R, see ?quantile

Hth -- Gerrit

On Thu, 8 Nov 2012, David A. wrote:

> [...]
> Can anyone comment on my statement? and thus, is R more exact? The differences are quite large and this is important for setting thresholds.
>
> Cheers,
>
> Dave
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
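Gerrit's distinction between the two kinds of percentile can be illustrated by counting. The following sketch, on the 20 posted values and with illustrative variable names, computes both quantities and then the share of observations at or below each. The choice of type = 6 is an assumption made here because it reproduces the SPSS figure of 0.8777 on these data; it is not something stated in the thread.

```r
# The 20 values posted in the original message.
normal20 <- c(0.27549020, 0.87994304, -0.23737370, 0.04565484, -1.10207183,
              -0.68035949, 0.01698773, -2.15812038, 0.26296513, 0.21873981,
              0.03266598, -0.01318572, 0.83492830, 0.54652613, 0.73993948,
              -0.31937556, -0.03060194, -0.96028421, 0.27745331, -1.01292410)

# Empirical 95th percentile of the sample (type 6 is an assumption, see above).
empirical_p95 <- quantile(normal20, 0.95, type = 6, names = FALSE)

# 95th percentile of a normal distribution with the sample's mean and sd.
fitted_p95 <- qnorm(0.95, mean = mean(normal20), sd = sd(normal20))

# Share of the observed values that are less than or equal to each threshold.
share_below_empirical <- mean(normal20 <= empirical_p95)
share_below_fitted    <- mean(normal20 <= fitted_p95)

print(c(empirical = empirical_p95, fitted = fitted_p95))
print(c(empirical = share_below_empirical, fitted = share_below_fitted))
```

With only 20 observations, 19 of them fall at or below the empirical value, while the fitted-normal value lies above every data point, which is consistent with the max(normal20) output in the original post.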