Hello R-dev,

Yesterday, while I was testing the newly implemented function pmean in
package kit, I noticed a mismatch in the output of the R expressions below.

set.seed(123)
n=1e3L
idx=5
x=rnorm(n)
y=rnorm(n)
z=rnorm(n)
a=(x[idx]+y[idx]+z[idx])/3
b=mean(c(x[idx],y[idx],z[idx]))
a==b
# [1] FALSE

For idx = 1, 2, 3, 4 the last line evaluates to TRUE. For 5, 6 and many
others it is FALSE; the difference is small, but it is still there.
Is that expected or is it a bug?

Thank you
Best Regards
Morgan Jacob
Expected, see FAQ 7.31.

You just can't trust == on FP operations. Notice also

> a2=(z[idx]+x[idx]+y[idx])/3
> a2==a
[1] FALSE
> a2==b
[1] TRUE

-pd

> On 20 May 2020, at 12:40 , Morgan Morgan <morgan.emailbox at gmail.com> wrote:
>
> Is that expected or is it a bug?

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
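
For anyone hitting the same mismatch, FAQ 7.31 suggests comparing with a tolerance rather than with ==. The following is only a minimal sketch built on the code from the original post. The names exact, close_enough and diff_ab are made up for illustration, and the result of the exact comparison may vary by platform, so no output is shown for it.

```r
# Reproduce the values from the original post.
set.seed(123)
n <- 1e3L
idx <- 5
x <- rnorm(n)
y <- rnorm(n)
z <- rnorm(n)

a <- (x[idx] + y[idx] + z[idx]) / 3
b <- mean(c(x[idx], y[idx], z[idx]))

# Exact comparison: sensitive to rounding in the last bits and to the
# order in which the three values are added.
exact <- a == b

# Tolerance-based comparison. all.equal() returns either TRUE or a
# character description of the difference, hence the isTRUE() wrapper.
close_enough <- isTRUE(all.equal(a, b))

# Absolute difference, for inspection (illustrative name).
diff_ab <- abs(a - b)
```

In a test suite for something like pmean, a check along the lines of stopifnot(isTRUE(all.equal(a, b))) is usually what is wanted, since it does not depend on summation order.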
> On Wednesday, May 20, 2020, 7:00:09 AM EDT, peter dalgaard <pdalgd at gmail.com> wrote:
>
> Expected, see FAQ 7.31.
>
> You just can't trust == on FP operations.

Additionally, since you're implementing a "mean" function that you are
testing against R's mean, you might want to consider that R uses a
two-pass calculation[1] to reduce floating point precision error.

Best,

Brodie.

[1] https://github.com/wch/r-source/blob/tags/R-4-0-0/src/main/summary.c#L482
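
To make the two-pass idea concrete, here is a rough R-level sketch of my reading of it. The function name two_pass_mean is hypothetical. The linked C code accumulates in extended precision where the platform provides it, so this sketch illustrates the technique and is not guaranteed to match mean() bit for bit.

```r
# Hypothetical helper illustrating a two-pass mean.
two_pass_mean <- function(v) {
  n <- length(v)
  # First pass: ordinary mean.
  m <- sum(v) / n
  # Second pass: average the residuals around the first estimate and
  # add that back as a correction for accumulated rounding error.
  m + sum(v - m) / n
}

set.seed(123)
v <- rnorm(1e3L)

# Compare against R's mean() with a tolerance rather than with ==.
agrees <- isTRUE(all.equal(two_pass_mean(v), mean(v)))
```

A single-pass sum(v) / length(v), or a hand-written (x + y + z) / 3, skips the correction step, which is one more reason its last bits can differ from those of mean().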