Ivan Krylov
2022-May-12 20:05 UTC
[R] result of mean(v1, v2, v3) of three real number not the same as sum(v1, v2, v3)/3
Eric Berger and Marc Schwartz and David K Stevens probably said it better. I was trying to illustrate the way mean() takes its arguments using the match.call function.

The sum() function can take individual numbers or vectors and sum all their elements, so sum(c(1, 2, 3)) is the same as sum(1, 2, 3), or even sum(c(1, 2), 3): they all do what you mean them to do.

The mean() function is different. It may accept many arguments, but only the first of them is the vector of numbers you're interested in: mean(c(1, 2, 3)) is the correct way to call it. Unfortunately, when you give it more arguments and they aren't what mean() expects them to be (the second one should be a number in [0, 0.5] and the third one should be TRUE or FALSE, see help(mean) if you're curious), R doesn't warn you or raise an error condition.

My use of match.call() was supposed to show that by calling mean(a, b, c), I pass the number "b" as the "trim" argument to mean() and the number "c" as the "na.rm" argument to mean(), which is not what was intended here.

--
Best regards,
Ivan
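To make the argument matching concrete, here is a small sketch in base R. It assumes the call dispatches to mean.default(), which is what happens for plain numeric input; the variable names (matched, wrong, right) are only for illustration.

```r
# sum() adds up every number it receives, however the numbers are grouped.
sum(c(1, 2, 3))    # 6
sum(1, 2, 3)       # 6
sum(c(1, 2), 3)    # 6

# mean() averages only its first argument.
mean(c(1, 2, 3))   # 2

# Show how the arguments of mean(1, 2, 3) are matched to the formal
# arguments of mean.default(x, trim = 0, na.rm = FALSE, ...).
matched <- match.call(mean.default, quote(mean(1, 2, 3)))
print(matched)     # mean(x = 1, trim = 2, na.rm = 3)

# Only x = 1 is averaged; 2 and 3 are taken as trim and na.rm.
wrong <- mean(1, 2, 3)
right <- mean(c(1, 2, 3))
print(c(wrong = wrong, right = right))
```

Wrapping the numbers in c() is the simplest way to make sure they all end up in the first argument.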
Henrik Bengtsson
2022-May-12 20:39 UTC
[R] result of mean(v1, v2, v3) of three real number not the same as sum(v1, v2, v3)/3
There's actually another reason why mean(x) and sum(x)/length(x) may differ, e.g.

    x <- c(rnorm(1e6, sd=.Machine$double.eps), rnorm(1e6, sd=1))
    mean(x) - sum(x)/length(x)
    #> [1] 1.011781e-18

The mean() function calculates the sample mean using a two-pass scan through the data. The first scan calculates the total sum and divides it by the number of (non-missing) values. In the second scan, this average is refined by adding the mean of the residuals from the first average. This way the numerical precision of mean(x) is higher than that of sum(x)/length(x) when the spread of 'x' is large. It also means that the processing time of mean(x) is roughly twice that of sum(x)/length(x).

/Henrik
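The two-pass idea described in this message can be sketched in plain R as follows. This is only an illustration under stated assumptions, not the code R itself runs: two_pass_mean() is a hypothetical helper name, it ignores missing values entirely, and the seed is arbitrary.

```r
# Hypothetical helper illustrating a two-pass mean; not R's internal code.
two_pass_mean <- function(x) {
  n <- length(x)
  first <- sum(x) / n            # pass 1: plain sum divided by the count
  first + sum(x - first) / n     # pass 2: add the mean residual as a correction
}

set.seed(1)                      # arbitrary seed, for reproducibility only
x <- c(rnorm(1e6, sd = .Machine$double.eps), rnorm(1e6, sd = 1))

naive   <- sum(x) / length(x)
refined <- two_pass_mean(x)

# Differences, if any, are expected to be tiny.
print(mean(x) - naive)
print(mean(x) - refined)
```

The second pass would contribute exactly zero in exact arithmetic; in floating point it picks up part of the rounding error left over from the first pass.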