On Tue, 18 Jul 2006, Jim Frederick wrote:
> Hi,
>
> When I used
> x <- c(10,10,10,10,5,5,5,5)
> sum(x)
> x <- x - .Last.value/8
> the result was zero, as I expected.
> However, when I used
> x <- rnorm(101,0,10)
> sum(x)
> x <- x - .Last.value/101
> sum(x)
> I did not get zero, but -2.664535e-15. OK, that's fairly close to
> zero, but I tried to improve the figure. Each time I repeated the last
> two commands, the sum got closer to zero. The series seemed to converge
> on -1.30e-15, so changed the 101 to 50, then to 20, and then to 10. I
> had sum(x) down to -5.551115e-17 before I quit.
>
> If R can store values of x which sum to -5.551115e-17, then why can't I
> get that value on the first try? Is this a bug or just rounding error?
>
It's just rounding error. The *relative* error is always a multiple of
2^-52, or about 2e-16 but as sum(x) gets smaller a given relative error is
a smaller absolute error.
Comparing values from doing this once or twice is not very reliable. To
see how much variation there is from sample to sample, do something like
a<-replicate(10000, {x<-rnorm(101,0,10); y<-sum(x);
sum(x-y/101)/.Machine$double.eps})
A histogram or qqplot shows an approximately Normal distribution, with
mean close to zero and standard deviation about 70. Zero error does
happen: I got 20/10000 exactly zero.
There aren't any simple answers to exactly how big the error is for
particular computations. The exact answers are complicated and the simple
answers are approximate. For example, in your case a reasonable upper
bound for the error might be 102 computations multiplied by 10 as a
typical value of the numbers, multiplied by .Machine$double.eps. This is
much larger than the typical error, of course. A crude estimate of the
typical error might be sqrt(102)*10*.Machine$double.eps, treating the
errors as a random walk, and this is not far off (though perhaps only
coincidentally so).
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle