thr3ads.net - R devel - [Rd] bug with var(rep(1e30, 3)) (PR#1228) [Dec 2001]

If this information is useful, please help other people find it:
Share via:

paradis@isem.univ-montp2.fr

2001-Dec-26 12:07 UTC

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

There seems to be a bug with var() when the argument is a vector with
exactly three values of 1e30 (or close to this value). This does not happen
with twice, four (or more) times this value, or another value.

> var(rep(1e30, 3))
[1] 2.971056e+28> var(rep(1.2e30, 3))
[1] 2.971056e+28> var(rep(0.9e30, 3))
[1] 2.971056e+28> var(rep(0.8e30, 3))
[1] 0> var(rep(1e29, 3))
[1] 0> var(rep(1e30, 2))
[1] 0> var(rep(1e30, 4))
[1] 0> var(rep(1e31, 3))[1] 0


The bug is repeatable, and I got the same results with R 1.3.1 (binary from
CRAN) and R 1.4.0 (binary from BD Ripley's site) both on Windows NT, and R
1.4.0 on Solaris (compiled from sources).



Emmanuel Paradis
Laboratoire de Paléontologie
Institut des Sciences de l'Évolution
Université Montpellier II
F-34095 Montpellier cédex 05
France
   phone: +33  4 67 14 39 64
     fax: +33  4 67 14 36 10
  mailto:paradis@isem.univ-montp2.fr

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Peter Dalgaard BSA

2001-Dec-26 14:31 UTC

head link

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

paradis@isem.univ-montp2.fr writes:
> There seems to be a bug with var() when the argument is a vector with
> exactly three values of 1e30 (or close to this value). This does not happen
> with twice, four (or more) times this value, or another value.
> 
> 
> > var(rep(1e30, 3))
> [1] 2.971056e+28
> > var(rep(1.2e30, 3))
> [1] 2.971056e+28
> > var(rep(0.9e30, 3))
> [1] 2.971056e+28
> > var(rep(0.8e30, 3))
> [1] 0
> > var(rep(1e29, 3))
> [1] 0
> > var(rep(1e30, 2))
> [1] 0
> > var(rep(1e30, 4))
> [1] 0
> > var(rep(1e31, 3))
> [1] 0
> 
> 
> The bug is repeatable, and I got the same results with R 1.3.1 (binary from
> CRAN) and R 1.4.0 (binary from BD Ripley's site) both on Windows NT,
and R
> 1.4.0 on Solaris (compiled from sources).
Not a bug, roundoff:
> (1e30+1e30+1e30)/3-1e30
[1] 1.407375e+14> 3*((1e30+1e30+1e30)/3-1e30)^2/2[1] 2.971056e+28

Relative errors of the order 1e-16 are completely expectable in
floating point arithmetic.

For illustration, look at it in octal, after removing a bunch of
trailing zeroes:
> x<-1e30/(8^16)
> structure(x, class="octmode")
[1] "144762623464043165"> structure(x+x+x, class="octmode")[1] "456730272634151540"

If you look carefully, you will see that a bit is lost in the process
since 5+5+5 is 17 octal.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

ripley@stats.ox.ac.uk

2001-Dec-26 15:52 UTC

head link

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

But there are better algorithms that are not subject to this sort
of rounding error.

I reckon that computing a variance this way is not the quality of
algorithm one should expect from R.

On 26 Dec 2001, Peter Dalgaard BSA wrote:
> paradis@isem.univ-montp2.fr writes:
>
> > There seems to be a bug with var() when the argument is a vector with
> > exactly three values of 1e30 (or close to this value). This does not
happen
> > with twice, four (or more) times this value, or another value.
> >
> >
> > > var(rep(1e30, 3))
> > [1] 2.971056e+28
> > > var(rep(1.2e30, 3))
> > [1] 2.971056e+28
> > > var(rep(0.9e30, 3))
> > [1] 2.971056e+28
> > > var(rep(0.8e30, 3))
> > [1] 0
> > > var(rep(1e29, 3))
> > [1] 0
> > > var(rep(1e30, 2))
> > [1] 0
> > > var(rep(1e30, 4))
> > [1] 0
> > > var(rep(1e31, 3))
> > [1] 0
> >
> >
> > The bug is repeatable, and I got the same results with R 1.3.1 (binary
from
> > CRAN) and R 1.4.0 (binary from BD Ripley's site) both on Windows
NT, and R
> > 1.4.0 on Solaris (compiled from sources).
>
> Not a bug, roundoff:
>
> > (1e30+1e30+1e30)/3-1e30
> [1] 1.407375e+14
> > 3*((1e30+1e30+1e30)/3-1e30)^2/2
> [1] 2.971056e+28
>
> Relative errors of the order 1e-16 are completely expectable in
> floating point arithmetic.
>
> For illustration, look at it in octal, after removing a bunch of
> trailing zeroes:
>
> > x<-1e30/(8^16)
> > structure(x, class="octmode")
> [1] "144762623464043165"
> > structure(x+x+x, class="octmode")
> [1] "456730272634151540"
>
> If you look carefully, you will see that a bit is lost in the process
> since 5+5+5 is 17 octal.
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
>
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read
http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
>
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

p.dalgaard@biostat.ku.dk

2001-Dec-26 17:01 UTC

head link

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:
> But there are better algorithms that are not subject to this sort
> of rounding error.
> 
> I reckon that computing a variance this way is not the quality of
> algorithm one should expect from R.
Hmm. Which algorithms are there? Surely almost any kind of provisional
means subtraction could avoid the the extreme case where the mean of a
bunch of identical numbers is different from all of them, but will it
also improve precision in the general case?

Essentially our current algorithm in cov.c is

xm<- sum(x)/n
v <- 1/(n-1)*sum((x-xm)^2)
> > > x<-1e30/(8^16)
> > > structure(x, class="octmode")
> > [1] "144762623464043165"
> > > structure(x+x+x, class="octmode")
> > [1] "456730272634151540"
> >
> > If you look carefully, you will see that a bit is lost in the process
> > since 5+5+5 is 17 octal.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Possibly Parallel Threads

Search for more possibly parallel threads

R devel - Dec 2001 - bug with var(rep(1e30, 3)) (PR#1228)

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

[Rd] bug with var(rep(1e30, 3)) (PR#1228)

Possibly Parallel Threads