Prompted by Peter Dalgaard's recent elegant "intbin" function, I have been playing with the extension to converting reals to binary representation. The fractional part can be done like this:

decbase <- function(x, n=52, base=2) {
  if(n) {
    x <- x*base                # shift the next digit to the left of the point
    # emit that digit, then recurse on what remains after the point
    paste(trunc(x), decbase(x%%1, n-1, base), sep="")
  }
}

n=52 is the default because that's the number of bits in the significand of a 64-bit float.

Now, `decbase(0.1)` is a bit curious in that the 0.1 is going to be converted to a binary float by the interpreter and then re-converted by `decbase`, so really I should insist on character format for the number I want to convert. But anyway I do get the right answer up to the point of truncation:

> decbase(.1)
[1] "0001100110011001100110011001100110011001100110011001"
> decbase(.2)
[1] "0011001100110011001100110011001100110011001100110011"
> decbase(.3)
[1] "0100110011001100110011001100110011001100110011001100"

That is to say, decbase(.1) + decbase(.2) really does equal decbase(.3). But not if R does its own arithmetic first:

> decbase(.1+.2)
[1] "0100110011001100110011001100110011001100110011001101"

What has gone on here? Why does R apparently get its internal representation of one of .1 or .2 "wrong"? Does the end of the internal binary for .1 get rounded up instead of truncated? Why wouldn't that show in decbase(.1)?

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 644449
Fax: +44 (0) 1379 644445
email: Simon.Fear at synequanon.com
web: http://www.synequanon.com
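The claim that the three printed strings add up can be checked mechanically. The following is only a sketch: `binadd` is a hypothetical helper written for this illustration (it is not part of the original post), and it assumes both strings have the same length and that the sum does not overflow the leftmost place.

```r
# decbase() as posted above, reformatted only.
decbase <- function(x, n = 52, base = 2) {
  if (n) {
    x <- x * base
    paste(trunc(x), decbase(x %% 1, n - 1, base), sep = "")
  }
}

# Hypothetical helper: add two equal-length strings of binary digits,
# working right to left and discarding any carry out of the first place.
binadd <- function(a, b) {
  da <- as.integer(strsplit(a, "")[[1]])
  db <- as.integer(strsplit(b, "")[[1]])
  out <- integer(length(da))
  carry <- 0L
  for (i in rev(seq_along(da))) {
    s <- da[i] + db[i] + carry
    out[i] <- s %% 2L      # digit that stays in this place
    carry <- s %/% 2L      # carry into the next place to the left
  }
  paste(out, collapse = "")
}

# Adding the truncated strings reproduces decbase(0.3) ...
binadd(decbase(0.1), decbase(0.2)) == decbase(0.3)        # TRUE
# ... but not the string for the sum that R computes itself.
binadd(decbase(0.1), decbase(0.2)) == decbase(0.1 + 0.2)  # FALSE
```

So the discrepancy is a single unit in the 52nd binary place, and it appears only when the addition is done on the stored doubles rather than on the truncated strings.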
On Fri, 6 Feb 2004 12:55:05 -0000, "Simon Fear" <Simon.Fear at synequanon.com> wrote:

>Prompted by Peter Dalgaard's recent elegant "intbin" function,
>I have been playing with the extension to converting reals to binary
>representation. The decimal part can be done like this:
>
>decbase <- function(x, n=52, base=2) {
>  if(n) {
>    x <- x*base
>    paste(trunc(x), decbase(x%%1, n-1, base), sep="")
>  }
>}
>
>n=52 default because that's the number of bits in the significand of
>a 64-bit float.

Remember that IEEE double formats are complicated; they're not fixed-point formats. Both 0.1 and 0.2 are less than 1, so the n=52 count is wrong. I think 0.1 would be stored as (1 + 0.6)*2^(-4) and 0.2 would be stored as (1 + 0.6)*2^(-3), whereas 0.3 would be stored as (1 + 0.2)*2^(-2). The 52 stored bits come after the leading 1, so you should expect 56 binary places of accuracy on 0.1, 55 on 0.2, and 54 on 0.3. It's not surprising weird things happen!

Duncan Murdoch
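The place counts above can be checked with the posted function itself by asking for more than 52 digits. This is a sketch rather than a definitive diagnostic; it assumes the usual IEEE 754 doubles with round-to-nearest, and it relies on doubling and `%% 1` being exact for values in this range.

```r
# decbase() as posted earlier in the thread, reformatted only.
decbase <- function(x, n = 52, base = 2) {
  if (n) {
    x <- x * base
    paste(trunc(x), decbase(x %% 1, n - 1, base), sep = "")
  }
}

# Print exactly as many places as each stored value has.
decbase(0.1, 56)
decbase(0.2, 55)
decbase(0.3, 54)
decbase(0.1 + 0.2, 54)

# Places 53 to 56 of the stored 0.1: a truncated expansion would read
# "1001" here, so the stored value has been rounded up.
substring(decbase(0.1, 56), 53, 56)        # "1010"

# Past the last stored bit there are only zeros.
substring(decbase(0.1, 64), 57, 64)        # "00000000"

# The stored 0.3 and the computed 0.1 + 0.2 differ in their last places.
substring(decbase(0.3, 54), 51, 54)        # "0011"
substring(decbase(0.1 + 0.2, 54), 51, 54)  # "0100"
```

The rounding of 0.1 happens in place 56, beyond the 52 digits printed by default, which is why it does not show in decbase(.1). The sum of the two rounded-up inputs then lands one unit in place 54 above the stored 0.3, and that difference is visible within the first 52 digits.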