On 01/01/2015 1:21 PM, Mike Miller wrote:
> On Thu, 1 Jan 2015, Duncan Murdoch wrote:
>
>> On 31/12/2014 8:44 PM, David Winsemius wrote:
>>>
>>> On Dec 31, 2014, at 3:24 PM, Mike Miller wrote:
>>>
>>>> This is probably a FAQ, and I don't really have a question about it,
>>>> but I just ran across this in something I was working on:
>>>>
>>>>> as.integer(1000*1.003)
>>>> [1] 1002
>>>>
>>>> I didn't expect it, but maybe I should have.  I guess it's about the
>>>> machine precision added to the fact that as.integer always rounds down:
>>>>
>>>>> as.integer(1000*1.003 + 255 * .Machine$double.eps)
>>>> [1] 1002
>>>>
>>>>> as.integer(1000*1.003 + 256 * .Machine$double.eps)
>>>> [1] 1003
>>>>
>>>> This does it right...
>>>>
>>>>> as.integer( round( 1000*1.003 ) )
>>>> [1] 1003
>>>>
>>>> ...but this seems to always give the same answer and it is a little
>>>> faster in my application:
>>>>
>>>>> as.integer( 1000*1.003 + .1 )
>>>> [1] 1003
>>>>
>>>> FYI - I'm reading in a long vector of numbers from a text file with no
>>>> more than three digits to the right of the decimal.  I'm converting
>>>> them to integers and saving them in binary format.
>>>
>>> So just add 0.0001 or even .0000001 to all of them and coerce to integer.
>>
>> I don't think the original problem was stated clearly, so I'm not sure
>> whether this is a solution, but it looks wrong to me.  If you want to
>> round to the nearest integer, why not use round() (without the
>> as.integer afterwards)?  Or if you really do want an integer, why add
>> 0.1 or 0.0001 rather than 0.5 before calling as.integer()?  Adding 0.5
>> is the classical way to implement round().
>>
>> To state the problem clearly, I'd like to know what result is expected
>> for any real number x.  Since R's numeric type only approximates the
>> real numbers, we might not be able to get a perfect match, but at least
>> we could quantify how close we get.  Or is the input really character
>> data?  The original post mentioned reading numbers from a text file.
>
> Maybe you'd like to know what I'm really doing.  I have 1600 text files,
> each with up to 16,000 lines and 3100 numbers per line, delimited by a
> single space.  The numbers are between 0 and 2, inclusive, and they have
> up to three digits to the right of the decimal.  Every possible value in
> that range will occur in the data.  Some example numbers: 0 1 2 0.325
> 1.12 1.9.  I want to multiply by 1000 and store them as 16-bit integers
> (uint16).
>
> I've been reading in the data like so:
>
>> data <- scan( file=FILE, what=double(), nmax=3100*16000)
>
> At first I tried making the integers like so:
>
>> ptm <- proc.time() ; ints <- as.integer( 1000 * data ) ; proc.time()-ptm
>    user  system elapsed
>   0.187   0.387   0.574
>
> I decided I should compare with the result I got using round():
>
>> ptm <- proc.time() ; ints2 <- as.integer( round( 1000 * data ) ) ; proc.time()-ptm
>    user  system elapsed
>   1.595   0.757   2.352
>
> It is a curious fact that only a few of the values from 0 to 2000 disagree
> between the two methods:
>
>> table( ints2[ ints2 != ints ] )
>
>  1001  1003  1005  1007  1009  1011  1013  1015  1017  1019  1021  1023
> 35651 27020 15993 11505  8967  7549  6885  6064  5512  4828  4533  4112
>
> I understand that it's all about the problem of representing decimal
> numbers in binary, but I still find some of the results a little
> surprising, like that list of numbers from the table() output.  For
> another example:
>
>> 1000+3 - 1000*(1+3/1000)
> [1] 1.136868e-13
>
>> 3 - 1000*(0+3/1000)
> [1] 0
>
>> 2000+3 - 1000*(2+3/1000)
> [1] 0
>
> See what I mean?  So there is something special about the numbers around
> 1000.

I think it's really that there is something special about the numbers
near 1, and you're multiplying that by 1000.

Numbers from 1 to just below 2 are stored as their fractional part, with
52-bit precision.  Some intermediate calculations will store them with
64-bit precision.  52 bits gives about 15 or 16 decimal places.

If your number x is close to 3/1000, it is stored as the fractional part
of 2^9 * x.  This gives it an extra 2 or 3 decimal digits of precision,
so that's why those values are accurate.

If your number x is close to 2.003, it is stored as the fractional part
of x/2, i.e. with errors like 1.0015 would have.  So I would have guessed
that 2.006 would have the same problems as 1.003, but I thought you
didn't see that.  So I tried it myself, and I do see that:

> 1000+3 - 1000*(1+3/1000)
[1] 1.136868e-13
> 2000+6 - 1000*(2+6/1000)
[1] 2.273737e-13

Reading more closely, I see that you didn't test this particular case,
so there's no contradiction here.

The one thing I couldn't think of an explanation for is why other
numbers between 1 and 2 don't have the same sorts of problems.  So I
tried the following:

# Set data to 1.000 thru 1.999
data <- 1 + 0:999/1000

# Find the errors
errors <- 1000 + 0:999 - 1000*data

# Plot them
plot(data, errors)

The plot doesn't show a uniform distribution, but it is much more uniform
than yours, so I think your data doesn't really cover all possible values
from 0.000 to 1.999.  (I get a similar plot if I look at cases where
ints != ints2 with my data.)

Duncan Murdoch

> Back to the question at hand: I can avoid use of round() and speed things
> up a little bit by just adding a small number after multiplying by 1000:
>
>> ptm <- proc.time() ; R3 <- as.integer( 1000 * data + .1 ) ; proc.time()-ptm
>    user  system elapsed
>   0.224   0.594   0.818
>
> You point out that adding .5 makes sense.  That is probably a better idea
> and I should take that approach under most conditions, but in this case we
> can add anything between 2e-13 and about 0.99999999999 and always get the
> same answer.  We also have to remember that if a number might be negative
> (not a problem for me in this application), we need to subtract 0.5
> instead of adding it.
>
> Anyway, right now this is what I'm actually doing:
>
>> con <- file( paste0(FILE, ".uint16"), "wb" )
>> ptm <- proc.time() ; writeBin( as.integer( 1000 * scan( file=FILE, what=double(), nmax=3100*16000 ) + .1 ), con, size=2 ) ; proc.time()-ptm
> Read 48013406 items
>    user  system elapsed
>  10.263   0.733  10.991
>> close(con)
>
> By the way, writeBin() is something that I learned about here, from you,
> Duncan.  Thanks for that, too.
>
> Mike
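[A short sketch of the effect under discussion, assuming IEEE-754 doubles; the exact trailing digits printed can vary by platform:]

x <- 1000 * 1.003            # 1.003 is not exactly representable, so x lands just below 1003
print(x, digits = 17)        # something like 1002.9999999999999
x - 1003                     # about -1.1e-13
as.integer(x)                # as.integer() truncates toward zero, hence 1002
as.integer(round(x))         # round() first goes to the nearest integer, hence 1003
as.integer(x + 0.5)          # the classical add-0.5 trick also gives 1003 for non-negative values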
On Thu, 1 Jan 2015, Duncan Murdoch wrote:

> [...]
>
> The one thing I couldn't think of an explanation for is why other
> numbers between 1 and 2 don't have the same sorts of problems.  So I
> tried the following:
>
> # Set data to 1.000 thru 1.999
> data <- 1 + 0:999/1000
>
> # Find the errors
> errors <- 1000 + 0:999 - 1000*data
>
> # Plot them
> plot(data, errors)
>
> The plot doesn't show a uniform distribution, but it is much more uniform
> than yours, so I think your data doesn't really cover all possible values
> from 0.000 to 1.999.  (I get a similar plot if I look at cases where
> ints != ints2 with my data.)

No, my data do cover all of the values in the range.  I checked that by
listing all of them in a file:

tr ' ' '\n' < FILE | uniq | sort -n | uniq >| uniq.txt

They are definitely all there -- all 2001 of them.  The thing is, your way
of making the numbers is a little different from reading them from a file.
You construct the number through an arithmetic operation and you get the
error:

> as.integer( 1000 * (1+118/1000) )
[1] 1117

(Your first error-producing value was 1.118.)  But when the number is read
from a file, it starts as character and is translated to numeric.  So I
start with the arithmetic, convert to character, back to numeric, and then
I do not see the error:

> as.integer( 1000 * as.numeric( as.character( 1+118/1000 ) ) )
[1] 1118

That reflects what was happening in my work.  To see what I'm seeing, you
have to do this:

data <- 1 + 0:999/1000
errors <- 1000 + 0:999 - 1000 * as.numeric( as.character( data ) )
plot(data, errors)

Or you could cover the full range from 0 to 2:

data <- 0:2000/1000
errors <- 0:2000 - 1000 * as.numeric( as.character( data ) )
plot(data, errors)

Mike
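[A sketch of the distinction Mike draws above, using the thread's example value 1.118.  Here as.numeric("1.118") stands in for what scan() produces when it reads the text "1.118" from a file; the results shown match the ones reported in the thread for an IEEE-754 double-precision platform:]

x_arith  <- 1 + 118/1000                  # two roundings: first 118/1000, then the sum
x_parsed <- as.numeric("1.118")           # one rounding: parse straight to the nearest double
x_arith == x_parsed                       # FALSE here: the two constructions differ by one ulp
print(c(x_arith, x_parsed), digits = 17)
as.integer(1000 * x_arith)                # 1117 -- the product falls just below 1118
as.integer(1000 * x_parsed)               # 1118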
On 01/01/2015 2:43 PM, Mike Miller wrote:

> [...]
>
> That reflects what was happening in my work.  To see what I'm seeing, you
> have to do this:
>
> data <- 1 + 0:999/1000
> errors <- 1000 + 0:999 - 1000 * as.numeric( as.character( data ) )
> plot(data, errors)

That's really interesting!  I guess the differences basically cancel out.

Duncan Murdoch

> Or you could cover the full range from 0 to 2:
>
> data <- 0:2000/1000
> errors <- 0:2000 - 1000 * as.numeric( as.character( data ) )
> plot(data, errors)
>
> Mike
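[A sketch of what the character round trip is doing, assuming as.character()'s default 15-significant-digit conversion of doubles:]

x <- 1 + 118/1000                   # carries a one-ulp arithmetic error
as.character(x)                     # "1.118": the error disappears at 15 significant digits
y <- as.numeric(as.character(x))    # re-parsing gives the double nearest to 1.118
as.integer(1000 * y)                # 1118, as in Mike's result above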
On Thu, 1 Jan 2015, Duncan Murdoch wrote:

> On 01/01/2015 1:21 PM, Mike Miller wrote:
>
>> I understand that it's all about the problem of representing decimal
>> numbers in binary, but I still find some of the results a little
>> surprising, like that list of numbers from the table() output.  For
>> another example:
>>
>>> 1000+3 - 1000*(1+3/1000)
>> [1] 1.136868e-13
>>
>>> 3 - 1000*(0+3/1000)
>> [1] 0
>>
>>> 2000+3 - 1000*(2+3/1000)
>> [1] 0
>>
>> See what I mean?  So there is something special about the numbers
>> around 1000.
>
> I think it's really that there is something special about the numbers
> near 1, and you're multiplying that by 1000.
>
> Numbers from 1 to just below 2 are stored as their fractional part, with
> 52-bit precision.  Some intermediate calculations will store them with
> 64-bit precision.  52 bits gives about 15 or 16 decimal places.

This is how big those errors are:

> 512*.Machine$double.eps
[1] 1.136868e-13

Under other conditions you were also seeing errors of twice that, or
1024*.Machine$double.eps.  It might not be a coincidence that the largest
number giving me an error was 1023.

> 2^-43
[1] 1.136868e-13

> .Machine$double.eps
[1] 2.220446e-16

> 2^-52
[1] 2.220446e-16

I guess the 52 comes from the IEEE floating point spec...

http://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64

...but why are we seeing errors so much bigger than the machine precision?
Why does it change at 2?

It doesn't really matter to my work, but it is a curious thing, so I would
be interested to learn about it.

Mike
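[A quick check of the arithmetic in the message above; only IEEE-754 doubles are assumed:]

.Machine$double.eps == 2^-52      # TRUE: eps is the spacing of doubles just above 1
512 * 2^-52 == 2^-43              # TRUE: 512 = 2^9, giving the 1.136868e-13 seen above
1024 * 2^-52                      # 2.273737e-13, the "twice that" error size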
On 01/01/2015 10:05 PM, Mike Miller wrote:

> [...]
>
> ...but why are we seeing errors so much bigger than the machine precision?

You are multiplying by 1000.  That magnifies the error.

> Why does it change at 2?

Because (most) floating point numbers are stored as (1 + x) * 2^y, where x
is a number between 0 and 1, and y is an integer between -1022 and 1023.
The value of y changes at 2, and this means errors in x become twice as
big.  (The exceptions are 0, Inf, NaN, etc., as well as "denormals", where
the implicit leading 1 is dropped and the value is just x * 2^(-1022).)

Duncan Murdoch

> It doesn't really matter to my work, but it is a curious thing, so I would
> be interested to learn about it.
>
> Mike
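[A small sketch of the doubling Duncan describes: the spacing between adjacent doubles is .Machine$double.eps just above 1 and twice that just above 2 (IEEE-754 round-to-nearest assumed):]

eps <- .Machine$double.eps
(1 + eps) - 1          # 2.220446e-16: 1 + eps is the next double after 1
(2 + eps) - 2          # 0: just above 2 the spacing is 2*eps, so 2 + eps rounds back to 2
(2 + 2*eps) - 2        # 4.440892e-16: the first representable step above 2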
On 02 Jan 2015, at 04:05 , Mike Miller <mbmiller+l at gmail.com> wrote:
>
> ...but why are we seeing errors so much bigger than the machine precision?
> Why does it change at 2?

Because relative errors in the one-thousandths part are roughly a thousand
times bigger than errors in the number itself?

Put differently: the 1.136868e-13 error is an absolute error on 1003, not
on 3.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
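[Peter's point in numbers, as a quick sketch; the figures are approximate:]

abs_err <- 1000 + 3 - 1000*(1 + 3/1000)   # about 1.14e-13, as reported earlier in the thread
abs_err / 1003                            # about 1.1e-16: the relative error on 1003...
.Machine$double.eps / 2                   # ...is on the order of half the machine epsilon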