thr3ads.net - R devel - [Rd] Bug in sample() [Mar 2017]

Kellie Ottoboni

2017-Mar-07 19:06 UTC

[Rd] Bug in sample()

Dear all,

Philip Stark and I think we have found a problem with how R generates
random samples, resulting from how it generates random integers between 1
and n. (If we are reading the code correctly, the method is to multiply a
pseudo-random binary fraction by n, take the floor, and add 1; this suffers
from quantization effects that can get quite large when n is just below
2^31).
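
The quantization effect described above can be seen in a toy model. The sketch below is an illustration in Python, not R's actual source: it assumes the pseudo-random fraction has only w bits (the names multiply_floor_counts, n, and w are hypothetical) and enumerates every possible fraction k / 2^w to count how often each value in 1..n comes out of the multiply-and-floor method.

```python
from collections import Counter


def multiply_floor_counts(n, w):
    """Count how often each value 1..n is produced when every w-bit
    fraction k / 2**w is mapped to floor(n * k / 2**w) + 1.

    w is a toy word size chosen so the enumeration stays small; it is an
    assumption for illustration, not the resolution of any real generator.
    """
    counts = Counter((n * k) // 2 ** w + 1 for k in range(2 ** w))
    return [counts[v] for v in range(1, n + 1)]


# 16 equally likely fractions spread over 6 outcomes cannot be even.
print(multiply_floor_counts(6, 4))   # [3, 3, 2, 3, 3, 2]

# n just below 2**w: one outcome is twice as likely as each of the others.
print(multiply_floor_counts(15, 4))  # [2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

In this toy setting the ratio between the most and least likely outcomes reaches 2 when n is just below 2^w, which mirrors the concern about n just below 2^31.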

A better method, used in Python, is to generate ceil(log_2(n))
pseudo-random bits, add 1, and discard values bigger than n.
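
A minimal Python sketch of the bit-based rejection method just described follows. It is written from the description in this message and is not a copy of Python's own implementation; the function name sample_int_rejection and the rng parameter are hypothetical, and it assumes the underlying bit source (here random.getrandbits) is itself uniform.

```python
import random


def sample_int_rejection(n, rng=random):
    """Draw an integer uniformly from 1..n by rejection.

    Generates ceil(log2(n)) random bits, adds 1, and retries whenever the
    result exceeds n, so every accepted value is equally likely.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return 1  # zero bits needed; avoids asking for 0 random bits
    k = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 2
    while True:
        value = rng.getrandbits(k) + 1  # uniform on 1..2**k
        if value <= n:
            return value


rng = random.Random(2017)  # seeded only to make the demo repeatable
print([sample_int_rejection(6, rng) for _ in range(10)])
```

Because 2^k is less than 2n, each attempt is accepted with probability greater than 1/2, so the expected number of attempts per draw is below 2.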

Attached is a short document explaining the issue in more detail.

Best,
Kellie

-- 
Kellie Ottoboni
Ph.D. Statistics '19, University of California, Berkeley
Fellow at Berkeley Institute for Data Science

Mobile: (650) 520-5056
Website: www.stat.berkeley.edu/~kellieotto
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample-bug.pdf
Type: application/pdf
Size: 230126 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20170307/33af608a/attachment.pdf>
