similar to: Bias in R's random integers?

2018 Sep 19
2
Bias in R's random integers?
On Wed, 19 Sept 2018 at 14:43, Duncan Murdoch (<murdoch.duncan at gmail.com>) wrote: > > On 18/09/2018 5:46 PM, Carl Boettiger wrote: > > Dear list, > > > > It looks to me that R samples random integers using an intuitive but biased > > algorithm by going from a random number on [0,1) from the PRNG to a random > > integer, e.g. > >
2018 Sep 19
2
Bias in R's random integers?
The 53 bits only encode at most 2^{32} possible values, because the source of the float is the output of a 32-bit PRNG (the obsolete version of MT). 53 bits isn't the relevant number here. The selection ratios can get close to 2. Computer scientists don't do it the way R does, for a reason. Regards, Philip On Wed, Sep 19, 2018 at 9:05 AM Duncan Murdoch <murdoch.duncan at
2018 Sep 19
2
Bias in R's random integers?
No, the 2nd call only happens when m > 2**31. Here's the code: (RNG.c, lines 793ff)

    double R_unif_index(double dn)
    {
        double cut = INT_MAX;

        switch(RNG_kind) {
        case KNUTH_TAOCP:
        case USER_UNIF:
        case KNUTH_TAOCP2:
            cut = 33554431.0; /* 2^25 - 1 */
            break;
        default:
            break;
        }

        double u = dn > cut ? ru() : unif_rand();
        return floor(dn * u);
    }

On Wed, Sep
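To make the effect of floor(dn * u) concrete, here is a minimal sketch that counts how many generator outputs land on each integer. It assumes an idealized generator whose outputs are the equally likely values k / 2^bits; the function name multiply_floor_counts and the toy bit width are hypothetical and are not part of R.

```c
#include <stdint.h>

/* Fill counts[0..m-1] with the number of generator outputs k in
 * [0, 2^bits) that map to each integer under floor(m * k / 2^bits).
 * Integer arithmetic stands in for floor(dn * u) with u = k / 2^bits.
 * Assumes 1 <= bits <= 16 and 1 <= m <= 2^bits, so m * k fits in 32 bits. */
void multiply_floor_counts(unsigned bits, uint32_t m, uint32_t *counts)
{
    uint32_t n = (uint32_t)1 << bits;   /* number of distinct outputs */

    for (uint32_t j = 0; j < m; j++)
        counts[j] = 0;

    for (uint32_t k = 0; k < n; k++)
        counts[(m * k) >> bits]++;      /* floor(m * k / 2^bits) */
}
```

With bits = 4 and m = 6, the counts come out as 3, 3, 2, 3, 3, 2, so some integers are selected 1.5 times as often as others in this toy case.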
2018 Sep 19
4
Bias in R's random integers?
Hi Duncan-- Nice simulation! The absolute difference in probabilities is small, but the maximum relative difference grows from something negligible to almost 2 as m approaches 2**31. Because the L_1 distance between the uniform distribution on {1, ..., m} and what you actually get is large, there have to be test functions whose expectations are quite different under the two distributions.
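As a rough illustration of the L_1 distance mentioned here, the sketch below computes it exactly for a toy generator. It assumes the generator's outputs are the equally likely values k / 2^bits and that integers are produced by floor(m * u); the function name l1_distance_from_uniform is hypothetical, and outcomes are indexed from 0 rather than 1.

```c
#include <stdint.h>

/* L_1 distance between the distribution induced by floor(m * k / 2^bits),
 * k uniform on [0, 2^bits), and the uniform distribution on m outcomes.
 * Assumes 1 <= bits <= 16 and 1 <= m <= 2^bits. */
double l1_distance_from_uniform(unsigned bits, uint32_t m)
{
    uint32_t n = (uint32_t)1 << bits;
    uint32_t k = 0;
    double total = 0.0;

    for (uint32_t j = 0; j < m; j++) {
        /* The map k -> floor(m * k / 2^bits) is non-decreasing, so the
         * preimages of j form one consecutive run of k values. */
        uint32_t count = 0;
        while (k < n && ((m * k) >> bits) == j) {
            count++;
            k++;
        }
        double diff = (double)count / (double)n - 1.0 / (double)m;
        total += diff < 0.0 ? -diff : diff;
    }
    return total;
}
```

For bits = 5 and m = 13 this gives 84/416, roughly 0.20, while any m that divides 2^bits gives exactly 0. The toy numbers only show how the quantity is computed; they say nothing by themselves about the size of the distance at R's scale.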
2018 Sep 19
2
Bias in R's random integers?
It doesn't seem too hard to come up with plausible ways in which this could give bad results. Suppose I sample rows from a large dataset, maybe for bootstrapping. Suppose the rows are non-randomly ordered, e.g. odd rows are males, even rows are females. Oops! Very non-representative sample, bootstrap p values are garbage. David On Wed, 19 Sep 2018 at 21:20, Duncan Murdoch <murdoch.duncan
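The odd/even scenario can be checked on a toy scale. This sketch counts how many outputs of an idealized generator land on an even index under floor(m * u). The generator model (equally likely values k / 2^bits), the 0-based indexing, and the name even_output_count are assumptions made for illustration.

```c
#include <stdint.h>

/* Number of generator outputs k in [0, 2^bits) for which
 * floor(m * k / 2^bits) is even.
 * Assumes 1 <= bits <= 16 and 1 <= m <= 2^bits. */
uint32_t even_output_count(unsigned bits, uint32_t m)
{
    uint32_t n = (uint32_t)1 << bits;
    uint32_t even = 0;

    for (uint32_t k = 0; k < n; k++) {
        uint32_t index = (m * k) >> bits;   /* sampled 0-based index */
        if (index % 2 == 0)
            even++;
    }
    return even;
}
```

For bits = 5 and m = 13, 20 of the 32 outputs select an even index, whereas a uniform draw over 13 indices would select an even one with probability 7/13 (about 17.2 out of 32). The imbalance appears because the over-represented outcomes alternate with the under-represented ones when 2^bits / m is close to 2.5.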
2018 Sep 19
2
Bias in R's random integers?
A quick point of order here: arguing with Duncan in this forum is helpful to expose ideas, but probably neither side will convince the other; eventually, if you want this adopted in core R, you'll need to convince an R-core member to pursue this fix. In the meantime, a good, well-tested implementation in a user-contributed package (presumably written in C for speed) would be enormously
2018 Sep 19
0
Bias in R's random integers?
On 18/09/2018 5:46 PM, Carl Boettiger wrote: > Dear list, > > It looks to me that R samples random integers using an intuitive but biased > algorithm by going from a random number on [0,1) from the PRNG to a random > integer, e.g. > https://github.com/wch/r-source/blob/tags/R-3-5-1/src/main/RNG.c#L808 > > Many other languages use various rejection sampling approaches
2018 Sep 19
0
Bias in R's random integers?
On 19/09/2018 9:09 AM, Iñaki Ucar wrote: > On Wed, 19 Sept 2018 at 14:43, Duncan Murdoch > (<murdoch.duncan at gmail.com>) wrote: >> >> On 18/09/2018 5:46 PM, Carl Boettiger wrote: >>> Dear list, >>> >>> It looks to me that R samples random integers using an intuitive but biased >>> algorithm by going from a random number on [0,1)
2018 Sep 19
0
Bias in R's random integers?
On 19/09/2018 12:09 PM, Philip B. Stark wrote: > The 53 bits only encode at most 2^{32} possible values, because the > source of the float is the output of a 32-bit PRNG (the obsolete version > of MT). 53 bits isn't the relevant number here. No, two calls to unif_rand() are used. There are two 32 bit values, but some of the bits are thrown away. Duncan Murdoch > > The
2018 Sep 19
0
Bias in R's random integers?
On 19/09/2018 12:23 PM, Philip B. Stark wrote: > No, the 2nd call only happens when m > 2**31. Here's the code: Yes, you're right. Sorry! So the ratio really does come close to 2. However, the difference in probabilities between outcomes is still at most 2^-32 when m is less than that cutoff. That's not feasible to detect; the only detectable difference would happen if
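The "at most 2^-32" figure can be worked through with a little arithmetic. Under the idealized assumption that the generator yields n equally likely values and integers are formed as floor(m * k / n) in exact arithmetic, each outcome receives either floor(n / m) or ceil(n / m) generator values. The helper names below are hypothetical.

```c
#include <stdint.h>

/* Smallest number of generator values mapped to any one outcome.
 * Assumes 1 <= m <= n. */
uint64_t min_preimages(uint64_t n, uint64_t m)
{
    return n / m;
}

/* Largest number of generator values mapped to any one outcome. */
uint64_t max_preimages(uint64_t n, uint64_t m)
{
    return n / m + (n % m != 0 ? 1 : 0);
}

/* Ratio between the most and least likely outcomes. */
double selection_ratio(uint64_t n, uint64_t m)
{
    return (double)max_preimages(n, m) / (double)min_preimages(n, m);
}

/* Absolute difference in probability between those two outcomes. */
double probability_gap(uint64_t n, uint64_t m)
{
    return (double)(max_preimages(n, m) - min_preimages(n, m)) / (double)n;
}
```

With n = 2^32, probability_gap is exactly 2^-32 whenever m does not divide n and 0 otherwise, which matches the bound stated above. The ratio, in contrast, depends on how small floor(n / m) is, so it grows as m grows.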
2018 Sep 19
0
Bias in R's random integers?
On 19/09/2018 3:52 PM, Philip B. Stark wrote: > Hi Duncan-- > > Nice simulation! > > The absolute difference in probabilities is small, but the maximum > relative difference grows from something negligible to almost 2 as m > approaches 2**31. > > Because the L_1 distance between the uniform distribution on {1, ..., m} > and what you actually get is large, there
2018 Sep 19
0
Bias in R's random integers?
For a well-tested C algorithm, based on my reading of Lemire, the unbiased "algorithm 3" in https://arxiv.org/abs/1805.10941 is already part of the C standard library in OpenBSD and macOS (as arc4random_uniform), and in the GNU standard library. Lemire also provides C++ code in the appendix of his piece for both this and the faster "nearly divisionless" algorithm. It would be
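For readers unfamiliar with the rejection approach, here is a minimal sketch in the style of arc4random_uniform: discard the few raw values that would make the modulo uneven, then reduce. It is not the code from Lemire's paper or from any libc. The xorshift32 generator next_u32 is only a stand-in 32-bit source, and all function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in 32-bit source (xorshift32), used only for illustration.
 * *state must be nonzero. */
uint32_t next_u32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* 2^32 mod m, computed in 32-bit unsigned arithmetic. Requires m >= 1. */
uint32_t rejection_threshold(uint32_t m)
{
    return (uint32_t)(0u - m) % m;
}

/* Integer in [0, m): raw values below the threshold are rejected, which
 * leaves a range whose size is a multiple of m, so r % m is even-handed
 * provided the raw source itself is uniform. */
uint32_t bounded_rejection(uint32_t *state, uint32_t m)
{
    if (m < 2)
        return 0;

    uint32_t threshold = rejection_threshold(m);
    uint32_t r;
    do {
        r = next_u32(state);
    } while (r < threshold);

    return r % m;
}
```

For m = 6 the threshold is 4, so only 4 of the 2^32 raw values are ever rejected. When m is a power of two the threshold is 0 and nothing is rejected.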
2018 Sep 19
0
Bias in R's random integers?
On 19/09/2018 5:57 PM, David Hugh-Jones wrote: > > It doesn't seem too hard to come up with plausible ways in which this > could give bad results. Suppose I sample rows from a large dataset, > maybe for bootstrapping. Suppose the rows are non-randomly ordered, e.g. > odd rows are males, even rows are females. Oops! Very non-representative > sample, bootstrap p values are
2018 Sep 20
5
Bias in R's random integers?
On 9/20/18 1:43 AM, Carl Boettiger wrote: > For a well-tested C algorithm, based on my reading of Lemire, the unbiased > "algorithm 3" in https://arxiv.org/abs/1805.10941 is already part of the C > standard library in OpenBSD and macOS (as arc4random_uniform), and in the > GNU standard library. Lemire also provides C++ code in the appendix of his > piece for both this and
2018 Sep 20
4
Bias in R's random integers?
Hello, On Thursday, September 20, 2018 11:15:04 AM EDT Duncan Murdoch wrote: > On 20/09/2018 6:59 AM, Ralf Stubner wrote: > > On 9/20/18 1:43 AM, Carl Boettiger wrote: > >> For a well-tested C algorithm, based on my reading of Lemire, the > >> unbiased "algorithm 3" in https://arxiv.org/abs/1805.10941 is already > >> part of the C standard library in
2018 Sep 20
0
Bias in R's random integers?
On 20/09/2018 6:59 AM, Ralf Stubner wrote: > On 9/20/18 1:43 AM, Carl Boettiger wrote: >> For a well-tested C algorithm, based on my reading of Lemire, the unbiased >> "algorithm 3" in https://arxiv.org/abs/1805.10941 is already part of the C >> standard library in OpenBSD and macOS (as arc4random_uniform), and in the >> GNU standard library. Lemire also
2018 Sep 19
4
Bias in R's random integers?
On Wed, 19 Sep 2018 at 13:43, Duncan Murdoch <murdoch.duncan at gmail.com> wrote: > > I think the analyses are correct, but I doubt if a change to the default > is likely to be accepted as it would make it more difficult to reproduce > older results. I'm a bit alarmed by the logic here. Unbiased sampling seems basic for a statistical language. As a consumer of R I'd
2018 Sep 21
0
Bias in R's random integers?
Hello, Top posting. Several people have asked about the code to replicate my results. I have cleaned up the code to remove an x/y coordinate bias for displaying the results directly on a 640 x 480 VGA adapter. You can find the code here: http://people.redhat.com/sgrubb/files/vseq.c To collect R samples: X <- runif(10000, min = 0, max = 65535) write.table(X, file =
2018 Sep 21
3
Bias in R's random integers?
Not sure what should happen theoretically for the code in vseq.c, but I see the same pattern with the R generators I tried (default, Super-Duper, and L'Ecuyer) and with bash $RANDOM using N <- 10000 X1 <- replicate(N, as.integer(system("bash -c 'echo $RANDOM'", intern = TRUE))) X2 <- replicate(N, as.integer(system("bash -c 'echo $RANDOM'",
2018 Sep 21
1
Bias in R's random integers?
On 9/20/18 5:15 PM, Duncan Murdoch wrote: > On 20/09/2018 6:59 AM, Ralf Stubner wrote: >> It is difficult to do this in a package, since R does not provide access >> to the random bits generated by the RNG. Only a float in (0,1) is >> available via unif_rand(). > > I believe it is safe to multiply the unif_rand() value by 2^32, and take > the whole number part as an
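The multiply-by-2^32 suggestion could look roughly like the sketch below. The function takes the uniform value as a plain double so that it compiles outside R; in a package the argument would presumably come from unif_rand(). The name unit_to_u32 is hypothetical, and whether the recovered integer equals the generator's raw output depends on how that generator builds its double, which is the assumption being made in the reply above.

```c
#include <stdint.h>

/* Scale a double in [0, 1) by 2^32 and keep the whole-number part.
 * The input must be strictly below 1.0; otherwise the conversion to
 * uint32_t is out of range. */
uint32_t unit_to_u32(double u)
{
    return (uint32_t)(u * 4294967296.0);   /* 4294967296 = 2^32 */
}
```

Any value of the form k / 2^32 round-trips exactly, since both the division and the multiplication by a power of two are exact in double precision. For example, 0.5 maps to 2147483648.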