Jonathan Minton
2012-Dec-06 11:00 UTC
[R] Anomalous outputs from rbeta when using two different random number seeds
Hi,

In the code below, I am drawing 1000 samples from two beta distributions, each time using the same random number seed.

Using set.seed(80) produces results I expect, in that the differences between the distributions are very small.

Using set.seed(20) produces results I can't make sense of. Around half of the time, it behaves as with set.seed(80), but around half of the time, it behaves very differently, with a much wider distribution of differences between the two distributions.

# Beta parameters

# distribution 1
u1.a <- 285.14
u1.b <- 190.09

# distribution 2
u2.a <- 223.79
u2.b <- 189.11

# Good example: output is as expected
set.seed(80); u1.good <- rbeta(1000, u1.a, u1.b)
set.seed(80); u2.good <- rbeta(1000, u2.a, u2.b)

# Bad example: output is different to expected
set.seed(20); u1.bad <- rbeta(1000, u1.a, u1.b)
set.seed(20); u2.bad <- rbeta(1000, u2.a, u2.b)

# plot of distributions using set.seed(80), which behaves as expected
plot(u2.good ~ u1.good, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70))
abline(0,1)

# plot of distributions using set.seed(20), which is different to expected
plot(u2.bad ~ u1.bad, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70))
abline(0,1)

# plot of differences when using set.seed(80)
plot(u1.good - u2.good, ylim=c(-0.2, 0.2))
abline(h=0)

# plot of differences when using set.seed(20)
plot(u1.bad - u2.bad, ylim=c(-0.2, 0.2))
abline(h=0)

Could you explain why using set.seed(20) produces this chaotic pattern of behaviour?

Many thanks,
Jon

--
Dr Jon Minton
Research Associate
Health Economics & Decision Science
School of Health and Related Research
University of Sheffield
Times Higher Education University of the Year
Tel: +44(0)114 222 0836
email: j.minton@sheffield.ac.uk
http://www.shef.ac.uk/scharr/sections/heds
http://scharrheds.blogspot.co.uk/
Prof Brian Ripley
2012-Dec-06 17:48 UTC
[R] Anomalous outputs from rbeta when using two different random number seeds
On 06/12/2012 11:00, Jonathan Minton wrote:
> Hi, in the code below, I am drawing 1000 samples from two beta
> distributions, each time using the same random number seed.
>
> Using set.seed(80) produces results I expect, in that the differences
> between the distributions are very small.
>
> Using set.seed(20) produces results I can't make sense of. Around half of
> the time, it behaves as with set.seed(80), but around half of the time, it
> behaves very differently, with a much wider distribution of differences
> between the two distributions.

The 'anomaly' is in your expectation. There is no reason why random variate streams for similar but different distributions started at the same seed should be similar or dissimilar. They will be deterministically related if inversion is used (for continuous distributions), but not if rejection is used. If you consider the sequential order of your examples you will see that over parts of the period the two generators are in step, and for parts they are not.
R is Open Source so you can read the algorithm and work out why for yourself.

> # Beta parameters
>
> #distribution 1
> u1.a <- 285.14
> u1.b <- 190.09
>
> # distribution 2
> u2.a <- 223.79
> u2.b <- 189.11
>
> #Good example: output is as expected
>
> set.seed(80); u1.good <- rbeta(1000, u1.a, u1.b)
> set.seed(80); u2.good <- rbeta(1000, u2.a, u2.b)
>
>
> #Bad example: output is different to expected
> set.seed(20); u1.bad <- rbeta(1000, u1.a, u1.b)
> set.seed(20); u2.bad <- rbeta(1000, u2.a, u2.b)
>
>
> # plot of distributions using set.seed(80), which behaves as expected
> plot(u2.good ~ u1.good, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70))
> abline(0,1)
>
> # plot of distributions using set.seed(20), which is different to expected
> plot(u2.bad ~ u1.bad, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70))
> abline(0,1)
>
> # plot of differences when using set.seed(80)
> plot(u1.good - u2.good, ylim=c(-0.2, 0.2))
> abline(h=0)
>
> # plot of differences when using set.seed(20)
> plot(u1.bad - u2.bad, ylim=c(-0.2, 0.2))
> abline(h=0)
>
>
> Could you explain why using set.seed(20) produces this chaotic pattern of
> behaviour?
>
>
> Many thanks,
> Jon
>

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
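
The contrast between inversion and rejection described in the reply can be sketched in a few lines of R. This is only an illustration, not the poster's or the replier's code: it assumes that sampling by inversion can be imitated with qbeta(runif(n), a, b), and the names n, p, u1.inv, u2.inv, rank.cor, d, block and block.sd are hypothetical. The block size of 50 is an arbitrary choice. R's help page for rbeta cites Cheng (1978), which is a rejection method, so the number of uniforms consumed per variate can differ between the two parameter sets.

```r
# Beta parameters taken from the original post
u1.a <- 285.14; u1.b <- 190.09
u2.a <- 223.79; u2.b <- 189.11
n <- 1000  # hypothetical name for the sample size

# Inversion (imitated here with qbeta on shared uniforms): each variate is a
# monotone function of a single uniform, so the two streams are
# deterministically related and keep the same rank order.
set.seed(20); p <- runif(n)
u1.inv <- qbeta(p, u1.a, u1.b)
u2.inv <- qbeta(p, u2.a, u2.b)
rank.cor <- cor(u1.inv, u2.inv, method = "spearman")
print(rank.cor)

# rbeta (rejection): the two generators need not consume uniforms at the
# same rate, so they can fall in and out of step along the sequence.
set.seed(20); u1.bad <- rbeta(n, u1.a, u1.b)
set.seed(20); u2.bad <- rbeta(n, u2.a, u2.b)
d <- u1.bad - u2.bad

# Spread of the differences in consecutive blocks of 50 draws; stretches
# where the generators are in step would be expected to show a smaller
# spread than stretches where they are not.
block <- rep(seq_len(n / 50), each = 50)
block.sd <- tapply(d, block, sd)
print(round(block.sd, 4))
```

Plotting d against its index, as in the original post, or inspecting block.sd should make the sequential structure visible: the wide and narrow differences are expected to occur in runs rather than being mixed at random.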