thr3ads.net - R help - [R] random number generator? [Jan 2003]

If this information is useful, please help other people find it:
Share via:

Charles Annis, P.E.

2003-Jan-28 20:15 UTC

[R] random number generator?

Dear R-Aficionados:

I realize that no random number generator is perfect, so what I report
below may be a result of that simple fact.  However, if I have made an
error in my thinking I would greatly appreciate being corrected.

I wish to illustrate the behavior of small samples (n=10) and so
generate 100,000 of them.

n.samples <- 1000000
sample.size = 10
p <- 0.0001
z.normal <- qnorm(p)
# generate n.samples of sample.size each from a normal(mean=0, sd=1)
density
#
small.sample <- matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1),
nrow=n.samples, ncol=sample.size)
# Verify that from the entire small.sample matrix, p sampled values are
below, p above.
#
observed.fraction.below <- sum(small.sample <
z.normal)/length(small.sample)
observed.fraction.above <- sum(small.sample >
-z.normal)/length(small.sample)
> observed.fraction.below 
[1] 6.3e-05> observed.fraction.above 
[1] 0.000142> 
I've checked the behavior of the entire sample's mean and median and
they seem fine.  The total fraction in both tails is 0.0002, as it
should be.  However in every instance about 1/3 are in the lower tail,
2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one million
samples of ten.

Is this simply because random number generators aren't perfect?  Or have
I stepped in something?

Thank you for your kind counsel.


Charles Annis, P.E.

Charles.Annis at StatisticalEngineering.com
phone: 561-352-9699
eFAX: 503-217-5849
http://www.StatisticalEngineering.com

Peter Dalgaard BSA

2003-Jan-28 20:36 UTC

head link

[R] random number generator?

"Charles Annis, P.E." <AnnisC at asme.org> writes:
> Dear R-Aficionados:
> 
> I realize that no random number generator is perfect, so what I report
> below may be a result of that simple fact.  However, if I have made an
> error in my thinking I would greatly appreciate being corrected.
> 
> I wish to illustrate the behavior of small samples (n=10) and so
> generate 100,000 of them.
> 
> n.samples <- 1000000
> sample.size = 10
> p <- 0.0001
> z.normal <- qnorm(p)
> # generate n.samples of sample.size each from a normal(mean=0, sd=1)
> density
> #
> small.sample <- matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1),
> nrow=n.samples, ncol=sample.size)
> # Verify that from the entire small.sample matrix, p sampled values are
> below, p above.
> #
> observed.fraction.below <- sum(small.sample <
> z.normal)/length(small.sample)
> observed.fraction.above <- sum(small.sample >
> -z.normal)/length(small.sample)
> 
> > observed.fraction.below 
> [1] 6.3e-05
> > observed.fraction.above 
> [1] 0.000142
> > 
> 
> I've checked the behavior of the entire sample's mean and median
and
> they seem fine.  The total fraction in both tails is 0.0002, as it
> should be.  However in every instance about 1/3 are in the lower tail,
> 2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one million
> samples of ten.
> 
> Is this simply because random number generators aren't perfect?  Or
have
> I stepped in something?
> 
> Thank you for your kind counsel.
You stepped in something, I think, but I probably shouldn't elaborate
on the metaphor ... There's an unfortunate interaction between the two
methods that are used for generating uniform and normal variables (the
latter uses the former). This has been reported a couple of times
before and typically gives anomalous tail behaviour. Changing one of
the generators (see help(RNGkind)) usually helps.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907

Liaw, Andy

2003-Jan-28 22:04 UTC

head link

[R] random number generator?

Might I suggest taking a poll (even though unscientific) of how many people
will be affected by a change in default RNG?  My totally arbitrary guess is
very few, if any.

If I'm not mistaken, Python had only recently changed the default RNG to
Mersenne-Twister.  If Python can do it, I should think R can, too, without
too much pain...

Just my $0.02...

Andy
> -----Original Message-----
> From: ripley at stats.ox.ac.uk [mailto:ripley at stats.ox.ac.uk]
> Sent: Tuesday, January 28, 2003 3:53 PM
> To: Charles Annis, P.E.
> Cc: r-help at stat.math.ethz.ch
> Subject: RE: [R] random number generator?
> 
> 
> Can I suggest
> 
> RNGkind("Mersenne-Twister", "Inversion")
> 
> and especially the use of Inversion where tail behaviour of 
> the normal is
> important.
> 
> Were it not for concerns about reproducibility we would have 
> switched to 
> Inversion a while back.
> 
> On Tue, 28 Jan 2003, Charles Annis, P.E. wrote:
> 
> > 
> > Earlier today I reported finding an unbalanced number of 
> observations in
> > the p=0.0001 tails of rnorm.
> > 
> > Many thanks to Peter Dalgaard who suggested changing the normal.kind
> > generator.  
> > 
> > Using  RNGkind(kind = NULL, normal.kind ="Box-Muller")
> > seems to have provided the remedy.  For example:
> > 
> > > observed.fraction.below 
> > [1] 0.000103
> > > observed.fraction.above 
> > [1] 0.000101
> > >  
> > 
> > Thank you, Peter!
> > 
> > 
> > Charles Annis, P.E.
> > 
> > Charles.Annis at StatisticalEngineering.com
> > phone: 561-352-9699
> > eFAX: 503-217-5849
> > http://www.StatisticalEngineering.com
> > 
> > 
> > -----Original Message-----
> > From: r-help-admin at stat.math.ethz.ch
> > [mailto:r-help-admin at stat.math.ethz.ch] On Behalf Of Peter 
> Dalgaard BSA
> > Sent: Tuesday, January 28, 2003 2:36 PM
> > To: Charles Annis, P.E.
> > Cc: r-help at stat.math.ethz.ch
> > Subject: Re: [R] random number generator?
> > 
> > "Charles Annis, P.E." <AnnisC at asme.org> writes:
> > 
> > > Dear R-Aficionados:
> > > 
> > > I realize that no random number generator is perfect, so 
> what I report
> > > below may be a result of that simple fact.  However, if I 
> have made an
> > > error in my thinking I would greatly appreciate being corrected.
> > > 
> > > I wish to illustrate the behavior of small samples (n=10) and so
> > > generate 100,000 of them.
> > > 
> > > n.samples <- 1000000
> > > sample.size = 10
> > > p <- 0.0001
> > > z.normal <- qnorm(p)
> > > # generate n.samples of sample.size each from a 
> normal(mean=0, sd=1)
> > > density
> > > #
> > > small.sample <- matrix(rnorm(n=sample.size*n.samples, 
> mean=0, sd=1),
> > > nrow=n.samples, ncol=sample.size)
> > > # Verify that from the entire small.sample matrix, p 
> sampled values
> > are
> > > below, p above.
> > > #
> > > observed.fraction.below <- sum(small.sample <
> > > z.normal)/length(small.sample)
> > > observed.fraction.above <- sum(small.sample >
> > > -z.normal)/length(small.sample)
> > > 
> > > > observed.fraction.below 
> > > [1] 6.3e-05
> > > > observed.fraction.above 
> > > [1] 0.000142
> > > > 
> > > 
> > > I've checked the behavior of the entire sample's mean and
> median and
> > > they seem fine.  The total fraction in both tails is 0.0002, as
it
> > > should be.  However in every instance about 1/3 are in 
> the lower tail,
> > > 2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one
> > million
> > > samples of ten.
> > > 
> > > Is this simply because random number generators aren't 
> perfect?  Or
> > have
> > > I stepped in something?
> > > 
> > > Thank you for your kind counsel.
> > 
> > You stepped in something, I think, but I probably shouldn't 
> elaborate
> > on the metaphor ... There's an unfortunate interaction 
> between the two
> > methods that are used for generating uniform and normal 
> variables (the
> > latter uses the former). This has been reported a couple of times
> > before and typically gives anomalous tail behaviour. Changing one of
> > the generators (see help(RNGkind)) usually helps.
> > 
> > 
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help
> 

------------------------------------------------------------------------------

Liaw, Andy

2003-Jan-29 15:25 UTC

head link

[R] random number generator?

> From: rossini at blindglobe.net [mailto:rossini at blindglobe.net]
> 
> >>>>> "AL" == Andy Liaw <Liaw> writes:
> 
>     AL> Might I suggest taking a poll (even though 
> unscientific) of how many people
>     AL> will be affected by a change in default RNG?  My 
> totally arbitrary guess is
>     AL> very few, if any.
> 
> But they may tend to be important (i.e. those with large stat software
> projects/tools) doing heavy regression testing.  
> 
But if those who want to use better RNG in R can now put RNGkind() in their
.Rprofile,  surely those who really need the old generator can do similar,
if the default is changed?  As Prof. Dalgaard said (privately), I don't
think is that painful if the current default generator is kept as an option
in RNGkind().  (I.e., all I'm asking is to change the default RNG, not
getting rid of the current default.)

Cheers,
Andy
>     AL> If I'm not mistaken, Python had only recently changed 
> the default RNG to
>     AL> Mersenne-Twister.  If Python can do it, I should 
> think R can, too, without
>     AL> too much pain...
> 
> Depends on which L_p norm you measure pain by.
> 
> best,
> -tony
> 
> -- 
> A.J. Rossini				Rsrch. Asst. Prof. of 
> Biostatistics
> U. of Washington Biostatistics		
> rossini at u.washington.edu	
> FHCRC/SCHARP/HIV Vaccine Trials Net	rossini at scharp.org
> -------------- http://software.biostat.washington.edu/ 
> ----------------
> FHCRC: M: 206-667-7025 (fax=4812)|Voicemail is pretty 
> sketchy/use Email
> UW:   Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
> (my tuesday/wednesday/friday locations are completely unpredictable.)
> 

------------------------------------------------------------------------------

Bruce D McCullough

2003-Jan-29 17:19 UTC

head link

[R] random number generator?

With respect to the method used to produce
normals, Professor Ripley noted:

>Were it not for concerns about reproducibility we would have switched to 
>Inversion a while back.
Presumably this refers to reproducibility across platforms.  

I searched the archive and could find nothing on this point.
Gentle's text on RNGs provided no answer, either.

Might Professor Ripley or someone else briefly explain why the
inversion method is not reproducible but Box-Muller is, or 
perhaps offer a citation?

Thanks,

Bruce

ripley@stats.ox.ac.uk

2003-Jan-29 17:57 UTC

head link

[R] random number generator?

Reproducibility of existing R scripts was meant.  Changing the default
behaviour of R is going to surprise a large number of users, who may not
even know which was the default generator when they did their runs.

On Wed, 29 Jan 2003, Bruce D McCullough wrote:
> With respect to the method used to produce
> normals, Professor Ripley noted:
> 
> 
> >Were it not for concerns about reproducibility we would have switched
to
> >Inversion a while back.
> 
> Presumably this refers to reproducibility across platforms.  
Presumption seems rather out of place.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Seemingly Similar Threads

Search for more seemingly similar threads

R help - Jan 2003 - random number generator?

[R] random number generator?

[R] random number generator?

[R] random number generator?

[R] random number generator?

[R] random number generator?

[R] random number generator?

Seemingly Similar Threads