thr3ads.net - R help - [R] testing randomness of random number generators with student t-test? [Feb 2011]

Carl Witthoft

2011-Feb-02 23:01 UTC

[R] testing randomness of random number generators with student t-test?

Hi, subject more or less says it all.

I freely admit to not having bothered to find some of the online papers 
about method of testing the quality of random number generators -- but 
in an idle moment I wondered what to expect from something like the 
following:


randa<-runif(1000)
randb<-runif(1000)
t.test(randa,randb)$p.value
var.test(randa,randb)$p.value

[repeat ad nauseam]


Is the range of p-values I get in any way related to the "quality" of
the random number generator?

thanks
Carl

Barry Rowlingson

2011-Feb-02 23:45 UTC

On Wed, Feb 2, 2011 at 11:01 PM, Carl Witthoft <carl at witthoft.com> wrote:
> Hi, subject more or less says it all.
>
> I freely admit to not having bothered to find some of the online papers
> about method of testing the quality of random number generators -- but in an
> idle moment I wondered what to expect from something like the following:
>
>
> randa<-runif(1000)
> randb<-runif(1000)
> t.test(randa,randb)$p.value
> var.test(randa,randb)$p.value
>
> [repeat ad nauseam]
>
>
> Is the range of p-values I get in any way related to the "quality" of the
> random number generator?

 Well yes. All pseudo-random number generators have a period, after
which they come back to the start and begin churning out the same
sequence again. Good PRNGs have a period that is astronomically
long. If you have a PRNG with a period of 1000, or 500, or 200, etc.,
your two sets will be identical, and hence perfectly correlated...
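
A minimal sketch of that failure mode, assuming a toy linear congruential
generator with a period of only 500. The helper make_lcg and its constants
(a = 21, c = 7, m = 500) are hypothetical, picked for illustration and not
taken from any real library. Because 1000 draws cover exactly two full
periods, the second sample repeats the first:

```r
# Toy linear congruential generator with a deliberately tiny period.
# ASSUMPTION: a = 21, c = 7, m = 500 are illustrative constants chosen to
# satisfy the Hull-Dobell full-period conditions, so the period is m = 500.
make_lcg <- function(seed = 1, a = 21, c = 7, m = 500) {
  state <- seed
  function(n) {
    out <- numeric(n)
    for (i in seq_len(n)) {
      state <<- (a * state + c) %% m   # advance the generator
      out[i] <- state / m              # scale to [0, 1)
    }
    out
  }
}

bad_runif <- make_lcg()
randa <- bad_runif(1000)   # two full periods
randb <- bad_runif(1000)   # the same two periods again

identical(randa, randb)        # the samples coincide exactly
length(unique(randa))          # only 500 distinct values
t.test(randa, randb)$p.value   # mean difference is 0, so the p-value is 1
```

With identical samples the t statistic is zero, so the p-value sits at 1 on
every repetition instead of wandering over (0, 1).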

 You might want to read up on RANDU, the infamous poor PRNG:

 http://en.wikipedia.org/wiki/RANDU

"We guarantee that each number is random individually, but we don't
guarantee that more than one of them is random."
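
RANDU's weakness can be checked in a few lines. This is only a sketch,
assuming the usual definition x[k+1] = 65539 * x[k] mod 2^31 with an odd
seed; the function name randu is made up here. Since 65539 = 2^16 + 3, any
three consecutive outputs satisfy x[k+2] = 6*x[k+1] - 9*x[k] (mod 2^31):

```r
# RANDU as commonly defined: multiplier 65539, modulus 2^31, odd seed.
# ASSUMPTION: randu() is an illustrative helper, not part of base R.
# All intermediate products stay below 2^53, so double arithmetic is exact.
randu <- function(n, seed = 1) {
  x <- numeric(n)
  state <- seed
  for (i in seq_len(n)) {
    state <- (65539 * state) %% 2^31
    x[i] <- state
  }
  x
}

x <- randu(3000)
k <- 1:(length(x) - 2)

# Residual of the three-term linear relation, reduced mod 2^31
resid <- (x[k + 2] - 6 * x[k + 1] + 9 * x[k]) %% 2^31
all(resid == 0)   # TRUE: every triple lies on the same lattice relation

# The same stream fed to the two-sample tests from the original question
u <- x / 2^31
t.test(u[1:1000], u[1001:2000])$p.value
var.test(u[1:1000], u[1001:2000])$p.value
```

The t-test and F-test compare only means and variances of the two samples,
so they have no way of seeing this dependence between successive values.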

The other things to look at are the DieHard tests:
http://en.wikipedia.org/wiki/Diehard_test

Barry

Phil Spector

2011-Feb-03 00:18 UTC

Carl -
    Under the null hypothesis, the distribution of p-values for
any statistical test should be uniform over the range from 0 to
1.   So while the individual p-values you see in an experiment
like the one you carried out aren't really meaningful, their
ensemble behaviour is.  So if you did something like
> pvals = replicate(10000,
+     {randa<-runif(1000);randb<-runif(1000);t.test(randa,randb)$p.value})
> ks.test(pvals,'punif')
you'd expect the ks.test to support the hypothesis that the pvals 
follow a U(0,1) distribution.  As others have pointed out, there are
many other issues regarding random number generation, but I think what
I've described addresses the issue of the t.test probabilities.
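
A slightly expanded sketch of the same experiment, covering both tests from
the original question. The seed, the replication count n_rep and the helper
one_trial are arbitrary choices made here for illustration. One caveat worth
stating as an assumption: the F test behind var.test() presumes normal data,
so on uniform samples its p-values need not be uniform, and a departure
there would say more about the test's assumptions than about the generator.

```r
set.seed(42)    # ASSUMPTION: arbitrary seed, only for reproducibility
n_rep <- 2000   # fewer replications than 10000, to keep the run short

# One repetition of the original experiment: returns both p-values
one_trial <- function(n = 1000) {
  randa <- runif(n)
  randb <- runif(n)
  c(t   = t.test(randa, randb)$p.value,
    var = var.test(randa, randb)$p.value)
}

pvals <- replicate(n_rep, one_trial())   # 2 x n_rep matrix

# Compare each ensemble of p-values with U(0, 1)
ks.test(pvals["t", ], "punif")
ks.test(pvals["var", ], "punif")

# Fraction of repetitions rejecting at the 5% level
rej_t   <- mean(pvals["t", ] < 0.05)
rej_var <- mean(pvals["var", ] < 0.05)
c(t = rej_t, var = rej_var)
```

For the t-test the rejection fraction should land near 0.05. For var.test()
it will typically be far smaller, because the uniform distribution has
lighter tails than the normal and the F test is conservative in that case.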

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu

Dirk Eddelbuettel

2011-Feb-03 01:08 UTC

On 2 February 2011 at 23:45, Barry Rowlingson wrote:
| The other things to look at are the DieHard tests:
| http://en.wikipedia.org/wiki/Diehard_test

And/or the DieHarder test suite by Robert G Brown et al -- and with that the
RDieHarder package on CRAN, which wraps it.  (And I need to catch up with the
fresh development in DieHarder.)

Dirk 

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

Petr Savicky

2011-Feb-03 12:29 UTC

On Wed, Feb 02, 2011 at 06:01:36PM -0500, Carl Witthoft wrote:
> Hi, subject more or less says it all.
>
> I freely admit to not having bothered to find some of the online papers
> about method of testing the quality of random number generators -- but
> in an idle moment I wondered what to expect from something like the
> following:
>
>
> randa<-runif(1000)
> randb<-runif(1000)
> t.test(randa,randb)$p.value
> var.test(randa,randb)$p.value
>
> [repeat ad nauseam]
>
>
> Is the range of p-values I get in any way related to the "quality" of the
> random number generator?

Hi.

As already explained, the result of t.test() in this case is consistent
with the good quality of the Mersenne Twister generator used by default in R.

The situation is slightly more complicated with ks.test() because of
the 32-bit precision of the random numbers, as discussed in the Note
section of ?RNGkind. For example,

  n <- 100000
  ks.test(runif(n), runif(n))

typically produces a warning due to ties. This is not related to the
quality of the randomness. The reason is that the random numbers
have 32 bits, and by the birthday paradox we already get at least one
collision among 2^16 numbers with probability about 0.39. The null
hypothesis should be changed to assume a uniform distribution on the
numbers in (0, 1) with at most 32 bits.
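
The 0.39 figure can be reproduced directly. The sketch below assumes N = 2^32
equally likely values, as described in the Note of ?RNGkind, and uses
variable names chosen here for illustration. It compares the exact product
formula with the usual exponential approximation, then counts repeated
values in a pooled sample of the size used above:

```r
n <- 2^16   # number of draws
N <- 2^32   # ASSUMPTION: number of distinct values runif() can return

# Exact: 1 - prod_{i=0}^{n-1} (1 - i/N), summed on the log scale for stability
p_exact <- 1 - exp(sum(log1p(-(0:(n - 1)) / N)))

# Standard birthday approximation: 1 - exp(-n(n-1) / (2N))
p_approx <- 1 - exp(-n * (n - 1) / (2 * N))

round(c(exact = p_exact, approx = p_approx), 4)

# Empirical check: repeated values in the pooled sample from the ks.test()
# example. The count depends on the generator and seed, so none is quoted.
m <- 100000
pooled <- c(runif(m), runif(m))
sum(duplicated(pooled))
```

For the pooled sample of 200000 values the expected number of coinciding
pairs is roughly 200000^2 / (2 * 2^32), or about 4.7, which is why ties are
the norm rather than the exception at that sample size.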

See the section "Random Number Generators" of the CRAN Task View
"Probability Distributions" by Christophe Dutang for information on CRAN
packages related to random numbers.

As far as I know, the only tests that can distinguish Mersenne Twister
numbers from truly random ones are linear complexity tests mod 2. This
is discussed, for example, in Section 7, "Conclusion, Future Work, and
Open Issues", of
  http://www.iro.umontreal.ca/~lecuyer/myftp/papers/horms.pdf
by P. L'Ecuyer.

Applications that do not use bitwise mod 2 (XOR) operations are
very unlikely to interfere with the linear tests mod 2. On the other hand,
if bitwise XOR is used, then Mersenne Twister numbers can be predicted,
because the generator is defined using XOR operations on the history of
the last 624 numbers. A simple demonstration of this known predictability
is contained in
  http://www.cs.cas.cz/~savicky/predict_MT/predict_MT.R

At first glance, this may look very bad. On the other hand, if there
is a relatively simple smooth function of 625 real variables that has
a measurable difference in expected value between Mersenne Twister numbers
and truly random ones, then this is likely to be an interesting mathematical
discovery.

Petr Savicky.
