Dear list,

I did simulations in which I generated 10000 independent Bernoulli(0.5) sequences of length 100. I estimated p for each sequence, and I also estimated the conditional probability that a one is followed by another one (which should be p as well). However, the second probability is significantly smaller than 0.5 (namely about 0.494, see below), and of course smaller than the direct estimates of p as well, indicating negative correlation between the random numbers.

See below for the code and the results. Did I do something wrong, or are the numbers in fact negatively correlated? (A type I error is quite unlikely with a p-value below 2.2e-16.)

Best,
Christian

set.seed(123456)
n <- 100
p <- 0.5
simruns <- 10000
est <- est11 <- numeric(0)
for (i in 1:simruns){
  # if (i/100==round(i/100)) print(i)
  x <- rbinom(n,1,p)
  est[i] <- mean(x)
  x11 <- 3*x[2:n]-x[1:(n-1)]
  # x11==(-1): 0 follows 1; x11==2: 1 follows 1
  est11[i] <- sum(x11==2)/sum(x11==2 | x11==(-1))
}

> print(mean(est))
[1] 0.499554
> print(sd(est)/sqrt(simruns))
[1] 0.0004958232        # OK
> print(mean(est11))
[1] 0.4935211
> print(sd(est11)/sqrt(simruns))
[1] 0.0007136213        # mean(est11)+2*sd(mean) < 0.495
> print(sum(est>est11))
[1] 5575
> binom.test(5575,10000)

        Exact binomial test

data:  5575 and 10000
number of successes = 5575, number of trials = 10000, p-value < 2.2e-16

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
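A quick way to see that the deficit comes from the per-sequence ratio rather than from the random number generator is to pool the transition counts across all sequences before dividing. This is a minimal sketch, not code from the thread, assuming the same simulation settings as above:

```r
# Pooled estimator: count all 1->1 transitions and all transitions that
# start with a 1, across every simulated sequence, then take one ratio.
set.seed(123456)
n <- 100; p <- 0.5; simruns <- 10000
ones <- 0; after1 <- 0
for (i in 1:simruns) {
  x <- rbinom(n, 1, p)
  ones   <- ones   + sum(x[1:(n-1)] == 1 & x[2:n] == 1)  # 1 followed by 1
  after1 <- after1 + sum(x[1:(n-1)] == 1)                # any pair starting with 1
}
print(ones / after1)  # approximately 0.5
```

Pooling removes the problem because the ratio of the two pooled (unbiased) counts is taken only once over roughly half a million transitions, instead of averaging 10000 small-sample ratios.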
On Tue, 27 Jun 2006, Christian Hennig wrote:

> Dear list,
>
> I did simulations in which I generated 10000 independent Bernoulli(0.5)
> sequences of length 100. I estimated p for each sequence and I also
> estimated the conditional probability that a one is followed by another
> one (which should be p as well). However, the second probability is
> significantly smaller than 0.5 (namely about 0.494, see below) and of
> course smaller than the direct estimates of p as well, indicating
> negative correlation between the random numbers.
>
> See below the code and the results. Did I do something wrong or are the
> numbers in fact negatively correlated? (A type I error is quite unlikely
> with a p-value below 2.2e-16.)

I think you did something wrong, and that there is a problem with overlapping blocks of two. If you do

x <- matrix(rbinom(1e6,1,p), ncol=2)
tt <- table(x[,1], x[,2])

you get much better-looking results. In this case you have 500,000 independent pairs of numbers that can be 01, 10, 11, 00. A test for independence seems fine.

> tt
         0      1
  0 125246 124814
  1 125140 124800
> fisher.test(tt)

        Fisher's Exact Test for Count Data

data:  tt
p-value = 0.8987
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.9896745 1.0119211
sample estimates:
odds ratio
  1.000735

In your case the deficit in est11 is suspiciously close to 0.5/n. Changing n to 1000 and using the same seed I get

> mean(est11)-0.5
[1] -0.0005534743

which is 10 times smaller, and still close to 0.5/n.

Now, consider what happens in a case where we can see all the possibilities, n=3:

   x   1/0  1/1
  000   -    -
  001   -    -
  010   1    0
  011   0    1
  100   1    0
  101   1    0
  110   1    1
  111   2    0

So if each of these eight triplets has the same probability, your est11 would be 2/8 rather than 4/8, and est11 is not an unbiased estimate of the long-run conditional probability. The bias is of order 1/n, so you need n to be of larger order than sqrt(simruns).

        -thomas
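The claim that the bias is of order 1/n can be checked directly by rerunning the original estimator for several sequence lengths. A small sketch, not from the thread (the helper name deficit is illustrative; it uses the same est11 construction as the original post):

```r
# deficit(n): how far mean(est11) falls below 0.5 for sequences of length n.
# The shortfall should shrink roughly in proportion to 1/n.
deficit <- function(n, simruns = 20000, p = 0.5) {
  est11 <- numeric(simruns)
  for (i in 1:simruns) {
    x <- rbinom(n, 1, p)
    x11 <- 3 * x[2:n] - x[1:(n-1)]
    # x11==-1: 0 follows 1; x11==2: 1 follows 1 (as in the original post)
    est11[i] <- sum(x11 == 2) / sum(x11 == 2 | x11 == -1)
  }
  # na.rm guards against the (vanishingly rare) all-zero sequence, where
  # the ratio is 0/0.
  0.5 - mean(est11, na.rm = TRUE)
}

set.seed(1)
d <- round(c(n50 = deficit(50), n100 = deficit(100), n200 = deficit(200)), 4)
print(d)  # each deficit is roughly half the previous one
```

Doubling n should roughly halve the deficit, which is why increasing n to 1000 above shrank it by a factor of about 10.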
On Tue, 27 Jun 2006, Thomas Lumley wrote:

I got the table wrong; it should read

   x   1/0  1/1  est11
  000   -    -     -
  001   -    -     -
  010   1    0     0
  011   0    1     1
  100   1    0     0
  101   1    0     0
  110   1    1     0.5
  111   0    2     1

So the explanation is slightly more complicated. The problem is that although sum(x11==2)/n and sum(x11==2 | x11==(-1))/n are unbiased estimators, their ratio is not: for n=3 it has mean (0+1+0+0+0.5+1)/6 = 5/12, and for larger n the mean falls short of 0.5 by roughly 0.5/n, matching the deficits above.

        -thomas
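The 5/12 figure can be confirmed by brute-force enumeration of all 2^n equally likely sequences. A small sketch, not from the thread (the helper name exact_mean_est11 is illustrative):

```r
# Exact mean of est11 over all 2^n equally likely 0/1 sequences,
# restricted to sequences where the estimator is defined (at least one 1
# among the first n-1 values).
exact_mean_est11 <- function(n) {
  seqs <- as.matrix(expand.grid(rep(list(0:1), n)))  # one row per sequence
  vals <- apply(seqs, 1, function(x) {
    after1 <- sum(x[1:(n-1)] == 1)          # transitions starting with a 1
    if (after1 == 0) return(NA)             # est11 undefined for this sequence
    sum(x[1:(n-1)] == 1 & x[2:n] == 1) / after1
  })
  mean(vals, na.rm = TRUE)
}

print(exact_mean_est11(3))  # 5/12 = 0.41666..., as in the corrected table
```

This reproduces exactly the six defined est11 values in the table (0, 1, 0, 0, 0.5, 1), whose average is 2.5/6 = 5/12.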