Hi all,

I would like to generate independent vectors; my code is included below. It displays the correlation matrix of the n*p matrix (n = the number of samples, p = the number of variables).

What I find is that as n increases, the correlation matrix tends to an identity matrix, i.e. independence. For small samples (see the examples below for n=20, p=5; n=50, p=5; n=100, p=5; n=10000, p=5) the variables do not appear to be independent. Note that I have not tested this statement statistically.

Are these variables independent? By setting the seed for each variable run I hope that the variables are now independent. Is this true? If not, does anyone know how to generate such independent variables?

How does R generate its random variables? Does it use the Box-Muller technique? If so, how does it generate the random uniform variables?

Thanking you in advance!
***
Allan

IND <- function(rows.=10, col.=3)
{
  a <- matrix(nrow=rows., ncol=col.)
  for (j in 1:col.)
  {
    set.seed(j)
    r <- rnorm(rows.)
    a[,j] <- r
  }
  #list(a=a,cor=cor(a))
  cor(a)
}
IND(rows.=10,col.=3)

> IND(rows.=20,col.=5)
            [,1]        [,2]        [,3]        [,4]       [,5]
[1,]  1.00000000  0.25677165 -0.14882130 -0.09190797  0.1562481
[2,]  0.25677165  1.00000000  0.02585515  0.13735712 -0.1443301
[3,] -0.14882130  0.02585515  1.00000000 -0.11311416  0.1437001
[4,] -0.09190797  0.13735712 -0.11311416  1.00000000  0.1833647
[5,]  0.15624807 -0.14433011  0.14370006  0.18336467  1.0000000

> IND(rows.=50,col.=5)
              [,1]        [,2]          [,3]        [,4]        [,5]
[1,]  1.0000000000  0.07915025  0.0009239851 -0.14102117 -0.07335342
[2,]  0.0791502463  1.00000000 -0.1764530631  0.10021081  0.19742285
[3,]  0.0009239851 -0.17645306  1.0000000000  0.02968062  0.14543350
[4,] -0.1410211698  0.10021081  0.0296806188  1.00000000  0.07234953
[5,] -0.0733534183  0.19742285  0.1454335014  0.07234953  1.00000000

> IND(rows.=100,col.=5)
            [,1]       [,2]         [,3]         [,4]       [,5]
[1,]  1.00000000 -0.1537208 -0.023741715 -0.135245915 0.01961224
[2,] -0.15372076  1.0000000 -0.141796984  0.157219334 0.15518443
[3,] -0.02374171 -0.1417970  1.000000000  0.005865698 0.19118563
[4,] -0.13524592  0.1572193  0.005865698  1.000000000 0.07345299
[5,]  0.01961224  0.1551844  0.191185627  0.073452993 1.00000000

> IND(rows.=10000,col.=5)
             [,1]         [,2]         [,3]         [,4]         [,5]
[1,]  1.000000000  0.015928444 -0.008288940 -0.005646904  0.006936662
[2,]  0.015928444  1.000000000 -0.005444611  0.005242395 -0.008246009
[3,] -0.008288940 -0.005444611  1.000000000  0.007277489  0.012299247
[4,] -0.005646904  0.005242395  0.007277489  1.000000000  0.001918704
[5,]  0.006936662 -0.008246009  0.012299247  0.001918704  1.000000000
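
A note on the small-sample correlations above: independent normal columns still give nonzero sample correlations, and for two independent normal samples the standard deviation of the sample correlation is roughly 1/sqrt(n). The sketch below is only an illustration under that assumption; the helper name max_offdiag_cor is hypothetical and not part of the original post. It uses a single set.seed() call for the whole run, since successive draws from one stream are designed to behave as independent, so reseeding per column should not be needed.

```r
# Sketch: off-diagonal sample correlations of independent normal columns
# shrink at roughly the 1/sqrt(n) rate as n grows.
set.seed(1)  # one seed for the whole run

# Hypothetical helper: largest absolute off-diagonal sample correlation
max_offdiag_cor <- function(n, p = 5) {
  a <- matrix(rnorm(n * p), nrow = n, ncol = p)  # p independent columns
  r <- cor(a)
  max(abs(r[upper.tri(r)]))
}

sizes <- c(20, 50, 100, 10000)
observed <- sapply(sizes, max_offdiag_cor)
approx_sd <- 1 / sqrt(sizes)  # rough SD of a single sample correlation

print(data.frame(n = sizes, max_abs_cor = observed, approx_sd = approx_sd))
```

If the largest observed correlation stays within a few multiples of approx_sd at each n, the pattern is consistent with independent columns plus ordinary sampling noise. A formal check on any one pair could use cor.test().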
You might try mvrnorm() in MASS.

  library(MASS)
  mvrnorm(n=10, mu=rep(0, 3), Sigma=diag(3), empirical=TRUE)

Clark Allan wrote:
> i would like to generate independent vectors. [...]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894
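
To make the effect of empirical=TRUE concrete, here is a small sketch comparing it with the default. With empirical=TRUE, mvrnorm() treats mu and Sigma as the sample mean and covariance, so the sample correlation matrix comes out as the identity by construction. With the default, they are population values and the sample correlations vary as usual. The variable names x_emp and x_pop and the seed are assumptions for illustration only.

```r
library(MASS)  # provides mvrnorm()

set.seed(42)  # arbitrary seed, only for reproducibility

# Sample mean and covariance are forced to match mu and Sigma
x_emp <- mvrnorm(n = 10, mu = rep(0, 3), Sigma = diag(3), empirical = TRUE)

# Default (empirical = FALSE): mu and Sigma describe the population only
x_pop <- mvrnorm(n = 10, mu = rep(0, 3), Sigma = diag(3))

print(round(cor(x_emp), 10))  # identity, up to floating-point error
print(round(cor(x_pop), 3))   # off-diagonals are generally nonzero
print(colMeans(x_emp))        # sample means are forced to mu as well
```

Note that zero sample correlation is a constraint imposed on the generated data, which is not the same thing as the columns being independent draws. Which behaviour is wanted depends on whether the goal is exactly uncorrelated data or independent random samples.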
Thanks. This function works and does exactly what I want.

Chuck Cleland wrote:
> You might try mvrnorm() in MASS.
>
> library(MASS)
> mvrnorm(n=10, mu=rep(0, 3), Sigma=diag(3), empirical=TRUE)
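
The question about how R generates its random numbers went unanswered in the thread. In a default session of a recent R version, RNGkind() reports "Mersenne-Twister" for the uniform generator and "Inversion" (not Box-Muller) for normals; Box-Muller is available as an option. The sketch below shows how one might inspect and temporarily switch the generators. The variable names kinds, old and z are illustrative assumptions.

```r
# Report the generators currently in use (uniform kind, normal kind, ...)
kinds <- RNGkind()
print(kinds)

# Temporarily select Box-Muller for normal deviates; the previous
# settings are returned invisibly so they can be restored afterwards.
old <- RNGkind(normal.kind = "Box-Muller")
set.seed(1)  # seed after switching the normal generator
z <- rnorm(5)
print(z)

# Restore the previous uniform and normal generators
RNGkind(kind = old[1], normal.kind = old[2])
```

See ?RNGkind for the full list of available generators and the details of each.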