Ned Dochtermann
2011-Jun-02 23:42 UTC
[R] generating random covariance matrices (with a uniform distribution of correlations)
List members, Via searches I've seen similar discussion of this topic but have not seen resolution of the particular issue I am experiencing. If my search on this topic failed, I apologize for the redundancy. I am attempting to generate random covariance matrices but would like the corresponding correlations to be uniformly distributed between -1 and 1. The approach I have been using is: > st.dev<-.5 #this and k are aspects I would like to vary > k<-6 #number of... things > y1<-matrix(rnorm(k^2,mean=0,sd=st.dev),k) > y1<-t(y1)%*%y1 This produces positive definite symmetrical matrices, which is what I want. However, the following: > k<-6; kk<-(k*(k-1))/2; st.dev<-.5 > x<-matrix(0,5000,kk) > for(i in 1:5000){ > y1<-matrix(rnorm(k^2,mean=0,sd=st.dev),k) > y1<-t(y1)%*%y1 > y1<-y1/k #this keeps the variances similar across different k's > y1<-cov2cor(y1) > x[i,]<- y1[lower.tri(y1)] > } > hist(c(x)) demonstrates that the distribution of corresponding correlations is not uniformly distributed between -1 and 1 (of course there is no reason to suspect that it would be). I have tried using both rcorrmatrix and genPositiveDefMat in the "clusterGeneration" package but even when picking the "unifcorrmat" option and setting "alphad" to 1, the distribution of correlations is not uniform. Any recommendations on how to generate the desired covariance matrices would be appreciated. Ned -- Ned A. Dochtermann Department of Biology University of Nevada, Reno ned.dochtermann at gmail.com http://wolfweb.unr.edu/homepage/mpeacock/Ned.Dochtermann/ http://www.researcherid.com/rid/A-7146-2010 --
Petr Savicky
2011-Jun-03 07:01 UTC
[R] generating random covariance matrices (with a uniform distribution of correlations)
On Thu, Jun 02, 2011 at 04:42:59PM -0700, Ned Dochtermann wrote:> List members, > > Via searches I've seen similar discussion of this topic but have not seen > resolution of the particular issue I am experiencing. If my search on this > topic failed, I apologize for the redundancy. I am attempting to generate > random covariance matrices but would like the corresponding correlations to > be uniformly distributed between -1 and 1. > > The approach I have been using is: > > > st.dev<-.5 #this and k are aspects I would like to vary > > k<-6 #number of... things > > y1<-matrix(rnorm(k^2,mean=0,sd=st.dev),k) > > y1<-t(y1)%*%y1 > > This produces positive definite symmetrical matrices, which is what I want. > However, the following: > > > k<-6; kk<-(k*(k-1))/2; st.dev<-.5 > > x<-matrix(0,5000,kk) > > for(i in 1:5000){ > > y1<-matrix(rnorm(k^2,mean=0,sd=st.dev),k) > > y1<-t(y1)%*%y1 > > y1<-y1/k #this keeps the variances similar across different k's > > y1<-cov2cor(y1) > > x[i,]<- y1[lower.tri(y1)] > > } > > hist(c(x)) > > demonstrates that the distribution of corresponding correlations is not > uniformly distributed between -1 and 1 (of course there is no reason to > suspect that it would be). I have tried using both rcorrmatrix and > genPositiveDefMat in the "clusterGeneration" package but even when picking > the "unifcorrmat" option and setting "alphad" to 1, the distribution of > correlations is not uniform. > > Any recommendations on how to generate the desired covariance matrices would > be appreciated.Hello. Let me suggest the following procedure. 1. Generate a symmetric matrix A with the desired distribution of the non-diagonal elements and with zeros on the diagonal. 2. Compute the smallest eigenvalue lambda_1 of A. 3. Replace A by A + t I, where I is the identity matrix and t is a number such that t + lambda_1 > 0. The resulting matrix will have the same non-diagonal elements as A, but will be positive definite. Petr Savicky.
Ned Dochtermann
2011-Jun-03 20:54 UTC
[R] generating random covariance matrices (with a uniform distribution of correlations)
Petr, This is the code I used for your suggestion: k<-6;kk<-(k*(k-1))/2 x<-matrix(0,5000,kk) for(i in 1:5000){ A.1<-matrix(0,k,k) rs<-runif(kk,min=-1,max=1) A.1[lower.tri(A.1)]<-rs A.1[upper.tri(A.1)]<-t(A.1)[upper.tri(A.1)] cors.i<-diag(k) t<-.001-min(Re(eigen(A.1)$values)) new.cor<-cov2cor(A.1+(t*cors.i)) x[i,]<-new.cor[lower.tri(new.cor)]} hist(c(x)); max(c(x)); median(c(x)) This, unfortunately, does not maintain the desired distribution of correlations. I did, however, learn some neat coding tricks (that were new for me) along the way. Ned -- On Thu, Jun 02, 2011 at 04:42:59PM -0700, Ned Dochtermann wrote:> List members, > > Via searches I've seen similar discussion of this topic but have not seen > resolution of the particular issue I am experiencing. If my search on this > topic failed, I apologize for the redundancy. I am attempting to generate > random covariance matrices but would like the corresponding correlationsto> be uniformly distributed between -1 and 1. >...> > Any recommendations on how to generate the desired covariance matriceswould> be appreciated.Hello. Let me suggest the following procedure. 1. Generate a symmetric matrix A with the desired distribution of the non-diagonal elements and with zeros on the diagonal. 2. Compute the smallest eigenvalue lambda_1 of A. 3. Replace A by A + t I, where I is the identity matrix and t is a number such that t + lambda_1 > 0. The resulting matrix will have the same non-diagonal elements as A, but will be positive definite. Petr Savicky.
Petr Savicky
2011-Jun-03 22:26 UTC
[R] generating random covariance matrices (with a uniform distribution of correlations)
On Fri, Jun 03, 2011 at 01:54:33PM -0700, Ned Dochtermann wrote:> Petr, > This is the code I used for your suggestion: > > k<-6;kk<-(k*(k-1))/2 > x<-matrix(0,5000,kk) > for(i in 1:5000){ > A.1<-matrix(0,k,k) > rs<-runif(kk,min=-1,max=1) > A.1[lower.tri(A.1)]<-rs > A.1[upper.tri(A.1)]<-t(A.1)[upper.tri(A.1)] > cors.i<-diag(k) > t<-.001-min(Re(eigen(A.1)$values)) > new.cor<-cov2cor(A.1+(t*cors.i)) > x[i,]<-new.cor[lower.tri(new.cor)]} > hist(c(x)); max(c(x)); median(c(x)) > > This, unfortunately, does not maintain the desired distribution of > correlations.I see. I overlooked that you require correlations and not covariances. The distribution of correlations cannot be chosen arbitrarily, since it is not possible to have k variables, such that each two of them have correlation less than -1/(k-1). For example, it is not possible that all pairs among 3 variables have correlation less than -0.5. The reason is as follows. If X_1, ..., X_k have mean 0, variance 1 and all pairs have correlation at most c, then E (X_1 + ... + X_k)^2 <= k(1 + (k-1)c) If c < -1/(k-1), then the right hand side is negative, which is not possible. Can you relax the requirement on the negative correlations? Petr.
Ned Dochtermann
2011-Jun-06 19:18 UTC
[R] generating random covariance matrices (with a uniform distribution of correlations)
Thank you very much, this does help quite a bit. Ned From: Petr Savicky <savicky_at_praha1.ff.cuni.cz> Date: Sat, 04 Jun 2011 11:44:52 +0200 On Fri, Jun 03, 2011 at 01:54:33PM -0700, Ned Dochtermann wrote:> Petr, > This is the code I used for your suggestion: > > k<-6;kk<-(k*(k-1))/2 > x<-matrix(0,5000,kk) > for(i in 1:5000){ > A.1<-matrix(0,k,k) > rs<-runif(kk,min=-1,max=1) > A.1[lower.tri(A.1)]<-rs > A.1[upper.tri(A.1)]<-t(A.1)[upper.tri(A.1)] > cors.i<-diag(k) > t<-.001-min(Re(eigen(A.1)$values)) > new.cor<-cov2cor(A.1+(t*cors.i)) > x[i,]<-new.cor[lower.tri(new.cor)]} > hist(c(x)); max(c(x)); median(c(x)) > > This, unfortunately, does not maintain the desired distribution of > correlations.Hello. On the contrary to what i thought originally, there are solutions also for the case of the correlation matrix. The first solution creates a singular correlation matrix (of rank 3), but the nondiagonal entries have exactly the uniform distribution on [-1, 1], since the scalar product of two independent uniformly distributed unit vectors in R^3 has the uniform distribution on [-1, 1]. x <- matrix(rnorm(18), nrow=6, ncol=3) x <- x/sqrt(rowSums(x^2)) a <- x %*% t(x) The next solution produces a correlation matrix of full rank, whose non-diagonal entries have distribution very close to the uniform on [-1, 1]. KS test finds a difference only with sample size more than 50'000. w <- c(0.01459422, 0.01830718, 0.04066405, 0.50148488, 0.60330865, 0.61832829) x <- matrix(rnorm(36), nrow=6, ncol=6) %*% diag(w) x <- x/sqrt(rowSums(x^2)) a <- x %*% t(x) Hope this helps. Petr Savicky. -----Original Message----- From: Ned Dochtermann [mailto:ned.dochtermann at gmail.com] Sent: Friday, June 03, 2011 1:55 PM To: 'r-help at r-project.org'; 'savicky at praha1.fff.cuni.cz' Subject: Re: [R] generating random covariance matrices (with a uniform distribution of correlations) Petr, This is the code I used for your suggestion: k<-6;kk<-(k*(k-1))/2 x<-matrix(0,5000,kk) for(i in 1:5000){ A.1<-matrix(0,k,k) rs<-runif(kk,min=-1,max=1) A.1[lower.tri(A.1)]<-rs A.1[upper.tri(A.1)]<-t(A.1)[upper.tri(A.1)] cors.i<-diag(k) t<-.001-min(Re(eigen(A.1)$values)) new.cor<-cov2cor(A.1+(t*cors.i)) x[i,]<-new.cor[lower.tri(new.cor)]} hist(c(x)); max(c(x)); median(c(x)) This, unfortunately, does not maintain the desired distribution of correlations. I did, however, learn some neat coding tricks (that were new for me) along the way. Ned -- On Thu, Jun 02, 2011 at 04:42:59PM -0700, Ned Dochtermann wrote:> List members, > > Via searches I've seen similar discussion of this topic but have not seen > resolution of the particular issue I am experiencing. If my search on this > topic failed, I apologize for the redundancy. I am attempting to generate > random covariance matrices but would like the corresponding correlationsto> be uniformly distributed between -1 and 1. >...> > Any recommendations on how to generate the desired covariance matriceswould> be appreciated.Hello. Let me suggest the following procedure. 1. Generate a symmetric matrix A with the desired distribution of the non-diagonal elements and with zeros on the diagonal. 2. Compute the smallest eigenvalue lambda_1 of A. 3. Replace A by A + t I, where I is the identity matrix and t is a number such that t + lambda_1 > 0. The resulting matrix will have the same non-diagonal elements as A, but will be positive definite. Petr Savicky.