Rick DeShon
2011-Feb-09 16:06 UTC
[R] Generate multivariate normal data with a random correlation matrix
Hi All.
I'd like to generate a sample of n observations from a k dimensional
multivariate normal distribution with a random correlation matrix.
My solution:
The lower (or upper) triangle of the correlation matrix has
n.tri=(d/2)(d+1)-d entries.
Take a uniform sample of n.tri possible correlations (runi(n.tr,-.99,.99)
Populate a triangle of the matrix with the sampled correlations
Mirror the triangle to populate the other triangle forming a symmetric
matrix, cormat
Sample n observations from a multivariate normal distribution with
mean vector=0 and varcov=cormat
Problem:
This approach violates the triangle inequality property of correlation
matrices. ?So, the matrix I've constructed is certainly a valid matrix
but it is not a valid correlation matrix and it blows up when you
submit it to a random number generator such as rmnorm. ?With a small
matrix you sometimes get lucky and generate a valid correlation matrix
but as you increase d the probability of obtaining a valid correlation
matrix drops off quickly.
So, any ideas on how to construct a correlation matrix with random
entries that cover the range (or most of the range) or the correlation
[-1,1]?
Here's the code I've used that won't work.
************************************************
library(mnormt)
n <- 1000
d <- 50
n.tri <- ((d*(d+1))/2)-d
r ? ? ? <- runif(n.tri, min=-.5, max=.5)
cormat <- diag(c)
count1=1
for (i in 1:c){
? ? ? ?for (j in 1:c){
? ? ? ? ? ? ? ?if (i<j) {
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?cormat[i,j]=r[count1]
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?cormat[j,i]=cormat[i,j]
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?count1=count1+1
? ? ? ? ? ? ? ? ? ? ? ? ? ? }
? ? ? ?}
}
eigen(cormat) ? ? # if negative eigenvalue, then the matrix violates
the triangle inequality
x <- ?rmnorm(n, rep(0, c), cormat) ?# Sample the data
Thanks in advance,
Rick DeShon
Rick DeShon
2011-Feb-09 16:19 UTC
[R] Generate multivariate normal data with a random correlation matrix
Hi All.
I'd like to generate a sample of n observations from a k dimensional
multivariate normal distribution with a random correlation matrix.
My solution:
1) The lower (or upper) triangle of the correlation matrix has
n.tri=(d/2)(d+1)-d entries.
2) Take a uniform sample of n.tri possible correlations (runi(n.tr,-.99,.99)
3) Populate a triangle of the matrix with the sampled correlations
4) Mirror the triangle to populate the other triangle forming a
symmetric matrix, cormat
5) Sample n observations from a multivariate normal distribution with
mean vector=0 and varcov=cormat
Problem:
This approach violates the triangle inequality property of correlation
matrices. ?So, the matrix I've constructed is certainly a valid matrix
but it is not a valid correlation matrix and it blows up when you
submit it to a random number generator such as rmnorm. ?With a small
matrix you sometimes get lucky and generate a valid correlation matrix
but as you increase d the probability of obtaining a valid correlation
matrix drops off quickly.
So, any ideas on how to construct a correlation matrix with random
entries that cover the range (or most of the range) or the correlation
[-1,1]?
Here's the code I've used that won't work.
************************************************
library(mnormt)
n <- 1000
d <- 50
n.tri <- ((d*(d+1))/2)-d
r ? ? ? <- runif(n.tri, min=-.5, max=.5)
cormat <- diag(c)
count1=1
for (i in 1:c){
? ? ? ?for (j in 1:c){
? ? ? ? ? ? ? ?if (i<j) {
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?cormat[i,j]=r[count1]
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?cormat[j,i]=cormat[i,j]
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?count1=count1+1
? ? ? ? ? ? ? ? ? ? ? ? ? ? }
? ? ? ?}
}
eigen(cormat) ? ? # if negative eigenvalue, then the matrix violates
the triangle inequality
x <- ?rmnorm(n, rep(0, c), cormat) ?# Sample the data
Thanks in advance,
Rick DeShon
Szumiloski, John
2011-Feb-09 16:30 UTC
[R] Generate multivariate normal data with a random correlation matrix
The knee jerk thought I had was to express the correlation matrix as a generic
Choleski decomposition, then randomly populate the triangular decomposed matrix.
When you remultiply, you can simply rescale to 1s on the diagonals. Then rmnorm
as usual.
In R, see ?chol
If you want to get fancy, you could look at the random distribution you would
use for the triangular matrix and play with that, including different
distributions for different elements, elements' distributions being
conditional on values of previously randomized elements, etc.
John
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Rick DeShon
Sent: Wednesday, 09 February, 2011 11:06 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Generate multivariate normal data with a random correlation matrix
Hi All.
I'd like to generate a sample of n observations from a k dimensional
multivariate normal distribution with a random correlation matrix.
My solution:
The lower (or upper) triangle of the correlation matrix has n.tri=(d/2)(d+1)-d
entries.
Take a uniform sample of n.tri possible correlations (runi(n.tr,-.99,.99)
Populate a triangle of the matrix with the sampled correlations Mirror the
triangle to populate the other triangle forming a symmetric matrix, cormat
Sample n observations from a multivariate normal distribution with mean vector=0
and varcov=cormat
Problem:
This approach violates the triangle inequality property of correlation matrices.
?So, the matrix I've constructed is certainly a valid matrix but it is not a
valid correlation matrix and it blows up when you submit it to a random number
generator such as rmnorm. ?With a small matrix you sometimes get lucky and
generate a valid correlation matrix but as you increase d the probability of
obtaining a valid correlation matrix drops off quickly.
So, any ideas on how to construct a correlation matrix with random entries that
cover the range (or most of the range) or the correlation [-1,1]?
Here's the code I've used that won't work.
************************************************
library(mnormt)
n <- 1000
d <- 50
n.tri <- ((d*(d+1))/2)-d
r ? ? ? <- runif(n.tri, min=-.5, max=.5)
cormat <- diag(c)
count1=1
for (i in 1:c){
? ? ? ?for (j in 1:c){
? ? ? ? ? ? ? ?if (i<j) {
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?cormat[i,j]=r[count1]
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?cormat[j,i]=cormat[i,j]
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?count1=count1+1
? ? ? ? ? ? ? ? ? ? ? ? ? ? }
? ? ? ?}
}
eigen(cormat) ? ? # if negative eigenvalue, then the matrix violates the
triangle inequality
x <- ?rmnorm(n, rep(0, c), cormat) ?# Sample the data
Thanks in advance,
Rick DeShon
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Notice: This e-mail message, together with any attachme...{{dropped:11}}