Nguyen Dinh Nguyen
2007-Apr-02 01:47 UTC
[R] Generate a serie of new vars that correlate with existing var
Dear R helpers,
I have a var (let's call it X1) with an approximately Normal distribution (say, mean=15, SD=5).
I want to generate a series of additional vars X2, X3, X4, ... such that the correlation between X2 and X1 is 0.4, X3 and X1 is 0.5, X4 and X1 is 0.6, and so on, with the condition that all variables X2, X3, X4, ... have the same mean and SD as X1.
Any help would be appreciated.
Regards,
Nguyen
Greg Snow
2007-Apr-03 15:45 UTC
[R] Generate a serie of new vars that correlate with existing var
Here is one way to do it:

# create the initial x variable
x1 <- rnorm(100, 15, 5)

# x2, x3, and x4 in a matrix, these will be modified to meet the criteria
x234 <- scale(matrix( rnorm(300), ncol=3 ))

# put all into 1 matrix for simplicity
x1234 <- cbind(scale(x1),x234)

# find the current correlation matrix
c1 <- var(x1234)

# cholesky decomposition to get independence
chol1 <- solve(chol(c1))

newx <- x1234 %*% chol1

# check that we have independence and x1 unchanged
zapsmall(cor(newx))
all.equal( x1234[,1], newx[,1] )

# create new correlation structure (zeros can be replaced with other r vals)
newc <- matrix(
  c(1  , 0.4, 0.5, 0.6,
    0.4, 1  , 0  , 0  ,
    0.5, 0  , 1  , 0  ,
    0.6, 0  , 0  , 1  ), ncol=4 )

# check that it is positive definite
eigen(newc)

chol2 <- chol(newc)

finalx <- newx %*% chol2 * sd(x1) + mean(x1)

# verify success
mean(x1)
colMeans(finalx)

sd(x1)
apply(finalx, 2, sd)

zapsmall(cor(finalx))
pairs(finalx)

all.equal(x1, finalx[,1])

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
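The "and so on" in the question can run into a limit with the approach above: when the new variables are mutually uncorrelated (the zeros in newc), the target matrix is positive definite only while the squared correlations with x1 sum to less than 1, so adding a fourth variable at 0.7 would make chol(newc) fail. A minimal sketch of a different technique, one variable at a time via regression residuals, is given below. The helper name make_correlated and its arguments are hypothetical and not from the thread, and the correlations among the new variables themselves are not controlled (they come out near the product of their correlations with x1).

```r
# Hypothetical helper (not from the thread): for each target correlation in r,
# build a variable whose sample correlation with x1 is that value and whose
# mean and SD match those of x1.
make_correlated <- function(x1, r) {
  stopifnot(all(abs(r) < 1), length(x1) > 2)
  z1 <- as.numeric(scale(x1))                    # standardized copy of x1
  sapply(r, function(ri) {
    # residuals of random noise regressed on z1 are exactly uncorrelated with z1
    e <- residuals(lm(rnorm(length(x1)) ~ z1))
    e <- as.numeric(scale(e))                    # mean 0, SD 1
    z <- ri * z1 + sqrt(1 - ri^2) * e            # SD 1, correlation ri with z1
    z * sd(x1) + mean(x1)                        # back to the scale of x1
  })
}

set.seed(1)
x1 <- rnorm(100, 15, 5)
xs <- make_correlated(x1, c(0.4, 0.5, 0.6, 0.7))

# verify: correlations with x1, then means and SDs
round(cor(x1, xs), 3)
colMeans(xs); mean(x1)
apply(xs, 2, sd); sd(x1)
```

Because the noise is residualized against x1 before mixing, the sample correlations should match the requested values up to floating-point error rather than only approximately.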
Nguyen Dinh Nguyen
2007-Apr-03 22:51 UTC
[R] Generate a serie of new vars that correlate with existing var
Dear Greg,
Thanks a million! "As good as it gets" :)
All the best,
Nguyen