Lars from space [sic] asks:> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of downunder
> Sent: Monday, 25 December 2006 12:31 PM To: r-help at stat.math.ethz.ch
> Subject: [R] Higher Dimensional Matrices
>
>
> Hi all.
>
> I want to calculate partial correlations while controlling for one
> or more variables. That works already fine when I control for
> example just for x[,1] and x[,2] that gives me one single
> correlation matrix and i have to to it for x [,1]...x[,10]. That
> would give me out 10 matrices. Controlling for 2 Variables 100
> matrices. how can I run a loop to get f.e the 10 or 100 matrices at
> once? I appreciate for every hint. have nice holidays.
I don't quite understand this. You have 10 variables and you want to
find the partial correlations controlling for two of them at a time.
If you take each possible set of two variables to condition upon at a
time, this would give you choose(10, 2) = 45 matrices, wouldn't it?
Where do you get '10 or 100' matrices from?
>
> greetings lars
>
> x <- read.table("C:/.....dat")
> dim(x) #200x10
> a <- matrix(0,200,10)
> for (i in 1:10)
> a[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2]))
> b <- matrix(0,200,10)
> for (i in 1:10)
> b[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2]))
> #a=round(a,5)
> #b=round(b,5)
> d <- cor(a,b)
> d
But a and b are the same, aren't they? Why do you need to compute
them twice? Why not just use cor(a, a) [which is the same as cor(a),
of course]?
There is a more serious problem here, though. Your residuals are
after regression on x[, 1:2] so if you *select* x[, 1:2] as the
y-variables in your regression then the residuals are going to be
zero, but in practice round-off error. so the first two rows and
columns of d will be correlations with round-off error,
i.e. misleading junk. It doesn't make sense to include the
conditioning variables in the correlation matrix *conditioning on
them*. Only the 8 x 8 matrix of the others among themselves is
defined, really.
So how do we do it? Here are a few pointers.
To start, here is a function that uses a somewhat more compact way of
finding the partial correlations than your method. Sorting out how it
works should be a useful exercise.
partialCorr <- function (cond, mat)
cor(qr.resid(qr(cbind(1, mat[, cond])), mat[, -cond]))
To find the matrix of partial correlations conditioning on x[, 1:2]
you would use
d <- partialCorr(c(1,2), x)
So how to do it for all possible conditioning pairs of variables.
Well you could do it in an obvious loop:
cmats <- array(0, c(8,8,45))
k <- 0
for(i in 1:9) for(j in (i+1):10) {
k <- k+1
cmats[, , k] <- partialCorr(c(i, j), x)
}
Now the results are in a 3-way array, but without any kind of labels.
Perhaps you should think about how to fix that one yourself...
With more than two conditioning variables you should probably use a
function to generate the subsets of the appropriate size rather than
trying to write ever more deeply nested loops. There are plenty of
functions around to do this.
Bill Venables.