Hi John
I am currently traveling and have only sporadic net access, so I can only
answer briefly. It's also quite late; I hope what follows still makes
sense...
> For regular PCA by prcomp(), we can easily calculate the percent of total
> variance explained by the first k PCs by using cumsum(obj$sdev^2), because
> these PCs are independent of each other, so you can simply add up their
> variances. For sparse PCA, as far as I understand, the generated PCs are no
> longer independent of each other, so you cannot simply add up variances to
> calculate the percentage of variance explained by the first k PCs. For
> example, in the elasticnet package, where spca() also performs sparse PCA,
> one of the outputs of spca() is "pev", the percent explained variation,
> which is based on a so-called "adjusted" variance that accounts for the
> fact that the PCs are no longer independent.
You are correct that measuring explained variance is more involved with sparse
(or non-negative) PCA, because the principal axes no longer correspond to
eigenvectors of the covariance matrix, and are usually not even orthogonal.
The next update for the 'nsprcomp' package is almost done, and one of
the changes will concern the reported standard deviations. In the current
version (0.3), the standard deviations are computed from the scores matrix X*W,
where X is the data matrix and W is the (pseudo-)rotation matrix consisting of
the sparse loadings. Computing variance this way has the advantage that
'sdev' is consistent with the scores matrix, but it has the disadvantage
that some of the explained variance is counted more than once because of the
non-orthogonality of the principal axes. One symptom of this double counting
is that the variance of a later component can actually exceed the variance of
an earlier component, which is impossible in regular PCA.
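To make the double counting concrete, here is a small synthetic illustration
(not using 'nsprcomp' itself; the data and loadings are made up): two
overlapping loading vectors both pick up the dominant direction, so the sum
of the per-component score variances exceeds the variance actually captured
by the spanned subspace.

```r
set.seed(1)
n <- 1000
X <- matrix(rnorm(n * 5), n, 5)
X[, 1] <- 3 * X[, 1]                  # one dominant direction
X <- scale(X, center = TRUE, scale = FALSE)

# two overlapping (non-orthogonal) unit-norm loading vectors,
# both involving variable 1
W <- cbind(c(1, 1, 0, 0, 0), c(1, 0, 1, 0, 0)) / sqrt(2)

# per-component variance computed from the scores X %*% W ...
v.scores <- apply(X %*% W, 2, var)

# ... versus the variance actually captured by span(W),
# computed from an orthonormal basis of that subspace
Q <- qr.Q(qr(W))
v.span <- apply(X %*% Q, 2, var)

sum(v.scores)   # exceeds sum(v.span): the shared direction is counted twice
sum(v.span)
```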
In the new version of the package, 'sdev' will report the _additional_
standard deviation of each component, i.e. it captures only the variance not
already explained by the preceding components. Given a basis of the space
spanned by the previous PAs, the variance of the PC is computed after
projecting the current PA onto the ortho-complement of that space. This
procedure reverts to standard PCA if no sparsity or non-negativity
constraints are enforced on the PAs.
> My question is for nsprcomp, how can I calculate percent explained
> variation by using "sdev" when I know these PCs are not independent of
> each other?
The new version of the package will do it for you. Until then, you can use
something like the following function
asdev <- function(X, W) {
    nc   <- ncol(W)
    sdev <- numeric(nc)
    Q    <- qr.Q(qr(W))  # orthonormal basis for the span of the loadings
    Xp   <- X
    for (cc in seq_len(nc)) {
        # standard deviation of the scores for the current axis,
        # computed on data deflated by all previous axes
        sdev[cc] <- sd(drop(Xp %*% W[, cc]))
        # deflate: remove the current axis from the data
        Xp <- Xp - Xp %*% Q[, cc] %*% t(Q[, cc])
    }
    return(sdev)
}
to compute the additional standard deviations for given X and W.
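As a sanity check, when the loadings are the ordinary orthonormal rotation
matrix from prcomp(), this procedure recovers the usual standard deviations,
and the percent of explained variance follows from cumsum() as in regular
PCA. (The sketch below restates the function and uses made-up data; for
'nsprcomp' output you would pass the fitted loadings as W.)

```r
# the function from above
asdev <- function(X, W) {
    nc   <- ncol(W)
    sdev <- numeric(nc)
    Q    <- qr.Q(qr(W))
    Xp   <- X
    for (cc in seq_len(nc)) {
        sdev[cc] <- sd(drop(Xp %*% W[, cc]))
        Xp <- Xp - Xp %*% Q[, cc] %*% t(Q[, cc])
    }
    return(sdev)
}

set.seed(1)
X <- scale(matrix(rnorm(100 * 4), 100, 4), center = TRUE, scale = FALSE)

# with the orthonormal loadings from prcomp(), asdev() recovers obj$sdev
obj <- prcomp(X)
ok <- all.equal(asdev(X, obj$rotation), obj$sdev, check.attributes = FALSE)

# percent of total variance explained by the first k components
pev <- cumsum(asdev(X, obj$rotation)^2) / sum(apply(X, 2, var))
```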
The package documentation will explain the above in some more detail, and I
will also write a small blog post comparing 'nsprcomp' with the
'spca' routine from the 'elasticnet' package on the
'marty' data from the EMA package.
Best regards
Christian