In comparing the results of princomp and prcomp I find:
1. The reported standard deviations are similar but about 1% from
each other, which seems well above round-off error.
2. princomp returns what I understand are variances and cumulative
variances accounted for by each principal component which are
all equal. "SS loadings" is always 1.
3. Same happens after the loadings are varimax-rotated, which in
general should alter the proportions of variance accounted by
each component.
It looks as if the loadings() function were expecting the eigenvectors
to be normalized to the corresponding eigenvalue.
Transcript and version information follow signature. Thank you for any
clues.
ft.
--
Fernando TUSELL                                e-mail:
Departamento de Econometría y Estadística      etptupaf at bs.ehu.es
Facultad de CC.EE. y Empresariales             Tel:  (+34)94.601.3733
Avenida Lendakari Aguirre, 83                  Fax:  (+34)94.601.3754
E-48015 BILBAO (Spain)                         Secr: (+34)94.601.3740
----------------------------------------------------------------------
> pca.1 <- prcomp(USArrests)
> pca.1
Standard deviations:
[1] 83.732400 14.212402 6.489426 2.482790
Rotation:
                PC1         PC2         PC3         PC4
Murder   0.04170432 -0.04482166  0.07989066 -0.99492173
Assault  0.99522128 -0.05876003 -0.06756974  0.03893830
UrbanPop 0.04633575  0.97685748 -0.20054629 -0.05816914
Rape     0.07515550  0.20071807  0.97408059  0.07232502
> pca.2 <- princomp(USArrests)
> pca.2
Call:
princomp(x = USArrests)
Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4
82.890847 14.069560  6.424204  2.457837
4 variables and 50 observations.
> summary(pca.2)
Importance of components:
                           Comp.1      Comp.2      Comp.3       Comp.4
Standard deviation     82.8908472 14.06956001 6.424204055 2.4578367034
Proportion of Variance  0.9655342  0.02781734 0.005799535 0.0008489079
Cumulative Proportion   0.9655342  0.99335156 0.999151092 1.0000000000
> loadings(pca.2)
Loadings:
         Comp.1 Comp.2 Comp.3 Comp.4
Murder                         0.995
Assault  -0.995
UrbanPop        -0.977 -0.201
Rape            -0.201  0.974

               Comp.1 Comp.2 Comp.3 Comp.4
SS loadings      1.00   1.00   1.00   1.00
Proportion Var   0.25   0.25   0.25   0.25
Cumulative Var   0.25   0.50   0.75   1.00
> varimax(pca.2$loadings[,1:3])
$loadings
Loadings:
         Comp.1 Comp.2 Comp.3
Murder
Assault  -0.998
UrbanPop        -0.997
Rape                    0.995

               Comp.1 Comp.2 Comp.3
SS loadings      1.00   1.00   1.00
Proportion Var   0.25   0.25   0.25
Cumulative Var   0.25   0.50   0.75
$rotmat
            [,1]       [,2]       [,3]
[1,]  0.99211386 0.03604908 -0.1200439
[2,] -0.05442524 0.98664663 -0.1535132
[3,]  0.11290692 0.15883603  0.9808278
> R.Version()
$platform
[1] "i386-pc-linux-gnu"
$arch
[1] "i386"
$os
[1] "linux-gnu"
$system
[1] "i386, linux-gnu"
$status
[1] ""
$major
[1] "2"
$minor
[1] "0.0"
$year
[1] "2004"
$month
[1] "10"
$day
[1] "04"
$language
[1] "R"
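
On point 2, the "SS loadings" row is the column sum of squares of the loadings matrix. princomp stores unit-length eigenvectors there, so that row is 1 for every component whatever the component variances are. The following is a minimal sketch, not a description of how loadings() is meant to be used; the names L, ss_unit, scaled and ss_scaled are illustrative only. It rescales each column by the component standard deviation so that the column sums of squares reproduce the variances.

```r
pca <- princomp(USArrests)

# Loadings as a plain matrix; each column is a unit-length eigenvector.
L <- unclass(loadings(pca))
ss_unit <- colSums(L^2)
print(ss_unit)

# Rescale each column by the standard deviation of its component
# (illustrative step, not something princomp does for you).
scaled <- sweep(L, 2, pca$sdev, "*")

# Column sums of squares now equal the component variances.
ss_scaled <- colSums(scaled^2)
print(ss_scaled / sum(ss_scaled))   # proportion of variance per component
```

The proportions printed last should match the "Proportion of Variance" row of summary(pca.2) in the transcript above.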
F.Tusell wrote:
> In comparing the results of princomp and prcomp I find:
>
> 1. The reported standard deviations are similar but about 1% from
> each other, which seems well above round-off error.
> 2. princomp returns what I understand are variances and cumulative
> variances accounted for by each principal component which are
> all equal. "SS loadings" is always 1.
> 3. Same happens after the loadings are varimax-rotated, which in
> general should alter the proportions of variance accounted by
> each component.
>
> It looks as if the loadings() function were expecting the eigenvectors
> to be normalized to the corresponding eigenvalue.
>
> Transcript and version information follow signature. Thank you for any
> clues.
>

Did you read the corresponding help files?

from ?prcomp:

<quote>
Details:

The calculation is done by a singular value decomposition of the
(centered and possibly scaled) data matrix, not by using 'eigen'
on the covariance matrix. This is generally the preferred method
for numerical accuracy.
</quote>

from ?princomp:

<quote>
Details:

The calculation is done using 'eigen' on the correlation or
covariance matrix, as determined by 'cor'. This is done for
compatibility with the S-PLUS result. A preferred method of
calculation is to use 'svd' on 'x', as is done in 'prcomp'.
</quote>

HTH,

--sundar
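
The two computations described in the help-page excerpts can be sketched side by side. This is an illustration under stated assumptions, not the internal code of either function: the variable names are made up, and both routes deliberately use the same divisor (n - 1, the one cov() uses) so that only the decomposition method differs.

```r
x  <- as.matrix(USArrests)
n  <- nrow(x)
xc <- scale(x, center = TRUE, scale = FALSE)   # centre the columns, no rescaling

# SVD route: singular values of the centred data over sqrt(n - 1).
sdev_svd <- svd(xc)$d / sqrt(n - 1)

# Eigen route: square roots of the eigenvalues of the covariance matrix.
sdev_eig <- sqrt(eigen(cov(x), symmetric = TRUE)$values)

print(rbind(svd = sdev_svd, eigen = sdev_eig))
print(all.equal(sdev_svd, sdev_eig))
```

Printing the two vectors together makes it easy to judge how much of any gap between prcomp and princomp could come from the decomposition method alone.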
On Wed, 3 Nov 2004, F.Tusell wrote:

> In comparing the results of princomp and prcomp I find:
>
> 1. The reported standard deviations are similar but about 1% from
> each other, which seems well above round-off error.

That is explained on the help page! E.g.

    Note that the default calculation uses divisor 'N' for the
    covariance matrix.

and there is even an example:

    princomp(USArrests, cor = TRUE) # =^= prcomp(USArrests, scale=TRUE)
    ## Similar, but different:
    ## The standard deviations differ by a factor of sqrt(49/50)

> 2. princomp returns what I understand are variances and cumulative
> variances accounted for by each principal component which are
> all equal. "SS loadings" is always 1.
> 3. Same happens after the loadings are varimax-rotated, which in
> general should alter the proportions of variance accounted by
> each component.

Hmmm. Varimax rotation of PCA (not factor analysis) is not supported in
base R, so this is not surprising. Please do as the posting guide asks,
and read the help page (even its title!) before posting.

> It looks as if the loadings() function were expecting the eigenvectors
> to be normalized to the corresponding eigenvalue.
>
> Transcript and version information follow signature. Thank you for any
> clues.

The best clue is that the help pages are a very useful resource, but
need to be read as carefully as they were written.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
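
The divisor explanation can be checked directly on the data from the original post. A minimal sketch, assuming the defaults of both functions (covariance matrix, no scaling); the names sd_prcomp and sd_princomp are illustrative:

```r
sd_prcomp   <- prcomp(USArrests)$sdev             # divisor N - 1
sd_princomp <- unname(princomp(USArrests)$sdev)   # divisor N

n <- nrow(USArrests)   # 50 observations

# Component-wise ratio of the two sets of standard deviations ...
print(sd_princomp / sd_prcomp)

# ... compared with the factor quoted from the help page, sqrt(49/50).
print(sqrt((n - 1) / n))
```

sqrt(49/50) is about 0.99, which is the roughly 1% gap reported in point 1 (for example, 83.7324 * sqrt(49/50) is approximately 82.8908).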