Hi,

I am using R to do a principal components analysis for a class which is generally using SPSS, so some of my question relates to SPSS output (and this might not be the right place). I have scoured the mailing list and the web but can't get a feel for this. It is annoying because they will be marking to the SPSS output.

Basically I'm getting different values for the component loadings in SPSS and in R. I suspect that there is some normalization or scaling going on that I don't understand (and there is plenty I don't understand). The scree plots (and thus the eigenvalues for each component) and the Proportion of Variance figures are identical, but the factor loadings are an order of magnitude different. Basically the SPSS loadings are much higher than those shown by R.

Should the loadings returned by the R princomp function and the SPSS "Component Matrix" be the same?

A subsidiary question: how does one approximate the "Kaiser's little jiffy" test for extracting the components (SPSS by default eliminates those components with eigenvalues below 1)? I've been doing this by loadings(DV.prcomped)[,1:x] after inspecting the scree plot (to set x), but is there another way?

The full R commands and SPSS syntax follow below along with the differing output.
Thanks, James
http://freelancepropaganda.com

R analysis
==========

I run:

> library(mva)
> DVfmla <- ~ webeval1 + webeval2 + webeval3 + webeval4 + webeval5 + webeval6 + webeval7 + webeval8
> loadings(DV.pca <- princomp(DVfmla, scale=T, cor=T))

Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
webeval1 -0.357 0.258 -0.202 0.458 0.629 -0.350 0.112 -0.159
webeval2 -0.340 0.510 0.255 -0.305 0.651 0.136 -0.143
webeval3 -0.319 0.316 -0.276 -0.797 0.244 -0.145
webeval4 0.247 0.633 0.681 -0.248
webeval5 0.391 0.150 -0.357 -0.183 -0.158 -0.185 0.584 -0.513
webeval6 0.392 0.252 -0.282 0.140 -0.756 -0.334
webeval7 -0.382 0.128 -0.162 -0.651 -0.596 -0.114 0.121
webeval8 0.377 0.268 -0.428 0.158 0.143 0.746

<snip SS loadings>

> plot(DV.pca) # This is exactly the same as the SPSS scree-plot.

SPSS Analysis
=============

FACTOR
  /VARIABLES webeval1 webeval2 webeval3 webeval4 webeval5 webeval6 webeval7 webeval8
  /MISSING LISTWISE
  /ANALYSIS webeval1 webeval2 webeval3 webeval4 webeval5 webeval6 webeval7 webeval8
  /PRINT INITIAL EXTRACTION
  /PLOT EIGEN
  /CRITERIA FACTORS(8) ITERATE(25)
  /EXTRACTION PC
  /ROTATION NOROTATE
  /METHOD=CORRELATION .

As mentioned, the proportions of variance explained and the scree plot are identical. However, SPSS produces this "Component Matrix", which we, in class, have been calling "the loadings":

WEBEVAL1  -0.798  0.253  0.178  0.317 -0.370  0.167 -0.033 -0.037
WEBEVAL2  -0.764  0.487  0.026  0.188  0.186 -0.309 -0.108 -0.043
WEBEVAL3  -0.719  0.309  0.217 -0.564 -0.125 -0.040  0.043  0.052
WEBEVAL4   0.558  0.591 -0.563 -0.063 -0.029  0.131  0.030 -0.019
WEBEVAL5   0.864  0.161  0.313 -0.128  0.075  0.138 -0.221 -0.200
WEBEVAL6   0.876  0.252  0.237  0.100  0.008  0.017 -0.088  0.308
WEBEVAL7  -0.858  0.128  0.133  0.054  0.349  0.308  0.090  0.037
WEBEVAL8   0.847  0.256  0.316  0.111  0.000 -0.087  0.296 -0.094

Can anyone tell me why these are different? (It seems likely that this is a scaling of some kind, as the SPSS ones just look to have been made larger in some way.)
Or is it that SPSS is reporting cumulatively while R is not? Thanks in advance, James
On Mon, 5 May 2003, James Howison wrote:

> I am using R to do a principal components analysis for a class
> which is generally using SPSS - so some of my question relates to
> SPSS output (and this might not be the right place). I have
> scoured the mailing list and the web but can't get a feel for this.
> It is annoying because they will be marking to the SPSS output.
>
> Basically I'm getting different values for the component loadings
> in SPSS and in R - I suspect that there is some normalization or
> scaling going on that I don't understand (and there is plenty I
> don't understand). The scree-plots (and thus eigen values for each
> component) and Proportion of Variance figures are identical - but
> the factor loadings are an order of magnitude different. Basically
> the SPSS loadings are much higher than those shown by R.
>
> Should the loadings returned by the R princomp function and the
> SPSS "Component Matrix" be the same?

Only if they are defined the same. The length of a PCA loading is arbitrary. R's are of length (sum of squares of coefficients) one: how are SPSS's defined?

> And subsidiary question would be: How does one approximate the
> "Kaiser's little jiffy" test for extracting the components (SPSS
> by default eliminates those components with eigen values below 1)?
> I've been doing this by loadings(DV.prcomped)[,1:x] after inspecting
> the scree plot (to set x) - but is there another way?

Eigenvalues of what exactly? The component sdev are the square roots of the eigenvalues of the (possibly scaled) covariance matrix: maybe you intend this only for a correlation matrix?

In R you have the source code, so if you know what you want you can find the pieces.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
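To make the point about loading length concrete, here is a minimal sketch in R. It assumes that the SPSS "Component Matrix" is the unit-length loadings multiplied by the component standard deviations (square roots of the eigenvalues); that convention is an assumption here, not something stated in the thread, although it looks consistent with the posted numbers. The built-in USArrests data set is only a stand-in for the webeval variables, and the object names are illustrative.

```r
# Sketch only: USArrests stands in for the poster's webeval data.
pca <- princomp(USArrests, cor = TRUE)

# princomp's loadings: each column has sum of squares equal to one.
unit_loadings <- unclass(loadings(pca))

# With cor = TRUE, sdev^2 are the eigenvalues of the correlation matrix.
eigenvalues <- pca$sdev^2

# Assumed SPSS-style scaling: multiply column j by sdev[j], so that the
# column sums of squares become the eigenvalues instead of one.
scaled_loadings <- sweep(unit_loadings, 2, pca$sdev, `*`)

print(round(colSums(unit_loadings^2), 3))
print(round(colSums(scaled_loadings^2) - eigenvalues, 10))

# Kaiser criterion for a correlation-matrix PCA: keep the components
# whose eigenvalue exceeds 1, rather than picking x by eye from the scree plot.
keep <- which(eigenvalues > 1)
print(round(scaled_loadings[, keep, drop = FALSE], 3))
```

The eigenvalue-above-1 rule only makes sense when the analysis is on the correlation matrix (cor = TRUE), which is the case Ripley's question about "eigenvalues of what exactly" is probing.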
Hi,

I compared R's results with those given by MINITAB and SAS and they are OK. Your problem is with SPSS, which unfortunately I have never used.

Edgar
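As a rough check on the scaling explanation, the first column of the two posted matrices can be compared directly. This sketch assumes that the first number in each row of the R output belongs to Comp.1 (R leaves small loadings blank, so later columns cannot be aligned with certainty), and that the SPSS column differs only by a constant factor. The variable names are illustrative.

```r
# First column of the SPSS "Component Matrix" as posted in the question.
spss_comp1 <- c(-0.798, -0.764, -0.719, 0.558, 0.864, 0.876, -0.858, 0.847)

# First value in each row of the posted R loadings (assumed to be Comp.1).
r_comp1 <- c(-0.357, -0.340, -0.319, 0.247, 0.391, 0.392, -0.382, 0.377)

names(spss_comp1) <- names(r_comp1) <- paste0("webeval", 1:8)

# If the SPSS column is a unit-length loading times sqrt(eigenvalue),
# its sum of squares estimates the first eigenvalue.
eigen1 <- sum(spss_comp1^2)

# Rescale the SPSS column to unit length and compare with R's column.
rescaled <- spss_comp1 / sqrt(eigen1)

print(round(eigen1, 3))
print(round(cbind(rescaled, r_comp1, diff = rescaled - r_comp1), 3))
```

If the differences are all within rounding of the three-decimal output, the two programs are reporting the same component up to a choice of length.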