Hi,
I am using R to do a principal components analysis for a class
which is generally using SPSS - so some of my question relates to
SPSS output (and this might not be the right place). I have
scoured the mailing list and the web but can't get a feel for this.
It is annoying because they will be marking to the SPSS output.
Basically I'm getting different values for the component loadings
in SPSS and in R - I suspect that there is some normalization or
scaling going on that I don't understand (and there is plenty I
don't understand). The scree plots (and thus the eigenvalues for each
component) and Proportion of Variance figures are identical - but
the factor loadings are an order of magnitude different. Basically
the SPSS loadings are much higher than those shown by R.
Should the loadings returned by the R princomp function and the
SPSS "Component Matrix" be the same?
A subsidiary question: how does one approximate the
"Kaiser's little jiffy" test for extracting the components (SPSS
by default eliminates those components with eigenvalues below 1)?
I've been doing this by loadings(DV.prcomped)[,1:x] after inspecting
the scree plot (to set x) - but is there another way?
The full R commands and SPSS syntax follow below along with the
differing output.
Thanks, James
http://freelancepropaganda.com
R analysis
==========
I run:
> library(mva)
> DVfmla
~webeval1 + webeval2 + webeval3 + webeval4 + webeval5 + webeval6 +
webeval7 + webeval8
> loadings(DV.pca <- princomp(DVfmla, scale=T, cor=T))
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
webeval1 -0.357 0.258 -0.202 0.458 0.629 -0.350 0.112 -0.159
webeval2 -0.340 0.510 0.255 -0.305 0.651 0.136 -0.143
webeval3 -0.319 0.316 -0.276 -0.797 0.244 -0.145
webeval4 0.247 0.633 0.681 -0.248
webeval5 0.391 0.150 -0.357 -0.183 -0.158 -0.185 0.584 -0.513
webeval6 0.392 0.252 -0.282 0.140 -0.756 -0.334
webeval7 -0.382 0.128 -0.162 -0.651 -0.596 -0.114 0.121
webeval8 0.377 0.268 -0.428 0.158 0.143 0.746
<snip SS loadings>
> plot(DV.pca) # This is exactly the same as the SPSS scree plot.
SPSS Analysis
============
FACTOR
/VARIABLES webeval1 webeval2 webeval3 webeval4
webeval5 webeval6 webeval7 webeval8
/MISSING LISTWISE
/ANALYSIS webeval1 webeval2 webeval3 webeval4
webeval5 webeval6 webeval7 webeval8
/PRINT INITIAL EXTRACTION
/PLOT EIGEN
/CRITERIA FACTORS(8) ITERATE(25)
/EXTRACTION PC
/ROTATION NOROTATE
/METHOD=CORRELATION .
As mentioned, the proportions of variance explained and the scree
plot are identical. However, SPSS produces this "Component Matrix",
which we, in class, have been calling "the loadings":
WEBEVAL1 -0.798 0.253 0.178 0.317 -0.370 0.167 -0.033 -0.037
WEBEVAL2 -0.764 0.487 0.026 0.188 0.186 -0.309 -0.108 -0.043
WEBEVAL3 -0.719 0.309 0.217 -0.564 -0.125 -0.040 0.043 0.052
WEBEVAL4 0.558 0.591 -0.563 -0.063 -0.029 0.131 0.030 -0.019
WEBEVAL5 0.864 0.161 0.313 -0.128 0.075 0.138 -0.221 -0.200
WEBEVAL6 0.876 0.252 0.237 0.100 0.008 0.017 -0.088 0.308
WEBEVAL7 -0.858 0.128 0.133 0.054 0.349 0.308 0.090 0.037
WEBEVAL8 0.847 0.256 0.316 0.111 0.000 -0.087 0.296 -0.094
Can anyone tell me why these are different? It seems likely that
this is a scaling of some kind, as the SPSS ones just look to have
been made larger in some way. Or is it that SPSS is reporting
cumulatively while R is not?
Thanks in advance,
James
On Mon, 5 May 2003, James Howison wrote:

> I am using R to do a principal components analysis for a class
> which is generally using SPSS - so some of my question relates to
> SPSS output (and this might not be the right place). I have
> scoured the mailing list and the web but can't get a feel for this.
> It is annoying because they will be marking to the SPSS output.
>
> Basically I'm getting different values for the component loadings
> in SPSS and in R - I suspect that there is some normalization or
> scaling going on that I don't understand (and there is plenty I
> don't understand). The scree-plots (and thus eigenvalues for each
> component) and Proportion of Variance figures are identical - but
> the factor loadings are an order of magnitude different. Basically
> the SPSS loadings are much higher than those shown by R.
>
> Should the loadings returned by the R princomp function and the
> SPSS "Component Matrix" be the same?

Only if they are defined the same. The length of a PCA loading is
arbitrary. R's are of length (sum of squares of coefficients) one: how
are SPSS's defined?

> And subsidiary question would be: How does one approximate the
> "Kaiser's little jiffy" test for extracting the components (SPSS
> by default eliminates those components with eigenvalues below 1)?
> I've been doing this by loadings(DV.prcomped)[,1:x] after inspecting
> the scree plot (to set x) - but is there another way?

Eigenvalues of what exactly? The component sdev is the square roots of
the eigenvalues of the (possibly scaled) covariance matrix: maybe you
intend this only for a correlation matrix?

In R you have the source code, so if you know what you want you can find
the pieces.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
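
A minimal sketch of the point about loading length, assuming the SPSS
"Component Matrix" is the unit-length loadings with each column
multiplied by the component standard deviation (the square root of its
eigenvalue). The first-column figures posted in the original message are
consistent with that (for example -0.798 / -0.357 and 0.876 / 0.392 are
both about 2.2), but it is an assumption here, not something confirmed
in this thread. The webeval data are not available, so the built-in
USArrests data set stands in, and the names L, component_matrix,
eigenvalues and keep are illustrative only. The last lines apply the
eigenvalue-greater-than-one rule, which is only meaningful for a
correlation-matrix PCA.

```r
# Stand-in data: the webeval1..webeval8 variables are not available,
# so the built-in USArrests data set is used purely for illustration.
pca <- princomp(USArrests, cor = TRUE)

# R's loadings: each column is an eigenvector of the correlation
# matrix, scaled so that its sum of squares is one.
L <- unclass(loadings(pca))
colSums(L^2)   # each entry should be 1, up to rounding

# Rescale every column by the component standard deviation,
# i.e. the square root of the corresponding eigenvalue.
component_matrix <- sweep(L, 2, pca$sdev, "*")

# For a correlation-matrix PCA the rescaled values are the
# correlations between the original variables and the scores.
all.equal(component_matrix, cor(USArrests, pca$scores),
          check.attributes = FALSE)

# Kaiser's rule: keep components whose eigenvalue exceeds 1.
eigenvalues <- pca$sdev^2
keep <- which(eigenvalues > 1)
component_matrix[, keep, drop = FALSE]
```

The sign of each column is arbitrary, so a column may come out with all
signs flipped relative to another package.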
Hi,

I compared R's results with those given by MINITAB and SAS and they
agree. Your problem is with SPSS, which unfortunately I have never used.

Edgar