zhijie zhang
2007-Jan-30 14:29 UTC
[R] R and S-Plus got the different results of principal component analysis from SAS, why?
Dear Rusers,
I have run into a problem that I have not met before: explaining why
principal component analysis (PCA) gives different results in R, S-PLUS,
and SAS/STATA/SPSS.
Although they all give the same eigenvalues, their coefficients are
different.
First I list my results from R, S-PLUS, and SAS/STATA/SPSS, and then show
the original dataset, in the hope that somebody can try it and explain the
difference.
SAS, STATA, and SPSS give the same results, so I have put them together.
The absolute values of the coefficients are the same everywhere, but
PC1, PC2, PC4, PC5 and PC6 in R have the opposite sign from SAS, and
PC4 and PC5 in S-PLUS have the opposite sign from SAS. Curiously, I got the
same results from all of these packages with another dataset of mine.
*R's results of PCA:*
          PC1         PC2        PC3        PC4        PC5          PC6
X1 -0.5152569  0.20264489 -0.2338786  0.2350876 -0.2033335 -0.736298528
X2 -0.5197856  0.08989351 -0.2068260  0.3737667 -0.3187746  0.661548469
X3 -0.5148033  0.15820613 -0.0590627 -0.3210113  0.7693052  0.107616466
X4 -0.3535798  0.08105168  0.7317188 -0.4350752 -0.3790772  0.003088541
X5 -0.1868691 -0.67517084 -0.4397442 -0.5119015 -0.2314833 -0.014886524
X6 -0.1984241 -0.68073489  0.4126112  0.5006500  0.2606219 -0.091682326
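# Use a forward slash in the path: '\p' is not a valid escape in an R string
pca <- read.csv("D:/pca.csv", sep = ",", header = TRUE)
# Drop the first column, then run PCA on the centred and scaled variables
# (attach() is not needed because the data frame is passed directly)
pcacomp <- prcomp(pca[, -1], retx = TRUE, center = TRUE, scale. = TRUE, tol = 0.0001)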
*S-Plus's results of PCA:*
       pc1     pc2     pc3     pc4     pc5     pc6
X1  0.5153 -0.2026 -0.2339  0.2351 -0.2033  0.7363
X2  0.5198 -0.0899 -0.2068  0.3738 -0.3188 -0.6615
X3  0.5148 -0.1582 -0.0591 -0.3210  0.7693 -0.1076
X4  0.3536 -0.0811  0.7317 -0.4351 -0.3791 -0.0031
X5  0.1869  0.6752 -0.4397 -0.5119 -0.2315  0.0149
X6  0.1984  0.6807  0.4126  0.5007  0.2606  0.0917
*SAS/STATA/SPSS's results of PCA:*
        PC1      PC2      PC3      PC4      PC5      PC6
X1 0.515257 -.202645 -.233879 -.235088 0.203334 0.736299
X2 0.519786 -.089894 -.206826 -.373767 0.318775 -.661548
X3 0.514803 -.158206 -.059063 0.321011 -.769305 -.107616
X4 0.353580 -.081052 0.731719 0.435075 0.379077 -.003089
X5 0.186869 0.675171 -.439744 0.511902 0.231483 0.014887
X6 0.198424 0.680735 0.412611 -.500650 -.260622 0.091682
My dataset used in the above results is:
    X1    X2    X3    X4    X5    X6
173.28 93.62  60.1 86.72 38.97 27.51
172.09 92.83 60.38 87.39 38.62 27.82
171.46 92.73 59.74 85.59 38.83 27.46
170.08 92.25 58.04 85.92 38.33 27.29
170.61 92.36 59.67 87.46 38.38 27.14
171.69 92.85 59.44 87.45 38.19  27.1
171.46 92.93  58.7 87.06 38.58 27.36
 171.6 93.28 59.75 88.03 38.68 27.22
 171.6 92.26  60.5 87.63 38.79 26.63
171.16 92.62 58.72 87.11 38.19 27.18
170.04 92.17 56.95 88.08 38.24 27.65
170.27 91.94    56 84.52 37.16 26.81
170.61  92.5 57.34 85.61 38.52 27.36
171.39 92.44 58.92 85.37 38.83 26.47
171.83 92.79 56.85 85.35 38.58 27.03
171.36 92.53 58.39 87.09 38.23 27.04
171.24 92.61 57.69 83.98 39.04 27.07
170.49 92.03 57.56 87.18 38.54 27.57
169.43 91.67 55.22 83.87 38.41  26.6
168.57  91.4 55.96 83.02 38.74 26.97
170.43 92.38 57.87 84.87 38.78 27.37
169.88 91.89 56.87 86.34 38.37 27.19
167.94 90.91 55.97 86.77 38.17 27.16
168.82  91.3 56.07 85.87 37.61 26.67
168.02 91.26 55.28 85.63 39.66 28.07
167.87 90.96 55.79 84.92  38.2 26.53
168.15  91.5 54.56 84.81 38.44 27.38
168.99 91.52 55.11 86.23  38.3 27.11
Any help or suggestions are greatly appreciated.
--
With Kind Regards,
oooO:::::::::
(..):::::::::
:\.(:::Oooo::
::\_)::(..)::
:::::::)./:::
::::::(_/::::
:::::::::::::
[***********************************************************************]
Zhi Jie,Zhang ,PHD
Tel:86-21-54237149 epistat@gmail.com
Dept. of Epidemiology,school of public health,Fudan University
Address:No. 138 Yi Xue Yuan Road,Shanghai,China
Postcode:200032
[***********************************************************************]
Peter Dalgaard
2007-Jan-30 14:47 UTC
[R] R and S-Plus got the different results of principal component analysis from SAS, why?
zhijie zhang wrote:
> Dear Rusers,
>
> I have run into a problem that I have not met before: explaining why
> principal component analysis (PCA) gives different results in R, S-PLUS,
> and SAS/STATA/SPSS.
>
> Although they all give the same eigenvalues, their coefficients are
> different.
>
> First I list my results from R, S-PLUS, and SAS/STATA/SPSS, and then show
> the original dataset, in the hope that somebody can try it and explain the
> difference.
>
> SAS, STATA, and SPSS give the same results, so I have put them together.
> The absolute values of the coefficients are the same everywhere, but
> PC1, PC2, PC4, PC5 and PC6 in R have the opposite sign from SAS, and
> PC4 and PC5 in S-PLUS have the opposite sign from SAS. Curiously, I got the
> same results from all of these packages with another dataset of mine.

Principal components are only *defined* up to sign changes (as the help
page for prcomp says quite explicitly!!!!)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
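
A minimal sketch of why the sign is arbitrary, assuming a made-up 3 x 3 correlation matrix rather than the posted data. If v is a unit-length eigenvector of S with eigenvalue lambda, then so is -v, so a package is free to return either one.

```r
# Illustrative symmetric, correlation-like matrix (values are invented)
S <- matrix(c(1.0, 0.8, 0.3,
              0.8, 1.0, 0.2,
              0.3, 0.2, 1.0), nrow = 3, byrow = TRUE)

e <- eigen(S, symmetric = TRUE)
v <- e$vectors[, 1]       # loadings of the first principal component
lambda <- e$values[1]     # the corresponding eigenvalue

# Residual of S v = lambda v for v and for -v: both are numerically zero
res_pos <- max(abs(S %*% v - lambda * v))
res_neg <- max(abs(S %*% (-v) - lambda * (-v)))

# Both vectors also have unit length
len_pos <- sum(v^2)
len_neg <- sum((-v)^2)

print(c(res_pos = res_pos, res_neg = res_neg,
        len_pos = len_pos, len_neg = len_neg))
```

The eigenvalue is untouched by the flip, which matches the observation in the question that all packages agree on the eigenvalues.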
Gavin Simpson
2007-Jan-30 15:08 UTC
[R] R and S-Plus got the different results of principal component analysis from SAS, why?
On Tue, 2007-01-30 at 22:29 +0800, zhijie zhang wrote:
> Dear Rusers,
>
> I have run into a problem that I have not met before: explaining why
> principal component analysis (PCA) gives different results in R, S-PLUS,
> and SAS/STATA/SPSS.
>
> Although they all give the same eigenvalues, their coefficients are
> different.

Only up to rounding of printed results and their signs, which are
arbitrary. The latter is covered in both the help pages for the two main
PCA functions in R. Please read the Notes section of ?prcomp and/or
?princomp (you don't say which you used in R).

G

> First I list my results from R, S-PLUS, and SAS/STATA/SPSS, and then show
> the original dataset, in the hope that somebody can try it and explain the
> difference.
>
> SAS, STATA, and SPSS give the same results, so I have put them together.
> The absolute values of the coefficients are the same everywhere, but
> PC1, PC2, PC4, PC5 and PC6 in R have the opposite sign from SAS, and
> PC4 and PC5 in S-PLUS have the opposite sign from SAS. Curiously, I got the
> same results from all of these packages with another dataset of mine.
>
> [...]
>
> Any help or suggestions are greatly appreciated.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
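
To compare loadings from different packages despite the arbitrary signs, one option is to impose a sign convention on both sets before comparing. The sketch below uses the PC1 and PC4 columns printed in the original post for R and SAS. The helper name align_signs and the convention it applies (make the largest-magnitude loading in each column positive) are assumptions for illustration, not something any of the packages does.

```r
# Hypothetical helper: flip each column so that its largest-magnitude
# entry is positive. Any fixed convention would serve equally well.
align_signs <- function(L) {
  s <- apply(L, 2, function(col) sign(col[which.max(abs(col))]))
  sweep(L, 2, s, "*")   # multiply column j by s[j]
}

# PC1 and PC4 loadings as printed in the post (rows are X1..X6)
r_load <- cbind(
  PC1 = c(-0.5152569, -0.5197856, -0.5148033, -0.3535798, -0.1868691, -0.1984241),
  PC4 = c( 0.2350876,  0.3737667, -0.3210113, -0.4350752, -0.5119015,  0.5006500)
)
sas_load <- cbind(
  PC1 = c(0.515257, 0.519786, 0.514803, 0.353580, 0.186869, 0.198424),
  PC4 = c(-0.235088, -0.373767, 0.321011, 0.435075, 0.511902, -0.500650)
)

# After alignment the two sets differ only by rounding of the printed values
max_diff <- max(abs(align_signs(r_load) - align_signs(sas_load)))
print(max_diff < 1e-6)
```

If the component scores are used as well (for example pcacomp$x from prcomp), the same column flips have to be applied to them so that scores and loadings stay consistent.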