zhijie zhang
2007-Jan-30 14:29 UTC
[R] R and S-Plus got the different results of principal component analysis from SAS, why?
Dear R users,

I have run into a problem explaining the differences in principal component analysis (PCA) results between R, S-PLUS and SAS/STATA/SPSS, which I have not met before.

Although they all give the same eigenvalues, their coefficients differ.

First I list my results from R, S-PLUS and SAS/STATA/SPSS, and then show the original dataset, hoping somebody can try it and explain.

SAS, STATA and SPSS give the same results, so I put them together. From the results, we see that the absolute values of the coefficients are the same, but PC1, PC2, PC4, PC5 and PC6 in R have the opposite sign from SAS, and PC4 and PC5 in S-PLUS have the opposite sign from SAS. Curiously, I got the same results from all these packages using another dataset of mine.

R's results of PCA:

           PC1         PC2        PC3        PC4        PC5          PC6
X1  -0.5152569  0.20264489 -0.2338786  0.2350876 -0.2033335 -0.736298528
X2  -0.5197856  0.08989351 -0.2068260  0.3737667 -0.3187746  0.661548469
X3  -0.5148033  0.15820613 -0.0590627 -0.3210113  0.7693052  0.107616466
X4  -0.3535798  0.08105168  0.7317188 -0.4350752 -0.3790772  0.003088541
X5  -0.1868691 -0.67517084 -0.4397442 -0.5119015 -0.2314833 -0.014886524
X6  -0.1984241 -0.68073489  0.4126112  0.5006500  0.2606219 -0.091682326

    pca <- read.csv('D:/pca.csv', sep = ',', header = TRUE)
    attach(pca)
    pcacomp <- prcomp(pca[, -1], retx = TRUE, center = TRUE, scale. = TRUE, tol = 0.0001)

S-PLUS's results of PCA:

       pc1     pc2     pc3     pc4     pc5     pc6
X1  0.5153 -0.2026 -0.2339  0.2351 -0.2033  0.7363
X2  0.5198 -0.0899 -0.2068  0.3738 -0.3188 -0.6615
X3  0.5148 -0.1582 -0.0591 -0.3210  0.7693 -0.1076
X4  0.3536 -0.0811  0.7317 -0.4351 -0.3791 -0.0031
X5  0.1869  0.6752 -0.4397 -0.5119 -0.2315  0.0149
X6  0.1984  0.6807  0.4126  0.5007  0.2606  0.0917

SAS/STATA/SPSS's results of PCA:

         PC1       PC2       PC3       PC4       PC5       PC6
X1  0.515257  -.202645  -.233879  -.235088  0.203334  0.736299
X2  0.519786  -.089894  -.206826  -.373767  0.318775  -.661548
X3  0.514803  -.158206  -.059063  0.321011  -.769305  -.107616
X4  0.353580  -.081052  0.731719  0.435075  0.379077  -.003089
X5  0.186869  0.675171  -.439744  0.511902  0.231483  0.014887
X6  0.198424  0.680735  0.412611  -.500650  -.260622  0.091682

The dataset used for the above results is:

    X1     X2     X3     X4     X5     X6
173.28  93.62  60.1   86.72  38.97  27.51
172.09  92.83  60.38  87.39  38.62  27.82
171.46  92.73  59.74  85.59  38.83  27.46
170.08  92.25  58.04  85.92  38.33  27.29
170.61  92.36  59.67  87.46  38.38  27.14
171.69  92.85  59.44  87.45  38.19  27.1
171.46  92.93  58.7   87.06  38.58  27.36
171.6   93.28  59.75  88.03  38.68  27.22
171.6   92.26  60.5   87.63  38.79  26.63
171.16  92.62  58.72  87.11  38.19  27.18
170.04  92.17  56.95  88.08  38.24  27.65
170.27  91.94  56     84.52  37.16  26.81
170.61  92.5   57.34  85.61  38.52  27.36
171.39  92.44  58.92  85.37  38.83  26.47
171.83  92.79  56.85  85.35  38.58  27.03
171.36  92.53  58.39  87.09  38.23  27.04
171.24  92.61  57.69  83.98  39.04  27.07
170.49  92.03  57.56  87.18  38.54  27.57
169.43  91.67  55.22  83.87  38.41  26.6
168.57  91.4   55.96  83.02  38.74  26.97
170.43  92.38  57.87  84.87  38.78  27.37
169.88  91.89  56.87  86.34  38.37  27.19
167.94  90.91  55.97  86.77  38.17  27.16
168.82  91.3   56.07  85.87  37.61  26.67
168.02  91.26  55.28  85.63  39.66  28.07
167.87  90.96  55.79  84.92  38.2   26.53
168.15  91.5   54.56  84.81  38.44  27.38
168.99  91.52  55.11  86.23  38.3   27.11

Any help or suggestions are greatly appreciated.

--
With Kind Regards,
Zhi Jie, Zhang, PhD
Tel: 86-21-54237149   epistat@gmail.com
Dept. of Epidemiology, School of Public Health, Fudan University
Address: No. 138 Yi Xue Yuan Road, Shanghai, China
Postcode: 200032
Peter Dalgaard
2007-Jan-30 14:47 UTC
[R] R and S-Plus got the different results of principal component analysis from SAS, why?
zhijie zhang wrote:
> Dear R users,
>
> I have run into a problem explaining the differences in principal
> component analysis (PCA) results between R, S-PLUS and SAS/STATA/SPSS,
> which I have not met before.
>
> Although they all give the same eigenvalues, their coefficients differ.
>
> First I list my results from R, S-PLUS and SAS/STATA/SPSS, and then show
> the original dataset, hoping somebody can try it and explain.
>
> SAS, STATA and SPSS give the same results, so I put them together. From
> the results, we see that the absolute values of the coefficients are the
> same, but PC1, PC2, PC4, PC5 and PC6 in R have the opposite sign from SAS,
> and PC4 and PC5 in S-PLUS have the opposite sign from SAS. Curiously, I
> got the same results from all these packages using another dataset of mine.

Principal components are only *defined* up to sign changes (as the help page for prcomp says quite explicitly!!!!)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
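The point about sign can be checked numerically. The following is a minimal sketch, not part of the original exchange: it assumes the built-in USArrests data as a stand-in for the poster's file, and the variable names (cormat, v, lambda and so on) are illustrative only. It shows that a loading vector and its negative both satisfy the eigenvector equation for the correlation matrix, and that the resulting scores have the same variance.

```r
# Illustration only: USArrests stands in for the poster's dataset.
X      <- scale(USArrests)            # centre and scale, as scale. = TRUE does
cormat <- cor(USArrests)              # correlation matrix of the variables
p      <- prcomp(USArrests, center = TRUE, scale. = TRUE)

v      <- p$rotation[, 1]             # loadings of the first component
lambda <- p$sdev[1]^2                 # its eigenvalue (variance of PC1)

# Both v and -v satisfy cormat %*% v = lambda * v, so either is a valid answer.
ok_plus  <- isTRUE(all.equal(as.vector(cormat %*% v),    lambda * as.vector(v)))
ok_minus <- isTRUE(all.equal(as.vector(cormat %*% (-v)), lambda * as.vector(-v)))

# Flipping the loadings flips the scores but leaves their variance unchanged.
same_var <- isTRUE(all.equal(var(as.vector(X %*% v)),
                             var(as.vector(X %*% (-v)))))

c(ok_plus, ok_minus, same_var)
```

Which of the two signs a package reports depends on its numerical routine, so agreement in absolute value is all that can be expected across packages.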
Gavin Simpson
2007-Jan-30 15:08 UTC
[R] R and S-Plus got the different results of principal component analysis from SAS, why?
On Tue, 2007-01-30 at 22:29 +0800, zhijie zhang wrote:
> Dear R users,
>
> I have run into a problem explaining the differences in principal
> component analysis (PCA) results between R, S-PLUS and SAS/STATA/SPSS,
> which I have not met before.
>
> Although they all give the same eigenvalues, their coefficients differ.

Only up to rounding of printed results and their signs, which are arbitrary. The latter is covered in the help pages for both of the main PCA functions in R. Please read the Notes section of ?prcomp and/or ?princomp (you don't say which you used in R).

G

> First I list my results from R, S-PLUS and SAS/STATA/SPSS, and then show
> the original dataset, hoping somebody can try it and explain.
>
> SAS, STATA and SPSS give the same results, so I put them together. From
> the results, we see that the absolute values of the coefficients are the
> same, but PC1, PC2, PC4, PC5 and PC6 in R have the opposite sign from SAS,
> and PC4 and PC5 in S-PLUS have the opposite sign from SAS. Curiously, I
> got the same results from all these packages using another dataset of mine.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
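To compare loadings from different packages, one option is to flip each component so that it points the same way as a chosen reference before taking differences. The sketch below is one possible approach, not something proposed in the thread: align_signs is a hypothetical helper written for this illustration, and the numbers are the PC1 and PC2 loadings as printed in the original post for R and for SAS/STATA/SPSS.

```r
# Hypothetical helper: flip each column of `loadings` so that it points the
# same way as the matching column of `reference`.
align_signs <- function(loadings, reference) {
  s <- sign(colSums(loadings * reference))  # +1 or -1 for each component
  s[s == 0] <- 1                            # leave orthogonal columns alone
  sweep(loadings, 2, s, `*`)
}

# PC1 and PC2 loadings for X1..X6, copied from the printed R output.
r_load <- cbind(
  PC1 = c(-0.5152569, -0.5197856, -0.5148033, -0.3535798, -0.1868691, -0.1984241),
  PC2 = c(0.20264489, 0.08989351, 0.15820613, 0.08105168, -0.67517084, -0.68073489)
)

# The same two components as printed by SAS/STATA/SPSS.
sas_load <- cbind(
  PC1 = c(0.515257, 0.519786, 0.514803, 0.353580, 0.186869, 0.198424),
  PC2 = c(-0.202645, -0.089894, -0.158206, -0.081052, 0.675171, 0.680735)
)

aligned  <- align_signs(r_load, sas_load)
max_diff <- max(abs(aligned - sas_load))    # what is left after sign alignment
max_diff
```

After alignment, the remaining discrepancy for these two components is below 1e-6, which is consistent with rounding in the printed output rather than a real difference between the packages.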