Hi!

I've used the example given in the documentation for the prcomp function
both in R and SPAD to compare the results obtained.
Surprisingly, I do not obtain the same results for the coordinates of
the principal components with these two software packages.

Using the USArrests data, I obtain with R:

> summary(prcomp(USArrests))
Importance of components:
                          PC1     PC2    PC3     PC4
Standard deviation     83.732 14.2124 6.4894 2.48279
Proportion of Variance  0.966  0.0278 0.0058 0.00085
Cumulative Proportion   0.966  0.9933 0.9991 1.00000

And using SPAD (French publisher CISIA):

Ex:       sd       pv       cp
comp1 | 2.4802 |  62.01 |  62.01 |
comp2 | 0.9898 |  24.74 |  86.75 |
comp3 | 0.3566 |   8.91 |  95.66 |
comp4 | 0.1734 |   4.34 | 100.00 |

Am I wrong using R? Why are the results so different?
Furthermore, could anyone explain to me the difference between prcomp and
princomp, since we do not obtain exactly the same results using these
two functions?
And how can I obtain the coordinates of the points on the first component
using R?

Many thanks,
Christine

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

> Date: Tue, 03 Oct 2000 11:09:06 +0000
> From: Christine Serres <serres at valigen.net>
>
> I've used the example given in the documentation for the prcomp function
> both in R and SPAD to compare the results obtained.
> Surprisingly, I do not obtain the same results for the coordinates of
> the principal components with these two software packages.
>
> Using the USArrests data, I obtain with R:
>
> > summary(prcomp(USArrests))
> Importance of components:
>                           PC1     PC2    PC3     PC4
> Standard deviation     83.732 14.2124 6.4894 2.48279
> Proportion of Variance  0.966  0.0278 0.0058 0.00085
> Cumulative Proportion   0.966  0.9933 0.9991 1.00000

Read on:

> summary(prcomp(USArrests, scale=T))
Importance of components:
                        PC1   PC2    PC3    PC4
Standard deviation     1.57 0.995 0.5971 0.4164
Proportion of Variance 0.62 0.247 0.0891 0.0434
Cumulative Proportion  0.62 0.868 0.9566 1.0000

> And using SPAD (French publisher CISIA):
>
> Ex:       sd       pv       cp
> comp1 | 2.4802 |  62.01 |  62.01 |
> comp2 | 0.9898 |  24.74 |  86.75 |
> comp3 | 0.3566 |   8.91 |  95.66 |
> comp4 | 0.1734 |   4.34 | 100.00 |

Also

> summary(princomp(USArrests, cor=T))
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4
Standard deviation     1.5748783 0.9948694 0.5971291 0.41644938
Proportion of Variance 0.6200604 0.2474413 0.0891408 0.04335752
Cumulative Proportion  0.6200604 0.8675017 0.9566425 1.00000000

BTW, it looks like SPAD's `sd' are in fact variances, for the square of
the first line here is

   Comp.1    Comp.2    Comp.3    Comp.4
2.4802416 0.9897652 0.3565632 0.1734301

> Am I wrong using R? Why are the results so different?

In this dataset you do want scaling, as the variables are not on a common
scale. But SPAD has apparently scaled by default, and apparently
mis-labelled its results.

> Furthermore, could anyone explain to me the difference between prcomp and
> princomp, since we do not obtain exactly the same results using these
> two functions?

They differ in the definition of variance. It's on the help page for
princomp! If you scale, there is no difference, otherwise there is an
n vs n-1 factor. The reasons are S-PLUS compatibility and allowing
princomp to use robust principal components.

> And how can I obtain the coordinates of the points on the first component
> using R?

predict on a princomp fit, or retx=TRUE on a prcomp fit.

You will find all this in Venables & Ripley, for example.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

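The two points in the reply above (the n vs n-1 factor, and how to get the coordinates on the first component) can be checked with a short sketch. It assumes a current R installation, where USArrests ships with the base datasets package and retx=TRUE is already the default for prcomp. The variable names are illustrative only.

```r
n <- nrow(USArrests)                 # 50 observations

pr <- prcomp(USArrests)              # variance computed with divisor n - 1
pc <- princomp(USArrests)            # variance computed with divisor n

# Unscaled fits: each princomp sdev is sqrt((n - 1) / n) times the prcomp one.
ratio <- unname(pc$sdev / pr$sdev)
expected_ratio <- sqrt((n - 1) / n)

# Scaled fits: the standard deviations agree.
sdev_pr_scaled <- prcomp(USArrests, scale. = TRUE)$sdev
sdev_pc_scaled <- unname(princomp(USArrests, cor = TRUE)$sdev)

# Coordinates of the points on the first component.
scores_pr <- pr$x[, 1]               # available because retx = TRUE
scores_pc <- predict(pc)[, 1]        # predict() on a princomp fit

print(ratio)
print(head(scores_pr))
```

The sign of a principal component is arbitrary, so the two score vectors may differ by a factor of -1 even though they describe the same axis.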
I am pretty ignorant, but:

   scale: a logical value indicating whether the variables should be
          scaled to have unit variance before the analysis takes
          place. The default is `FALSE' for consistency with S, but in
          general scaling is advisable. Alternately, a vector of
          length equal the number of columns of `x' can be supplied.
          The value is passed to `scale'.

If you use scale=TRUE, you at least get agreement in the proportion of
variance/cumulative proportion columns.

On Tue, 3 Oct 2000, Christine Serres wrote:

> Hi!
>
> I've used the example given in the documentation for the prcomp function
> both in R and SPAD to compare the results obtained.
> Surprisingly, I do not obtain the same results for the coordinates of
> the principal components with these two software packages.
>
> Using the USArrests data, I obtain with R:
>
> > summary(prcomp(USArrests))
> Importance of components:
>                           PC1     PC2    PC3     PC4
> Standard deviation     83.732 14.2124 6.4894 2.48279
> Proportion of Variance  0.966  0.0278 0.0058 0.00085
> Cumulative Proportion   0.966  0.9933 0.9991 1.00000
>
> And using SPAD (French publisher CISIA):
>
> Ex:       sd       pv       cp
> comp1 | 2.4802 |  62.01 |  62.01 |
> comp2 | 0.9898 |  24.74 |  86.75 |
> comp3 | 0.3566 |   8.91 |  95.66 |
> comp4 | 0.1734 |   4.34 | 100.00 |
>
> Am I wrong using R? Why are the results so different?
> Furthermore, could anyone explain to me the difference between prcomp and
> princomp, since we do not obtain exactly the same results using these
> two functions?
> And how can I obtain the coordinates of the points on the first component
> using R?
>
> Many thanks,
> Christine

-- 
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

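The remark that scale=TRUE brings the proportion columns into agreement can be reproduced with a few lines. This is only a sketch, assuming a current R installation; the formal argument is spelled `scale.` in prcomp (the shorter `scale=` reaches it by partial matching), and the variable names are made up for illustration.

```r
# Scale each variable to unit variance before extracting components.
pr_scaled <- prcomp(USArrests, scale. = TRUE)

# Proportion of total variance carried by each component, in percent.
prop_var <- pr_scaled$sdev^2 / sum(pr_scaled$sdev^2)
pv <- round(100 * prop_var, 2)          # compare with SPAD's "pv" column
cp <- round(100 * cumsum(prop_var), 2)  # compare with SPAD's "cp" column

print(rbind(pv, cp))
```

Rounded to two decimals, these percentages can be set side by side with the 62.01 / 24.74 / 8.91 / 4.34 and 62.01 / 86.75 / 95.66 / 100.00 columns reported from SPAD.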
PS it looks like the "sd" components from SPAD are the squares of the
sd components that R returns when scale=TRUE ...

-- 
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704
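
The observation in the PS is easy to verify: squaring the standard deviations from the scaled fit gives the component variances (the eigenvalues of the correlation matrix), which is what SPAD appears to print under "sd". A minimal sketch, assuming a current R installation and illustrative variable names:

```r
# Standard deviations of the components from the scaled analysis.
sdev_scaled <- prcomp(USArrests, scale. = TRUE)$sdev

# Their squares are the component variances (eigenvalues).
eigenvalues <- sdev_scaled^2
print(round(eigenvalues, 4))   # compare with SPAD: 2.4802 0.9898 0.3566 0.1734

# With four standardized variables the variances sum to 4.
print(sum(eigenvalues))
```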