Dear R buddies,

I’m trying to run Principal Component Analysis, package princomp:
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/princomp.html

My question is: why do I get different results with pca = princomp(x, cor = TRUE) and pca = princomp(x, cor = FALSE), even when I standardize the variables in my matrix?

Best regards,
Blaž Simčič
Hi,

On Wed, Feb 29, 2012 at 9:52 AM, Blaz Simcic <blazsimcic at yahoo.com> wrote:
> Dear R buddies,
> I'm trying to run Principal Component Analysis, package
> princomp: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/princomp.html.

I'm going to assume you actually mean the princomp() function.

> My question is: why do I get different results with pca = princomp (x, cor = TRUE)
> and pca = princomp (x, cor = FALSE) even when I standardize variables in my matrix?

Most likely because you didn't use the same standardization that princomp() uses, but you didn't include reproducible code, so it's impossible to actually answer your question.

Look at this for ideas, though. Using scale() is equivalent to using cor=TRUE.

> data(iris)
> iris.pcaCOR <- princomp(iris[,1:4], cor=TRUE)
> iris.pcaSCALE <- princomp(scale(iris[,1:4]), cor=TRUE)
>
> summary(iris.pcaCOR)
Importance of components:
                          Comp.1    Comp.2     Comp.3      Comp.4
Standard deviation     1.7083611 0.9560494 0.38308860 0.143926497
Proportion of Variance 0.7296245 0.2285076 0.03668922 0.005178709
Cumulative Proportion  0.7296245 0.9581321 0.99482129 1.000000000
> summary(iris.pcaSCALE)
Importance of components:
                          Comp.1    Comp.2     Comp.3      Comp.4
Standard deviation     1.7083611 0.9560494 0.38308860 0.143926497
Proportion of Variance 0.7296245 0.2285076 0.03668922 0.005178709
Cumulative Proportion  0.7296245 0.9581321 0.99482129 1.000000000

--
Sarah Goslee
http://www.functionaldiversity.org
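One possible source of the difference, sketched here under the assumption that the variables were standardized with scale() and then passed to princomp() with cor = FALSE: scale() divides by a standard deviation computed with the n - 1 divisor, while princomp() computes its covariance matrix with divisor n. If that is what happened, the component standard deviations should differ only by the constant factor sqrt((n - 1) / n), and the loadings and proportions of variance should agree. The object names below are illustrative and do not come from the thread.

```r
# Sketch: princomp() on the correlation matrix vs. on data standardized
# with scale(). Uses the built-in iris data; names are illustrative only.
x <- iris[, 1:4]
n <- nrow(x)

pca_cor   <- princomp(x, cor = TRUE)          # PCA of the correlation matrix
pca_scale <- princomp(scale(x), cor = FALSE)  # PCA of the covariance of scaled data

# scale() standardizes with the n - 1 divisor, whereas princomp() uses
# divisor n for the covariance matrix, so the standard deviations differ
# by the constant factor sqrt((n - 1) / n).
ratio <- unname(pca_scale$sdev / pca_cor$sdev)
print(ratio)
print(sqrt((n - 1) / n))

# The proportion of variance explained is unaffected by a constant factor.
prop_cor   <- unname(pca_cor$sdev^2 / sum(pca_cor$sdev^2))
prop_scale <- unname(pca_scale$sdev^2 / sum(pca_scale$sdev^2))
print(all.equal(prop_cor, prop_scale))

# Loadings agree up to the sign of each component.
load_diff <- max(abs(abs(unclass(pca_cor$loadings)) -
                     abs(unclass(pca_scale$loadings))))
print(load_diff)
```

If the ratio printed above matches sqrt((n - 1) / n), the two analyses are the same apart from that rescaling of the standard deviations.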
x <- data.frame(a=rnorm(100), b=rnorm(100), d=rnorm(100))
prcomp(x, scale.=TRUE)
prcomp(scale(x), scale.=FALSE)

The two calls above will give you the same thing. This should be the case because the correlation matrix is the same as the covariance matrix of the centered and scaled original data.

FWIW,
Stephen

On 02/29/2012 08:52 AM, Blaz Simcic wrote:
> Dear R buddies,
> I'm trying to run Principal Component Analysis, package
> princomp: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/princomp.html.
> My question is: why do I get different results with pca = princomp (x, cor = TRUE)
> and pca = princomp (x, cor = FALSE) even when I standardize variables in my matrix?
> Best regards,
> Blaž Simčič

--
Stephen Sefick
Auburn University
Biological Sciences
331 Funchess Hall
Auburn, Alabama 36849
sas0025@auburn.edu
http://www.auburn.edu/~sas0025
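The identity Stephen relies on can be checked directly. The following is a minimal sketch, assuming simulated data like his; the seed and the object names are arbitrary additions for repeatability and are not part of the original message.

```r
# Sketch: the correlation matrix of x equals the covariance matrix of
# scale(x), so prcomp() gives the same answer either way.
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
x <- data.frame(a = rnorm(100), b = rnorm(100), d = rnorm(100))

# Correlation of the raw data vs. covariance of centered, scaled data.
same_matrix <- all.equal(cor(x), cov(scale(x)))
print(same_matrix)

p1 <- prcomp(x, scale. = TRUE)          # scaling done inside prcomp()
p2 <- prcomp(scale(x), scale. = FALSE)  # scaling done beforehand

# Standard deviations match; rotations match up to the sign of each column.
same_sdev <- all.equal(p1$sdev, p2$sdev)
same_rot  <- all.equal(abs(p1$rotation), abs(p2$rotation))
print(same_sdev)
print(same_rot)
```

Unlike princomp(), prcomp() uses the n - 1 divisor, which matches scale(), so no rescaling factor appears here.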