thr3ads.net - R help - [R] which to trust...princomp() or prcomp() or neither? [Nov 2009]

Blair Smith

2009-Nov-25 03:32 UTC

[R] which to trust...princomp() or prcomp() or neither?

According to R help: 
princomp() uses eigenvalues of covariance data.
prcomp() uses the SVD method.

yet when I run the (e.g., USArrests) data example and compare with my own 
"hand-written" versions of PCA I get what looks like the opposite.
Example:
comparing the standard deviations I see:

Using prcomp(USArrests)
-------------------------------------
Standard deviations:
[1] 83.732400 14.212402  6.489426  2.482790

Using princomp(USArrests)
--------------------------------------
   Comp.1    Comp.2    Comp.3    Comp.4
82.890847 14.069560  6.424204  2.457837

Using my custom pca_svd() --- (my PCA method using native R svd() function):
-----------------------------------------------
$stdev
[1] 82.890847 14.069560  6.424204  2.457837

Using my custom pca_cov() --- (my PCA method using native R cov() and eigen() functions):
-----------------------------------------------
$sdev
[1] 83.732400 14.212402  6.489426  2.482790

You see the problem: my SVD method yields results numerically similar to 
the princomp() method which supposedly uses the eigenvector calculation.
Whereas my eigenvector calculation method yields results numerically 
similar to the prcomp() method which supposedly is a SVD calculation!

Also, it surprised me that the two methods would differ so markedly (only 2
significant figure agreement at best).  Ultimately the question is which
method to trust as most accurate?

When I get time I'll just put in some data with KNOWN PC stdevs to see, but
I'm still curious to see if any of you reading this help list could explain
in advance?

If any R gurus or the writers of either of the aforementioned routines can
enlighten me I'd be most grateful.

---
Dr Blair M. Smith
Industrial Research Limited
NZ

Jari Oksanen

2009-Nov-25 08:33 UTC

[R] which to trust...princomp() or prcomp() or neither?

Blair Smith <b.smith <at> irl.cri.nz> writes:
> 
> According to R help: 
> princomp() uses eigenvalues of covariance data.
> prcomp() uses the SVD method.
> 
> yet when I run the (e.g., USArrests) data example and compare with my own 
> "hand-written" versions of PCA I get what looks like the opposite.

...clip...

> You see the problem: my SVD method yields results numerically similar to 
> the princomp() method which supposedly uses the eigenvector calculation.
> Whereas my eigenvector calculation method yields results numerically 
> similar to the prcomp() method which supposedly is a SVD calculation!
> 
> Also, it surprised me that the two methods would differ so markedly (only 2
> significant figure agreement at best).  Ultimately the question is which
> method to trust as most accurate?
> 
> When I get time I'll just put in some data with KNOWN PC stdevs to see, but
> I'm still curious to see if any of you reading this help list could explain
> in advance?

Blair,

A behavioural test is not the best choice here: the source code is visible and
you can look there and see that prcomp() indeed uses svd() and princomp() uses
eigen(). The easiest way to dump the code is to write the name of the function
without trailing parentheses.
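
For S3 generics such as these two, typing the bare name prints only the dispatcher, so one more step is needed to reach the code that does the work. A minimal sketch, assuming the default methods are the unexported functions prcomp.default and princomp.default in the stats namespace (the thread does not name them):

```r
# Typing the generic's name shows only the UseMethod() dispatcher.
prcomp

# The default methods hold the actual computation.
f_prcomp   <- getS3method("prcomp", "default")
f_princomp <- getS3method("princomp", "default")

# Collapse each function's source into one string and look for the
# decomposition routine it calls.
src_prcomp   <- paste(deparse(f_prcomp), collapse = "\n")
src_princomp <- paste(deparse(f_princomp), collapse = "\n")

uses_svd   <- grepl("svd(", src_prcomp, fixed = TRUE)
uses_eigen <- grepl("eigen(", src_princomp, fixed = TRUE)
print(c(prcomp_uses_svd = uses_svd, princomp_uses_eigen = uses_eigen))
```

Printing f_prcomp or f_princomp directly shows the full source if you prefer to read it rather than search it.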

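The difference in the numbers you cite has nothing to do with the underlying
algorithm; it is due only to the scaling of the results. Function princomp()
uses divisor 'N' for the covariance matrix (this is explicitly documented in
?princomp), whereas prcomp() uses divisor 'N-1' (which is implied in ?prcomp --
this really could be more explicit). With the 50 rows of USArrests, the two
sets of standard deviations therefore differ by a factor of sqrt(49/50), about
0.99, which is why they agree only to about two significant figures.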
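
The scaling explanation can be checked numerically. The sketch below is not the pca_svd() or pca_cov() code from the original post (that code was never shown); it is a hypothetical re-creation that computes the standard deviations by both SVD and eigendecomposition, each with both divisors, and compares them with the two built-in functions. All variable names are illustrative.

```r
# Centre the columns of USArrests without rescaling them.
X <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)
n <- nrow(X)

sd_prcomp   <- prcomp(USArrests)$sdev            # divisor N - 1
sd_princomp <- unname(princomp(USArrests)$sdev)  # divisor N

# SVD route: singular values of the centred data, scaled by either divisor.
d <- svd(X)$d
sd_svd_n1 <- d / sqrt(n - 1)
sd_svd_n  <- d / sqrt(n)

# Eigen route: cov() divides by N - 1; crossprod(X) / n divides by N.
sd_eig_n1 <- sqrt(eigen(cov(X), symmetric = TRUE)$values)
sd_eig_n  <- sqrt(eigen(crossprod(X) / n, symmetric = TRUE)$values)

# Either algorithm reproduces either function once the divisor matches.
print(all.equal(sd_prcomp, sd_svd_n1))
print(all.equal(sd_prcomp, sd_eig_n1))
print(all.equal(sd_princomp, sd_svd_n))
print(all.equal(sd_princomp, sd_eig_n))

# The two built-in results differ only by the factor sqrt((N - 1) / N).
print(all.equal(sd_prcomp * sqrt((n - 1) / n), sd_princomp))
```

Under these assumptions a hand-written SVD routine that divides by N will match princomp(), and an eigen routine built on cov() will match prcomp(), which is the pattern reported in the question.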

So you see: you did not need a guru to answer you -- reading the code and docs
was sufficient.

Cheers, Jari Oksanen
