thr3ads.net - R help - [R] Can't reproduce clusplot princomp results. [May 2005]

Thomas M. Parris

2005-May-23 18:01 UTC

[R] Can't reproduce clusplot princomp results.

Dear R folk:

Perhaps I'm just dense today, but I am having trouble reproducing the
principal components plotted and summarized by clusplot.  Here is a brief
example using the pluton dataset.  clusplot reports that the first two
principal components explain 99.7% of the variability.  But this is not what
princomp is reporting.  I would greatly appreciate any advice.

With best regards,
-- Tom

> R.version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R
> require("cluster")
[1] TRUE
> pluton.agnes <- agnes(pluton)
> clusters <- cutree(as.hclust(pluton.agnes), h=4.00)
> clusplot(pluton, clusters, lines=0)
> pca <- princomp(pluton, cor=TRUE)
> loadings(pca)
Loadings:
      Comp.1 Comp.2 Comp.3 Comp.4
Pu238  0.521  0.348  0.714  0.313
Pu239 -0.540                0.837
Pu240  0.418 -0.835         0.353
Pu241  0.512  0.418 -0.698  0.277

               Comp.1 Comp.2 Comp.3 Comp.4
SS loadings      1.00   1.00   1.00   1.00
Proportion Var   0.25   0.25   0.25   0.25
Cumulative Var   0.25   0.50   0.75   1.00

Bjørn-Helge Mevik

2005-May-24 07:19 UTC

Thomas M. Parris writes:
> clusplot reports that the first two principal components explain
> 99.7% of the variability.
[...]
> > loadings(pca)
[...]
>                Comp.1 Comp.2 Comp.3 Comp.4
> SS loadings      1.00   1.00   1.00   1.00
> Proportion Var   0.25   0.25   0.25   0.25
> Cumulative Var   0.25   0.50   0.75   1.00

This table says nothing about how much of the variability of the
original data is captured by each component; it merely summarises
the squared coefficients of the loading vectors (and princomp
standardises those to length one, so every column sums to 1 and
each of the four components gets 0.25).

What you want to look at is pca$sdev, for instance something like

totvar <- sum(pca$sdev^2)
rbind("explained var" = pca$sdev^2,
      "prop. expl. var" = pca$sdev^2/totvar,
      "cum.prop.expl.var" = cumsum(pca$sdev^2)/totvar)
                     Comp.1    Comp.2      Comp.3       Comp.4
explained var     3.4093746 0.5785399 0.011560142 0.0005252824
prop. expl. var   0.8523437 0.1446350 0.002890036 0.0001313206
cum.prop.expl.var 0.8523437 0.9969786 0.999868679 1.0000000000

And as you can see, two comps "explain" 99.7%. :-)
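
For anyone who wants to check both points in one go, here is a minimal sketch, assuming the cluster package (which ships the pluton data) is installed. The variable names ss_loadings, prop_var and cum_var are only illustrative. It confirms that the "SS loadings" row is just the unit length of each loading vector, and that summary() reports the same proportions as the hand computation from pca$sdev.

```r
library(cluster)   # provides the pluton data set
data(pluton)

pca <- princomp(pluton, cor = TRUE)

# Sum of squared coefficients of each loading vector.
# princomp scales the vectors to unit length, so each entry is 1.
ss_loadings <- colSums(unclass(loadings(pca))^2)
print(ss_loadings)

# Proportion of variance explained, from the component standard deviations.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
cum_var  <- cumsum(prop_var)
print(rbind(prop_var, cum_var))

# summary() prints the same information under "Importance of components".
print(summary(pca))
```

The "Cumulative Proportion" row printed by summary(pca) should agree with cum_var, which is the figure to compare against what clusplot prints under the plot.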

-- 
Bjørn-Helge Mevik
