Dear R-list users,
I'm new to principal components and factor analysis.
I think this method could be very useful for finding relationships
between several variables (I know such relationships exist, but not
exactly which variables are involved or what kind of relation it is),
i.e. as a structure detection method.
Now, I'm experimenting with the function prcomp from the mva package.
In my source code below, I of course expect one of the columns to be
useless (I deliberately included one duplicate column). I know both
avg.EDGE.IN.SHORTEST.PATH.COUNT and avg.DEGREE have a relation with
sum.delivery.penalty.
E.g. the bigger avg.DEGREE, the smaller sum.delivery.penalty.
My question is about the output of prcomp.
I understand that the cumulative proportion of variance reaches 100% at
the third principal component, i.e. the first three components together
explain all of the variance, just as I expected.
I see the components are sorted. The one that explains the most variance
is listed first.
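As far as I can tell, the "Proportion of Variance" row is simply each
squared standard deviation divided by the sum of the squares. Here is a
minimal sketch of that check, with the standard deviations typed in by
hand from my output further down (the names sdev, prop and cumprop are
only mine, for illustration):

```r
# Standard deviations copied by hand from the prcomp output
sdev <- c(1.424074e+00, 1.000000e+00, 9.859080e-01, 5.711682e-17)

# The variance of each component is its squared standard deviation,
# so the proportion is that variance over the total variance
prop <- sdev^2 / sum(sdev^2)

# Running total, largest component first
cumprop <- cumsum(prop)

print(round(prop, 3))
print(round(cumprop, 3))
```

Rounded to three digits this seems to agree with what summary(pc2)
prints.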
But, how can I figure out what these principal components are exactly?
For example PC1. What is the exact meaning of it?
I assumed it is some linear combination of the variables I provided in
the call to prcomp, but how can I obtain this linear combination?
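To make my assumption concrete, here is a small sketch on made-up data
(the data frame toy and its columns x1, x2, x3 are invented for
illustration, not taken from my real table). My guess, which may be
wrong, is that each column of the Rotation matrix holds the coefficients
of one component, applied to the centred and scaled variables:

```r
# Made-up data, only for illustration (not my real table)
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

pc <- prcomp(toy, retx = TRUE, center = TRUE, scale. = TRUE)

# Centre and scale the columns with the values prcomp stored
z <- scale(toy, center = pc$center, scale = pc$scale)

# Assumption: column j of pc$rotation gives the weights of component j,
# so this matrix product should give the component scores
scores <- z %*% pc$rotation

# Compare with the scores prcomp itself returns (because retx = TRUE)
print(all.equal(unname(scores), unname(pc$x)))

# Weights of the first component, one per input variable
print(pc$rotation[, "PC1"])
```

If that comparison holds, then the PC1 column of the Rotation output
would be the linear combination I am looking for, but I would like
someone to confirm that this reading is right.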
PS: I used http://www.statsoftinc.com/textbook/stfacan.html as a
reference, and help(prcomp) and help(princomp) of course.
Thanks for any help!
Jonne.
# Read a table
dir = "..."
file = "..." # huge file, 12 Mb
stats = read.table(paste(dir, file, sep=""), header=TRUE)
# Select several columns
data = subset(stats, select = c(sum.delivery.penalty,
                                avg.EDGE.IN.SHORTEST.PATH.COUNT,
                                avg.EDGE.IN.SHORTEST.PATH.COUNT,  # deliberate duplicate
                                avg.DEGREE))
require(mva)
pc2 = prcomp(data, retx = TRUE, center = TRUE,
scale. = TRUE, tol = NULL)
pc2
summary(pc2)
--- gives the following output
> pc2
Standard deviations:
[1] 1.424074e+00 1.000000e-00 9.859080e-01 5.711682e-17
Rotation:
                                            PC1           PC2           PC3           PC4
sum.delivery.penalty              -1.627945e-01 -1.539887e-12  9.866600e-01 -1.118300e-17
avg.EDGE.IN.SHORTEST.PATH.COUNT   -6.976740e-01  2.413866e-16 -1.151131e-01  7.071068e-01
avg.EDGE.IN.SHORTEST.PATH.COUNT.1 -6.976740e-01  2.013413e-17 -1.151131e-01 -7.071068e-01
avg.DEGREE                         2.505027e-13 -1.000000e+00 -1.519375e-12 -3.253830e-18

> summary(pc2)
Importance of components:
                         PC1   PC2   PC3      PC4
Standard deviation     1.424 1.000 0.986 5.71e-17
Proportion of Variance 0.507 0.250 0.243 0.00e+00
Cumulative Proportion  0.507 0.757 1.000 1.00e+00