thr3ads.net - R help - [R] pca vs. pfa: dimension reduction [Mar 2009]

soeren.vogel at eawag.ch

2009-Mar-25 18:06 UTC

[R] pca vs. pfa: dimension reduction

Can't make sense of calculated results and hope I'll find help here.

I've collected answers from about 600 persons concerning three  
variables. I hypothesise those three variables to be components (or  
indicators) of one latent factor. In order to reduce data (vars), I  
had the following idea: Calculate the factor underlying these three  
vars. Use the loadings and the original var values to construct a new
(artificial) var: (B1 * X1) + (B2 * X2) + (B3 * X3) = ArtVar (brackets  
for readability). Use ArtVar for further analysis of the data, that  
is, as predictor etc.

In my (I realise, elementary) psychological statistics readings I was  
taught to use pca for these problems. Referring to Venables & Ripley  
(2002, chapter 11), I applied "princomp" to my vars. But the outcome  
shows 4 components -- which is obviously not what I want. Reading  
further I found "factanal", which produces loadings on the one  
specified factor very fine. But since this is a contradiction to  
theoretical introductions in so many texts I'm completely confused  
whether I'm right with these calculations.

(1) Is there an easy example, which explains the differences between  
pca and pfa? (2) Which R procedure should I use to get what I want?

Thank you for your help

Sören


Refs.:

Venables, W. N., and Ripley, B. D. (2002). Modern applied statistics  
with S (4th edition). New York: Springer.

Jonathan Baron

2009-Mar-25 18:22 UTC

On 03/25/09 19:06, soeren.vogel at eawag.ch wrote:
> Can't make sense of calculated results and hope I'll find help here.
> 
> I've collected answers from about 600 persons concerning three  
> variables. I hypothesise those three variables to be components (or  
> indicators) of one latent factor. In order to reduce data (vars), I  
> had the following idea: Calculate the factor underlying these three  
> vars. Use the loadings and the original var values to construct a new
> (artificial) var: (B1 * X1) + (B2 * X2) + (B3 * X3) = ArtVar (brackets  
> for readability). Use ArtVar for further analysis of the data, that  
> is, as predictor etc.
> 
> In my (I realise, elementary) psychological statistics readings I was  
> taught to use pca for these problems. Referring to Venables & Ripley  
> (2002, chapter 11), I applied "princomp" to my vars. But the outcome
> shows 4 components -- which is obviously not what I want. Reading  
> further I found "factanal", which produces loadings on the one  
> specified factor very fine. But since this is a contradiction to  
> theoretical introductions in so many texts I'm completely confused  
> whether I'm right with these calculations.
> 
> (1) Is there an easy example, which explains the differences between  
> pca and pfa? (2) Which R procedure should I use to get what I want?

Possibly what you want is the first principal component, which is the
weighted sum that accounts for the most variance of the three
variables.  It does essentially what you say in your first paragraph.
So you want something like

p1 <- princomp(cbind(X1,X2,X3),scores=TRUE)
p1$scores[,1]
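
A minimal sketch of the same idea on simulated data, since the real X1, X2, X3 are not shown in the thread. The simulated values, the seed, and the names dat, b, centered and art_var are assumptions made up for illustration. It checks that the first-component scores are the loading-weighted sum of the centered variables, which is the ArtVar described in the question.

```r
set.seed(1)                          # hypothetical data: one latent factor, three indicators
n  <- 600
f  <- rnorm(n)
X1 <- 0.8 * f + rnorm(n, sd = 0.6)
X2 <- 0.7 * f + rnorm(n, sd = 0.7)
X3 <- 0.6 * f + rnorm(n, sd = 0.8)
dat <- cbind(X1, X2, X3)

p1 <- princomp(dat, scores = TRUE)

# The weights B1, B2, B3 are the loadings of the first component
b <- p1$loadings[, 1]

# princomp centers each variable before forming the weighted sum
centered <- sweep(dat, 2, p1$center)
art_var  <- drop(centered %*% b)

# Should agree with the scores princomp reports for component 1
all.equal(art_var, unname(p1$scores[, 1]))

# Number of components equals the number of input variables
length(p1$sdev)
```

With three input columns princomp returns three components, so a fourth component in the output may mean that an extra column was passed in.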

The trouble with factanal is that it does a rotation, and the default
is varimax.  The first factor will usually not be the same as the
first principal component (I think).  Perhaps there is another
rotation option that will give you this, but why bother even to look?
(I didn't, obviously.)

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron

Mark Difford

2009-Mar-25 19:51 UTC

Hi Sören,
>> (1) Is there an easy example, which explains the differences between  
>> pca and pfa? (2) Which R procedure should I use to get what I want?

There are a number of fundamental differences between PCA and FA (Factor
Analysis), which unfortunately are quite widely ignored. FA is explicitly
model-based, whereas PCA does not invoke an explicit model. FA is also
designed to detect structure, whereas PCA focuses on variance, to put things
simply. In more detail, the two methods "attack" the covariance matrix in
different ways: in PCA the focus of decomposition is on the diagonal
elements, whereas in FA the focus is on the off-diagonal elements.
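
One way to see this difference numerically is the rough sketch below. It uses simulated data (the data, the seed, and the names R, L, R_fa, v1, R_pc, off are all assumptions for illustration, not anything from the original post). The one-factor model reconstructs the correlation matrix as loadings times their transpose plus a diagonal of uniquenesses, so the off-diagonal correlations are what the factor has to explain. The first principal component is instead chosen to capture as much total variance (the diagonal sum) as possible.

```r
set.seed(1)                          # hypothetical data: one latent factor, three indicators
n <- 600
f <- rnorm(n)
dat <- cbind(X1 = 0.8 * f + rnorm(n, sd = 0.6),
             X2 = 0.7 * f + rnorm(n, sd = 0.7),
             X3 = 0.6 * f + rnorm(n, sd = 0.8))
R <- cor(dat)

# Factor analysis: R is modelled as L L' + diag(uniquenesses)
fa   <- factanal(dat, factors = 1)
L    <- unclass(fa$loadings)         # 3 x 1 loading matrix
R_fa <- L %*% t(L) + diag(fa$uniquenesses)

# PCA on the same matrix: rank-one part carried by the first component
e    <- eigen(R)
v1   <- e$vectors[, 1] * sqrt(e$values[1])
R_pc <- tcrossprod(v1)

off <- upper.tri(R)                  # selects the off-diagonal elements
max(abs((R - R_fa)[off]))            # residual correlations under the factor model
max(abs((R - R_pc)[off]))            # residual correlations after one component
sum(diag(R_pc)) / sum(diag(R))       # share of total variance in component 1
```

With three variables and one factor the model has zero degrees of freedom, so the factor residuals here are expected to be close to zero by construction; with more variables they would show how well a single factor accounts for the correlations.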

Take a look at Prof. Revelle's psych package (function omega, etc.). Note also
that factanal has a rotation = "none" option.
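
As a rough illustration of that option, here is a sketch on simulated data (the data, the seed, and the names fa_none, fa_default, same_loadings, pc1 and r are assumptions for illustration). It fits the one-factor model with and without the default rotation, asks for regression factor scores, and compares them with the first principal component scores.

```r
set.seed(1)                          # hypothetical data: one latent factor, three indicators
n <- 600
f <- rnorm(n)
dat <- cbind(X1 = 0.8 * f + rnorm(n, sd = 0.6),
             X2 = 0.7 * f + rnorm(n, sd = 0.7),
             X3 = 0.6 * f + rnorm(n, sd = 0.8))

# One factor, no rotation, with per-person factor scores
fa_none    <- factanal(dat, factors = 1, rotation = "none",
                       scores = "regression")
# Same model with the default rotation (varimax)
fa_default <- factanal(dat, factors = 1, scores = "regression")

# With a single factor there is no second axis to rotate against,
# so the loadings should agree (compared up to sign)
same_loadings <- all.equal(abs(as.numeric(fa_none$loadings)),
                           abs(as.numeric(fa_default$loadings)))
same_loadings

# Factor scores versus first principal component scores
pc1 <- princomp(dat)$scores[, 1]
r   <- abs(cor(fa_none$scores[, 1], pc1))   # sign of a component is arbitrary
r
```

If the two sets of scores correlate highly, as they should when one factor dominates, either could serve as the single derived predictor; the choice then rests on whether a latent-variable model or a pure variance summary is the better description of the data.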

Regards, Mark.

