Dear Users,

I ran a factor analysis in both R and SAS, but the two programs gave different output.
Why do they give different output? The factor loadings in particular are different.
With my real dataset (n=264) the results from R and SAS were extremely different.
Why does this happen? Which software is correct?

Thanks in advance,

- TY

#R code with example data

# A little demonstration, v2 is just v1 with noise,
# and same for v4 vs. v3 and v6 vs. v5
# Last four cases are there to add noise
# and introduce a positive manifold (g factor)
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1,v2,v3,v4,v5,v6)
cor(m1)
#          v1        v2        v3        v4        v5        v6
#v1 1.0000000 0.9393083 0.5128866 0.4320310 0.4664948 0.4086076
#v2 0.9393083 1.0000000 0.4124441 0.4084281 0.4363925 0.4326113
#v3 0.5128866 0.4124441 1.0000000 0.8770750 0.5128866 0.4320310
#v4 0.4320310 0.4084281 0.8770750 1.0000000 0.4320310 0.4323259
#v5 0.4664948 0.4363925 0.5128866 0.4320310 1.0000000 0.9473451
#v6 0.4086076 0.4326113 0.4320310 0.4323259 0.9473451 1.0000000

factanal(m1, factors=3) # varimax is the default

# Output from R

#Call:
#factanal(x = m1, factors = 3)

#Uniquenesses:
#   v1    v2    v3    v4    v5    v6
#0.005 0.101 0.005 0.224 0.084 0.005

#Loadings:
#   Factor1 Factor2 Factor3
#v1 0.944   0.182   0.267
#v2 0.905   0.235   0.159
#v3 0.236   0.210   0.946
#v4 0.180   0.242   0.828
#v5 0.242   0.881   0.286
#v6 0.193   0.959   0.196

#               Factor1 Factor2 Factor3
#SS loadings      1.893   1.886   1.797
#Proportion Var   0.316   0.314   0.300
#Cumulative Var   0.316   0.630   0.929

#The degrees of freedom for the model is 0 and the fit was 0.4755

/* SAS code with example data*/

data fact;
input v1-v6;
datalines;
1 1 3 3 1 1
1 2 3 3 1 1
1 1 3 4 1 1
1 1 3 3 1 2
1 1 3 3 1 1
1 1 1 1 3 3
1 2 1 1 3 3
1 1 1 2 3 3
1 2 1 1 3 4
1 1 1 1 3 3
3 3 1 1 1 1
3 4 1 1 1 1
3 3 1 2 1 1
3 3 1 1 1 2
3 3 1 1 1 1
4 4 5 5 6 6
5 6 4 6 4 5
6 5 6 4 5 4
;
run;

proc factor data=fact rotate=varimax method=p nfactors=3;
var v1-v6;
run;

/* Output from SAS*/

The FACTOR Procedure
Initial Factor Method: Principal Components

Prior Communality Estimates: ONE

Eigenvalues of the Correlation Matrix: Total = 6  Average = 1

     Eigenvalue    Difference    Proportion    Cumulative
1    3.69603077    2.62291629        0.6160        0.6160
2    1.07311448    0.07234039        0.1789        0.7949
3    1.00077409    0.83977061        0.1668        0.9617
4    0.16100348    0.12004232        0.0268        0.9885
5    0.04096116    0.01284515        0.0068        0.9953
6    0.02811601                      0.0047        1.0000

3 factors will be retained by the NFACTOR criterion.

Factor Pattern

       Factor1     Factor2     Factor3
v1     0.79880     0.54995    -0.17614
v2     0.77036     0.56171    -0.24862
v3     0.79475    -0.07685     0.54982
v4     0.75757    -0.08736     0.59785
v5     0.80878    -0.45610    -0.33437
v6     0.77771    -0.48331    -0.36933

Variance Explained by Each Factor

  Factor1      Factor2      Factor3
3.6960308    1.0731145    1.0007741

Final Communality Estimates: Total = 5.769919

        v1            v2            v3            v4            v5            v6
0.97154741    0.97078498    0.93983835    0.93897798    0.97394719    0.97482345

The FACTOR Procedure
Rotation Method: Varimax

Orthogonal Transformation Matrix

           1           2           3
1    0.58233     0.57714     0.57254
2   -0.64183     0.75864    -0.11193
3   -0.49895    -0.30229     0.81220

Rotated Factor Pattern

       Factor1     Factor2     Factor3
v1     0.20008     0.93148     0.25272
v2     0.21213     0.94590     0.17626
v3     0.23781     0.23418     0.91019
v4     0.19893     0.19023     0.92909
v5     0.93054     0.22185     0.24253
v6     0.94736     0.19384     0.19939

Variance Explained by Each Factor

  Factor1      Factor2      Factor3
1.9445607    1.9401828    1.8851759

Final Communality Estimates: Total = 5.769919

        v1            v2            v3            v4            v5            v6
0.97154741    0.97078498    0.93983835    0.93897798    0.97394719    0.97482345
Hi,

> I ran factor analysis using R and SAS. However, I had different outputs from
> R and SAS.
> Why they provide different outputs? Especially, the factor loadings are
> different.
> I did real dataset(n=264), however, I had an extremely different from R and
> SAS.
> Why this things happened? Which software is correct on?

factanal() estimates the loadings by maximum likelihood. SAS (with method=p, as in your call) and SPSS (by default) use the principal component method. If you want to reproduce the SAS numbers in R, you should use princomp + varimax instead.

However, the rotated solutions are basically the same :)

Yours sincerely
Sigbert
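The principal-component-plus-varimax route suggested above can be sketched in base R. This is only an illustration and not part of the original reply: it uses eigen() on the correlation matrix rather than princomp(), because princomp() returns unit-length eigenvectors that would still need scaling by the component standard deviations to be comparable with the SAS factor pattern. The object names (e, pc_load, rot_load) are made up for the example, and the column order and signs of the loadings may differ from the SAS output.

```r
# Example data from the original post
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

# Principal components of the correlation matrix
e <- eigen(cor(m1), symmetric = TRUE)
k <- 3  # number of components to retain, as in nfactors=3

# Component loadings: eigenvectors scaled by the square root of the eigenvalues
pc_load <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(pc_load) <- colnames(m1)

# Varimax rotation (stats::varimax applies Kaiser normalization by default)
rot <- varimax(pc_load)
rot_load <- unclass(rot$loadings)

print(round(e$values, 5))   # compare with the SAS eigenvalue table
print(round(rot_load, 3))   # compare with the SAS rotated factor pattern
```

Because an orthogonal rotation leaves communalities unchanged, rowSums(rot_load^2) can be compared directly with the "Final Communality Estimates" printed by SAS.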
Dear TY,

Considering that you used different methods -- maximum-likelihood factor
analysis in R and principal components analysis in SAS -- the results are
quite similar (although the three rotated factors/components come out in
different orders).

I hope this helps,
 John

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Tae-Young Heo
> Sent: March-31-09 7:07 AM
> To: r-help at r-project.org
> Subject: [R] Factor Analysis Output from R and SAS
>
> Dear Users,
>
> I ran factor analysis using R and SAS. However, I had different outputs from
> R and SAS.
> Why they provide different outputs? Especially, the factor loadings are
> different.
> I did real dataset(n=264), however, I had an extremely different from R and
> SAS.
> Why this things happened? Which software is correct on?
>
> Thanks in advance,
>
> - TY
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
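The reordering of the rotated factors mentioned above can be checked numerically. The sketch below is an illustration and not something posted in the thread: it types in the two rotated loading matrices from the original post and computes Tucker's congruence coefficient between every pair of columns. The names r_load, sas_load, congruence and best_match are made up for the example.

```r
vars <- paste0("v", 1:6)

# Rotated loadings printed by factanal() in the original post
r_load <- matrix(c(0.944, 0.182, 0.267,
                   0.905, 0.235, 0.159,
                   0.236, 0.210, 0.946,
                   0.180, 0.242, 0.828,
                   0.242, 0.881, 0.286,
                   0.193, 0.959, 0.196),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(vars, paste0("R_F", 1:3)))

# Rotated factor pattern printed by PROC FACTOR in the original post
sas_load <- matrix(c(0.20008, 0.93148, 0.25272,
                     0.21213, 0.94590, 0.17626,
                     0.23781, 0.23418, 0.91019,
                     0.19893, 0.19023, 0.92909,
                     0.93054, 0.22185, 0.24253,
                     0.94736, 0.19384, 0.19939),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(vars, paste0("SAS_F", 1:3)))

# Tucker congruence: cross-products of columns divided by the product of their norms
congruence <- crossprod(r_load, sas_load) /
  sqrt(outer(colSums(r_load^2), colSums(sas_load^2)))
print(round(congruence, 3))

# For each R factor, the SAS factor it resembles most
best_match <- apply(abs(congruence), 1, which.max)
print(best_match)
```

Each row of the congruence matrix has one entry close to 1, which identifies the SAS factor corresponding to that R factor; the remaining entries are much smaller.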
At 8:17 AM -0400 3/31/09, John Fox wrote:
>Dear TY,
>
>Considering that you used different methods -- maximum-likelihood factor
>analysis in R and principal components analysis in SAS -- the results are
>quite similar (although the three rotated factors/components come out in
>different orders).
>
>I hope this helps,
> John

As John pointed out, PCA is not the same as FA. Unfortunately, SAS labels the PCA as a factor analysis, when it is not.

And, when one does a principal components analysis in R and extracts just the first three components (as you did in SAS), the unrotated solution is identical to the SAS one:

> library(psych)
> pc3 <- principal(x,3,rotate="none")   # x holds the same data as m1 in the original post
> print(pc3,digits=5,cut=0)

   V     PC1       PC2      PC3
V1 1 0.79880  0.549948 -0.17614
V2 2 0.77036  0.561712 -0.24862
V3 3 0.79475 -0.076853  0.54982
V4 4 0.75757 -0.087363  0.59785
V5 5 0.80878 -0.456096 -0.33437
V6 6 0.77771 -0.483310 -0.36933

                   PC1     PC2     PC3
SS loadings    3.69603 1.07311 1.00077
Proportion Var 0.61601 0.17885 0.16680
Cumulative Var 0.61601 0.79486 0.96165

Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the model is 0 and the fit was 1.57483
The number of observations was 18 with Chi Square = 19.16046 with prob < NA

When you then rotate these components using Varimax, the solutions differ at the fourth decimal place:

> pc3 <- principal(x,3)
> print(pc3,cut=0,digits=5)

   V     PC1     PC2     PC3
V1 1 0.19991 0.93168 0.25210
V2 2 0.21193 0.94606 0.17562
V3 3 0.23805 0.23479 0.90997
V4 4 0.19920 0.19084 0.92891
V5 5 0.93057 0.22225 0.24208
V6 6 0.94738 0.19422 0.19896

                   PC1     PC2     PC3
SS loadings    1.94470 1.94172 1.88350
Proportion Var 0.32412 0.32362 0.31392
Cumulative Var 0.32412 0.64774 0.96165

Bill

--
William Revelle            http://personality-project.org/revelle.html
Professor                  http://personality-project.org/personality.html
Department of Psychology   http://www.wcas.northwestern.edu/psych/
Northwestern University    http://www.northwestern.edu/
Attend ISSID/ARP:2009      http://issid.org/issid.2009/
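The relation between the unrotated and rotated patterns can be verified from the printed SAS numbers alone: the rotated factor pattern is the unrotated factor pattern multiplied by the orthogonal transformation matrix. The following is a small sketch using values copied from the SAS output in the original post; the object names (unrotated, tmat, sas_rotated and so on) are illustrative, and agreement is only expected up to the five printed decimals.

```r
vars <- paste0("v", 1:6)

# Unrotated factor pattern (SAS "Factor Pattern" table)
unrotated <- matrix(c(0.79880,  0.54995, -0.17614,
                      0.77036,  0.56171, -0.24862,
                      0.79475, -0.07685,  0.54982,
                      0.75757, -0.08736,  0.59785,
                      0.80878, -0.45610, -0.33437,
                      0.77771, -0.48331, -0.36933),
                    ncol = 3, byrow = TRUE,
                    dimnames = list(vars, paste0("Factor", 1:3)))

# SAS "Orthogonal Transformation Matrix"
tmat <- matrix(c( 0.58233,  0.57714,  0.57254,
                 -0.64183,  0.75864, -0.11193,
                 -0.49895, -0.30229,  0.81220),
               ncol = 3, byrow = TRUE)

# SAS "Rotated Factor Pattern"
sas_rotated <- matrix(c(0.20008, 0.93148, 0.25272,
                        0.21213, 0.94590, 0.17626,
                        0.23781, 0.23418, 0.91019,
                        0.19893, 0.19023, 0.92909,
                        0.93054, 0.22185, 0.24253,
                        0.94736, 0.19384, 0.19939),
                      ncol = 3, byrow = TRUE,
                      dimnames = list(vars, paste0("Factor", 1:3)))

# Rotate the unrotated loadings and compare with what SAS printed
rotated  <- unrotated %*% tmat
max_diff <- max(abs(rotated - sas_rotated))

# The transformation matrix should be orthogonal: t(T) %*% T = I
orth_err <- max(abs(crossprod(tmat) - diag(3)))

# Communalities are the row sums of squared loadings (same before and after rotation)
communality <- rowSums(unrotated^2)

print(max_diff)
print(orth_err)
print(round(communality, 5))
```

The column sums of squared unrotated loadings, colSums(unrotated^2), likewise give back the first three eigenvalues reported under "Variance Explained by Each Factor".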