Dear Users,
I ran a factor analysis in both R and SAS, but the two programs gave different
output. In particular, the factor loadings differ.
Why do they give different results? With my real dataset (n = 264), the
results from R and SAS were extremely different.
Why does this happen, and which software is correct?
Thanks in advance,
- TY
#R code with example data
# A little demonstration, v2 is just v1 with noise,
# and same for v4 vs. v3 and v6 vs. v5
# Last four cases are there to add noise
# and introduce a positive manifold (g factor)
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1,v2,v3,v4,v5,v6)
cor(m1)
# v1 v2 v3 v4 v5 v6
#v1 1.0000000 0.9393083 0.5128866 0.4320310 0.4664948 0.4086076
#v2 0.9393083 1.0000000 0.4124441 0.4084281 0.4363925 0.4326113
#v3 0.5128866 0.4124441 1.0000000 0.8770750 0.5128866 0.4320310
#v4 0.4320310 0.4084281 0.8770750 1.0000000 0.4320310 0.4323259
#v5 0.4664948 0.4363925 0.5128866 0.4320310 1.0000000 0.9473451
#v6 0.4086076 0.4326113 0.4320310 0.4323259 0.9473451 1.0000000
factanal(m1, factors=3) # varimax is the default
# Output from R
#Call:
#factanal(x = m1, factors = 3)
#Uniquenesses:
# v1 v2 v3 v4 v5 v6
#0.005 0.101 0.005 0.224 0.084 0.005
#Loadings:
# Factor1 Factor2 Factor3
#v1 0.944 0.182 0.267
#v2 0.905 0.235 0.159
#v3 0.236 0.210 0.946
#v4 0.180 0.242 0.828
#v5 0.242 0.881 0.286
#v6 0.193 0.959 0.196
# Factor1 Factor2 Factor3
#SS loadings 1.893 1.886 1.797
#Proportion Var 0.316 0.314 0.300
#Cumulative Var 0.316 0.630 0.929
#The degrees of freedom for the model is 0 and the fit was 0.4755
/* SAS code with example data*/
data fact;
input v1-v6;
datalines;
1 1 3 3 1 1
1 2 3 3 1 1
1 1 3 4 1 1
1 1 3 3 1 2
1 1 3 3 1 1
1 1 1 1 3 3
1 2 1 1 3 3
1 1 1 2 3 3
1 2 1 1 3 4
1 1 1 1 3 3
3 3 1 1 1 1
3 4 1 1 1 1
3 3 1 2 1 1
3 3 1 1 1 2
3 3 1 1 1 1
4 4 5 5 6 6
5 6 4 6 4 5
6 5 6 4 5 4
;
run;
proc factor data=fact rotate=varimax method=p nfactors=3;
var v1-v6;
run;
/* Output from SAS*/
The FACTOR Procedure
Initial Factor Method: Principal Components
Prior Communality Estimates: ONE
Eigenvalues of the Correlation Matrix: Total = 6 Average = 1
     Eigenvalue    Difference    Proportion    Cumulative
1    3.69603077    2.62291629        0.6160        0.6160
2    1.07311448    0.07234039        0.1789        0.7949
3    1.00077409    0.83977061        0.1668        0.9617
4    0.16100348    0.12004232        0.0268        0.9885
5    0.04096116    0.01284515        0.0068        0.9953
6    0.02811601                      0.0047        1.0000
3 factors will be retained by the NFACTOR criterion.
Factor Pattern
      Factor1     Factor2     Factor3
v1    0.79880     0.54995    -0.17614
v2    0.77036     0.56171    -0.24862
v3    0.79475    -0.07685     0.54982
v4    0.75757    -0.08736     0.59785
v5    0.80878    -0.45610    -0.33437
v6    0.77771    -0.48331    -0.36933
Variance Explained by Each Factor
  Factor1      Factor2      Factor3
3.6960308    1.0731145    1.0007741
Final Communality Estimates: Total = 5.769919
        v1            v2            v3            v4            v5            v6
0.97154741    0.97078498    0.93983835    0.93897798    0.97394719    0.97482345
The FACTOR Procedure
Rotation Method: Varimax
Orthogonal Transformation Matrix
           1           2           3
1    0.58233     0.57714     0.57254
2   -0.64183     0.75864    -0.11193
3   -0.49895    -0.30229     0.81220
Rotated Factor Pattern
      Factor1     Factor2     Factor3
v1    0.20008     0.93148     0.25272
v2    0.21213     0.94590     0.17626
v3    0.23781     0.23418     0.91019
v4    0.19893     0.19023     0.92909
v5    0.93054     0.22185     0.24253
v6    0.94736     0.19384     0.19939
Variance Explained by Each Factor
  Factor1      Factor2      Factor3
1.9445607    1.9401828    1.8851759
Final Communality Estimates: Total = 5.769919
        v1            v2            v3            v4            v5            v6
0.97154741    0.97078498    0.93983835    0.93897798    0.97394719    0.97482345
Hi,
factanal uses the maximum-likelihood (ML) method to estimate the loadings. SAS (with method=p, as in your call) and SPSS use the principal component method. You might do better with princomp followed by varimax. However, the rotated solutions are basically the same :)
Yours sincerely,
Sigbert
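The suggestion above (principal components followed by varimax) can be sketched in base R. This is only an illustrative sketch, not code from the thread: it uses eigen() on the correlation matrix instead of princomp(), and the names pc_load, rot and communality are made up for the example. Signs and column order of the rotated loadings may differ from the SAS listing.

```r
# Example data from the original post
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

# Principal components of the correlation matrix
e <- eigen(cor(m1))
k <- 3  # number of components to keep, as in nfactors=3

# Component loadings: eigenvectors scaled by sqrt(eigenvalue)
pc_load <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(pc_load) <- colnames(m1)

# Varimax rotation (stats::varimax applies Kaiser normalization by default)
rot <- varimax(pc_load)
print(round(unclass(rot$loadings), 3))

# Communalities are unchanged by an orthogonal rotation
communality <- rowSums(pc_load^2)
print(round(communality, 5))
```

The eigenvalues in e$values should correspond to the "Eigenvalues of the Correlation Matrix" table in the SAS output, and the communalities to its "Final Communality Estimates".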
Dear TY,
Considering that you used different methods -- maximum-likelihood factor analysis in R and principal components analysis in SAS -- the results are quite similar (although the three rotated factors/components come out in different orders).
I hope this helps,
John
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Tae-Young Heo
> Sent: March-31-09 7:07 AM
> To: r-help at r-project.org
> Subject: [R] Factor Analysis Output from R and SAS
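To make the "different orders" point concrete, here is a small sketch that matches the rotated columns of the two solutions. It hard-codes the loadings printed in the original post; the helper congruence() and the names ml, pc and best are illustrative only. It uses Tucker's congruence coefficient, which is one reasonable choice for pairing columns, not the only one.

```r
vars <- paste0("v", 1:6)

# Rotated ML loadings as printed by factanal() in the original post
ml <- matrix(c(0.944, 0.182, 0.267,
               0.905, 0.235, 0.159,
               0.236, 0.210, 0.946,
               0.180, 0.242, 0.828,
               0.242, 0.881, 0.286,
               0.193, 0.959, 0.196),
             ncol = 3, byrow = TRUE,
             dimnames = list(vars, paste0("ML", 1:3)))

# Rotated Factor Pattern as printed by SAS PROC FACTOR (method=p)
pc <- matrix(c(0.20008, 0.93148, 0.25272,
               0.21213, 0.94590, 0.17626,
               0.23781, 0.23418, 0.91019,
               0.19893, 0.19023, 0.92909,
               0.93054, 0.22185, 0.24253,
               0.94736, 0.19384, 0.19939),
             ncol = 3, byrow = TRUE,
             dimnames = list(vars, paste0("PC", 1:3)))

# Tucker congruence coefficient between every pair of columns
congruence <- function(a, b) {
  crossprod(a, b) / sqrt(outer(colSums(a^2), colSums(b^2)))
}
cong <- congruence(ml, pc)
print(round(cong, 3))

# For each ML factor, the SAS column it resembles most
best <- unname(apply(abs(cong), 1, which.max))
print(best)

# Reorder the SAS columns and look at the largest remaining discrepancy
pc_reordered <- pc[, best]
max_diff <- max(abs(ml - pc_reordered))
print(round(max_diff, 3))
```

With these numbers, R's Factor1 pairs with SAS's Factor2 and vice versa, while Factor3 pairs with Factor3. After reordering, the largest absolute difference is about 0.10 (v4 on the third factor).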
At 8:17 AM -0400 3/31/09, John Fox wrote:
>Considering that you used different methods -- maximum-likelihood factor
>analysis in R and principal components analysis in SAS -- the results are
>quite similar (although the three rotated factors/components come out in
>different orders).
As John pointed out, PCA is not the same as FA. Unfortunately, SAS labels the PCA as a factor analysis when it is not. And when one does a principal components analysis in R and extracts just the first three components (as you did), the unrotated solution is identical:
> library(psych)
> # x is the data matrix (presumably m1 from the original post)
> pc3 <- principal(x, 3, rotate="none")
> print(pc3, digits=5, cut=0)
   V     PC1       PC2      PC3
V1 1 0.79880  0.549948 -0.17614
V2 2 0.77036  0.561712 -0.24862
V3 3 0.79475 -0.076853  0.54982
V4 4 0.75757 -0.087363  0.59785
V5 5 0.80878 -0.456096 -0.33437
V6 6 0.77771 -0.483310 -0.36933
                   PC1     PC2     PC3
SS loadings    3.69603 1.07311 1.00077
Proportion Var 0.61601 0.17885 0.16680
Cumulative Var 0.61601 0.79486 0.96165
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the model is 0 and the fit was 1.57483
The number of observations was 18 with Chi Square = 19.16046 with prob < NA
When you then rotate these components using Varimax, the solutions differ from the SAS output only at about the fourth decimal place:
> pc3 <- principal(x, 3)
> print(pc3, cut=0, digits=5)
   V     PC1     PC2     PC3
V1 1 0.19991 0.93168 0.25210
V2 2 0.21193 0.94606 0.17562
V3 3 0.23805 0.23479 0.90997
V4 4 0.19920 0.19084 0.92891
V5 5 0.93057 0.22225 0.24208
V6 6 0.94738 0.19422 0.19896
                   PC1     PC2     PC3
SS loadings    1.94470 1.94172 1.88350
Proportion Var 0.32412 0.32362 0.31392
Cumulative Var 0.32412 0.64774 0.96165
Bill
--
William Revelle            http://personality-project.org/revelle.html
Professor                  http://personality-project.org/personality.html
Department of Psychology   http://www.wcas.northwestern.edu/psych/
Northwestern University    http://www.northwestern.edu/
Attend ISSID/ARP:2009      http://issid.org/issid.2009/
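As a cross-check on how the unrotated and rotated solutions relate, the sketch below takes the numbers printed in the SAS output and verifies that the Rotated Factor Pattern is the unrotated Factor Pattern multiplied by the Orthogonal Transformation Matrix. The object names (unrot, tmat, sas_rot and so on) are illustrative, and the tolerances only allow for the five-decimal rounding of the printed values.

```r
vars <- paste0("v", 1:6)

# Unrotated Factor Pattern from the SAS output
unrot <- matrix(c(0.79880,  0.54995, -0.17614,
                  0.77036,  0.56171, -0.24862,
                  0.79475, -0.07685,  0.54982,
                  0.75757, -0.08736,  0.59785,
                  0.80878, -0.45610, -0.33437,
                  0.77771, -0.48331, -0.36933),
                ncol = 3, byrow = TRUE, dimnames = list(vars, NULL))

# Orthogonal Transformation Matrix from the SAS output
tmat <- matrix(c( 0.58233,  0.57714,  0.57254,
                 -0.64183,  0.75864, -0.11193,
                 -0.49895, -0.30229,  0.81220),
               ncol = 3, byrow = TRUE)

# Rotated Factor Pattern from the SAS output
sas_rot <- matrix(c(0.20008, 0.93148, 0.25272,
                    0.21213, 0.94590, 0.17626,
                    0.23781, 0.23418, 0.91019,
                    0.19893, 0.19023, 0.92909,
                    0.93054, 0.22185, 0.24253,
                    0.94736, 0.19384, 0.19939),
                  ncol = 3, byrow = TRUE, dimnames = list(vars, NULL))

# Rotated pattern = unrotated pattern %*% transformation matrix
rebuilt <- unrot %*% tmat
err_rot <- max(abs(rebuilt - sas_rot))

# The transformation is orthogonal: t(T) %*% T is (nearly) the identity
orth_err <- max(abs(crossprod(tmat) - diag(3)))

# Communalities are row sums of squared loadings, before or after rotation
comm_unrot <- rowSums(unrot^2)
comm_rot   <- rowSums(sas_rot^2)
print(round(cbind(comm_unrot, comm_rot), 5))
```

Because the rotation is orthogonal, it redistributes variance among the three components without changing each variable's communality, which is why SAS prints the same Final Communality Estimates before and after rotation.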