I've read through many postings about principal component analysis in the R-help archives, but haven't been able to piece together the information I need. I'd like to recreate an SPSS-like experience of factor analysis using R. Here's what SPSS produces:

1. Scatterplots of all possible variable pairs, with regression lines.
   xyplot(my.dataframe) is perfect but for the lack of regression lines.

2. Frequency histograms overlaid with normal curves for each variable.
   I can do this one at a time; I'd love R to do it in a big layout for all the variables in the data frame.

3. Descriptive statistics of each variable.
   Jim Lemon's excellent dstats() function does this. Solved.

4. A large correlation matrix for the data frame.
   The built-in function cov() does this. Solved.

5. KMO (Kaiser-Meyer-Olkin Measure of Sampling Adequacy) and Bartlett test of sphericity on the data frame as a whole.
   I can't find ways to recreate these tests -- the output of bartlett.test() doesn't look like the SPSS result or make sense to me in this context.

6. Anti-image matrices, including MSA (sampling adequacy) scores for each variable.
   I can't find a way to generate this, maybe because I'm unsure how it's calculated. The MSA scores would tell me how strongly each variable measures the data set as a whole, which I could use to guide subsequent factor analysis.

7. Total Variance Explained -- a table listing eigenvalues for each eigenvector, along with the % variance for each eigenvector.
   This is the best part of the SPSS output. I feel like I'm close to finding the right function in R, but I don't know how to look at the eigenvalues of each component. princomp() seems a step in the right direction.

8. Scree plot.
   No problem, princomp() and screeplot() seem to produce about the right result.

9. Component matrix (lists the variable loading on each factor).
   factanal() seems to do this, but again the results don't jibe with SPSS and I'm unsure why.

10. Factor rotation.
    No problem, factanal(rotation="varimax") does this.
If anyone can suggest how to fill in the missing pieces (particularly steps 6 and 7), please do let me know.

Thanks!

--Ashish.

-----
Ashish Ranpura
Institute of Cognitive Neuroscience
University College London
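For steps 5 to 7, here is a minimal base-R sketch of one way these quantities are commonly computed from the correlation matrix. It is not a reproduction of SPSS's internals. The data frame `x` is a stand-in built from the bundled `mtcars` data, and every variable name is illustrative. The sketch assumes all columns are numeric and that the correlation matrix is invertible. Note that the base function bartlett.test() tests homogeneity of variances across groups, which is a different test from Bartlett's test of sphericity; the latter is computed by hand below.

```r
# Stand-in data (assumption): any all-numeric data frame would do here.
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
n <- nrow(x)
p <- ncol(x)

# cor() returns correlations; cov() returns covariances.
R <- cor(x)

# Step 7: "Total Variance Explained" from the eigenvalues of R.
ev <- eigen(R, symmetric = TRUE)$values
total_variance <- data.frame(
  component  = seq_along(ev),
  eigenvalue = ev,
  pct_var    = 100 * ev / sum(ev),
  cum_pct    = 100 * cumsum(ev) / sum(ev)
)

# Step 6: anti-image correlation matrix. Rescaling the inverse of R to unit
# diagonal gives off-diagonal entries equal to the negative partial
# correlations between each pair of variables.
Rinv <- solve(R)
anti_image <- cov2cor(Rinv)

# Per-variable MSA and overall KMO: squared correlations relative to
# squared correlations plus squared partial correlations (diagonals excluded).
Q <- anti_image
diag(Q) <- 0
R0 <- R
diag(R0) <- 0
msa <- colSums(R0^2) / (colSums(R0^2) + colSums(Q^2))
kmo <- sum(R0^2) / (sum(R0^2) + sum(Q^2))

# Mimic the SPSS layout: MSA values on the diagonal of the anti-image matrix.
diag(anti_image) <- msa

# Step 5: Bartlett's test of sphericity (H0: R is an identity matrix).
chi_sq  <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df      <- p * (p - 1) / 2
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

print(round(total_variance, 2))
print(round(anti_image, 3))
cat("KMO:", round(kmo, 3), "\n")
cat("Bartlett chi-square:", round(chi_sq, 2), "df:", df, "p:", format.pval(p_value), "\n")
```

The eigenvalues of a correlation matrix sum to the number of variables, so the percentage column is each eigenvalue divided by `p`. If installing a contributed package is acceptable, the psych package provides KMO() and cortest.bartlett(), which could be used to cross-check these hand computations.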