Hello folks,

I have recently finished a pilot study of a survey and am working to complete the statistical analysis of the results in R. My PhD is technically in computer science (not statistics), although I teach basic stats and have a "decent" working knowledge of the area. That said, my expertise in psychometric theory and factor analysis is weaker, so I thought I would send an email here to solicit some advice on the proper technique for my analysis.

In the survey, I have a series of "concepts" and word choices regarding those concepts (e.g., how well does concept A relate to words A1 through AN), each of which a participant rates on a scale from 1 to 10. For each question, I've gathered a substantial amount of data. What I'm most interested in is whether there were differences, for each answer within each question, between group A and group B. The total difference between A and B summed across all questions and answers in the survey isn't very meaningful. Similarly, the relationships between questions are not meaningful, nor is the rate of change (if any) between questions. In other words, there are probably correlations between questions, as with many surveys, but they aren't of interest here.

It seems like there would be a few ways to tackle this. Since I'm only interested in the relationship between the answers to each question individually, I was thinking I could run a simple ANOVA for each question with appropriate post-hoc tests, but I'm not sure. First, there are quite a few questions (about 12), and I'm a little worried about inflating my family-wise error. Now, I could lower my alpha, but ... Second, I know that some branches of survey analysis use factor analysis and a series of complicated measures for determining the consistency of the survey itself.
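To make the per-question option concrete, here is a minimal sketch of what I had in mind (the data frame `dat` and its column names are made up for illustration): one ANOVA per question for the group factor, followed by a family-wise correction of the resulting p-values, e.g. Holm's method, which controls family-wise error while being less conservative than a plain Bonferroni cut on alpha.

```r
## Hypothetical long layout: one row per (participant, question) rating,
## with columns rating (1-10), group (A/B), and question (q1..q12).
set.seed(1)
dat <- data.frame(
  rating   = sample(1:10, 240, replace = TRUE),
  group    = rep(c("A", "B"), each = 120),
  question = rep(paste0("q", 1:12), times = 20)
)

## One ANOVA per question (with two groups this reduces to a t-test).
pvals <- sapply(split(dat, dat$question), function(d) {
  summary(aov(rating ~ group, data = d))[[1]][["Pr(>F)"]][1]
})

## Holm correction controls family-wise error across the 12 tests.
p.adjust(pvals, method = "holm")
```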
Since the relationships between questions don't have any substantive meaning, I'm not sure whether that sort of analysis is the right way to go here. For example, if a particular metric (Cronbach's alpha) said the survey was or wasn't consistent, I don't know what that would even mean in this case.

As for the data itself, it looks pretty good. Skew and kurtosis values look fine, and the data appear reasonably normally distributed. There was no discussion between participants, so no correlated error from that source. In graphing and going through the data, I don't see anything that pops out as unusual.

A couple of questions:

1. Should I even be concerned about running measures of survey consistency (Cronbach's alpha or some kind of factor-analysis-related measure) if I'm not particularly interested in the relationships between questions?

2. Should I run something more complex, like a MANOVA, in this case, to try to account for any correlated errors between the questions? Would a Wilks' Lambda score even hold any meaning in a case like this, where the correlations between the questions are essentially incidental? Or maybe I'm barking up the wrong tree completely and should be doing a thorough analysis of internal consistency measures, because they tell me something I'm not quite realizing.

Any hints out there from the R community, perhaps from folks who do more survey analysis than I do?

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville
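P.S. For concreteness, the MANOVA route from question 2 would look roughly like the following; the wide data frame and the `q1`..`q12` column names are hypothetical stand-ins for my actual data.

```r
## Hypothetical wide layout: one row per participant, columns q1..q12
## holding that participant's twelve ratings, plus a group factor.
set.seed(1)
wide <- data.frame(matrix(sample(1:10, 20 * 12, replace = TRUE),
                          nrow = 20,
                          dimnames = list(NULL, paste0("q", 1:12))))
wide$group <- rep(c("A", "B"), each = 10)

## MANOVA across all 12 questions at once; Wilks' Lambda for the group effect.
fit <- manova(as.matrix(wide[, paste0("q", 1:12)]) ~ group, data = wide)
summary(fit, test = "Wilks")

## Internal consistency, if it turns out to be wanted
## (psych package from CRAN):
## psych::alpha(wide[, paste0("q", 1:12)])
```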