Help with survey data

Hello R colleagues,

I hope this is an appropriate place to direct this question. It relates specifically to the comparability of a 5-point Likert scale and a 4-point Likert scale.

One question in my dataset asks: "How much should be done to reduce the gap between rich and poor"
Much more, somewhat more, about the same, somewhat less and much less.

The second question asks: "People who can afford to, should be able to pay for their own health care"
strongly agree, agree, disagree, strongly agree.

Now, assuming that I rescale them so that 1 equals the most egalitarian position and the highest number (4 or 5) equals the least egalitarian position, how can I make these two results comparable?

Two ways come to mind. One is to collapse both into a dichotomous variable and do a logistic regression on both. The danger here is that I have to decide what to do with the middle position in the first question: assign it to the egalitarian or the non-egalitarian category.

A second way would be to multiply the scores in the first question by 4 (to get results that are either 4, 8, 12, 16 or 20) and then multiply the second question by 5 to get responses that are either 5, 10, 15 or 20. My idea is then to add the two, average them and use that value as an index of economic egalitarianism.

Yes / no? Suggestions?

I am an R user and I hope that a purely statistical question is not especially misplaced.

Yours truly,
Simon Kiss

*********************************
Simon J. Kiss, PhD
SSHRC and DAAD Post-Doctoral Fellow
John F. Kennedy Institute of North America Studies
Free University of Berlin
Lansstraße 7-9
14195 Berlin, Germany
Cell: +49 (0)1525-300-2812
Web: http://www.jfki.fu-berlin.de/index.html
Dear Simon,

These two questions already are comparable (using a strict sense of that word). What is your goal in trying to put them both on the same scale? Even if both were measured 1-5, it would be unreasonable to say that a 3 on one question meant the same as a 3 on the other, because they are fundamentally different questions, and their distributions, both in your data and in the population I presume you wish to generalize to, are potentially very different. Additionally, the wording and response options of the two questions are quite different (how much should be done vs. level of agreement).

If your goal is to make some sort of composite variable that you are predicting, what about z-scoring both?

If you do dichotomize, why not use a theory-driven approach? Does someone who answers 'about the same', in whatever country the data were collected, fit your idea or definition of egalitarian? I imagine this could vary quite a bit by country, depending on how much (if anything) was actually being done to lessen the gap between rich and poor.

It seems like your question is really more methodological than statistical (although that may just be me).

Best regards,
Josh

On Thu, Jun 3, 2010 at 6:11 AM, Simon Kiss <sjkiss at gmail.com> wrote:
> [...]

--
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
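Josh's z-scoring suggestion could be sketched in R roughly as follows. The response vectors `q1` and `q2` and the name `egal_index` are made up for illustration, and both items are assumed to be already coded so that 1 is the most egalitarian answer. Note that z-scoring treats the ordinal codes as if they were interval measurements, which is itself an assumption to defend.

```r
# Hypothetical responses, both coded so that 1 = most egalitarian
q1 <- c(1, 2, 3, 5, 4, 2, 1, 3)   # 5-point "reduce the gap" item
q2 <- c(1, 2, 2, 4, 3, 1, 2, 4)   # 4-point "pay for own health care" item

# scale() centres each item at mean 0 and rescales it to SD 1;
# as.numeric() drops the matrix attributes that scale() returns
z1 <- as.numeric(scale(q1))
z2 <- as.numeric(scale(q2))

# Composite index: the mean of the two z-scores for each respondent
egal_index <- rowMeans(cbind(z1, z2))
```

After standardizing, each item contributes equally to the composite regardless of how many response options it had. With missing answers, `rowMeans(..., na.rm = TRUE)` would average over whichever items a respondent did answer.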
On 06/03/2010 11:11 PM, Simon Kiss wrote:
> [...]

Hi Simon,

Strictly speaking, only the second question is a Likert scale, as that assumes a measure of agreement, not some other quantitative dimension. Assuming that the fourth option on Q2 is "Strongly disagree", and you wish to argue that this and the first option on Q1 ("Much more") both represent the maximally egalitarian responses, you could reverse Q2 and scale it to the same range (i.e. 1, 2, 3, 4 to 5, 3.67, 2.33, 1) so that it would have the same weight in an additive composite score. If I were reviewing a paper that suggested this, I would expect a pretty sound defense of the notion that income redistribution and public health care were strongly linked attitudes.

Jim
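The reverse-and-rescale step Jim describes is a linear map, and a minimal R sketch of it might look like this. The function name `rescale_reverse` and its `from`/`to` arguments are hypothetical, not taken from any package.

```r
# Reverse a scale and stretch it linearly onto a new range.
# from = c(min, max) of the original item, to = c(min, max) of the target.
rescale_reverse <- function(x, from = c(1, 4), to = c(1, 5)) {
  to[2] - (x - from[1]) * (to[2] - to[1]) / (from[2] - from[1])
}

# The four Q2 codes map onto the 1-5 range in reverse order
round(rescale_reverse(1:4), 2)
# [1] 5.00 3.67 2.33 1.00
```

With the defaults, the lowest Q2 code becomes 5 and the highest becomes 1, so both items span the same range before being added. Whether the intermediate values 3.67 and 2.33 are meaningful depends on treating the response categories as equally spaced.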