andrewH
2011-Jul-26 01:26 UTC
[R] Package or procedure recommendations for analysis of repeated cross-sections?
I have a survey data set covering 6 years, with about 1500 persons surveyed per year and roughly 200 questions per survey. The samples are drawn independently, without replacement, and are intended to represent the nation (USA). I would like to create something like a synthetic panel: divide the respondents into groups, then see whether year-to-year changes in the group mean of my dependent variable vary with the level of, or the change in, the group mean of my explanatory variable.

The grouping would be based on several factors whose levels denote demographic characteristics such as income, race, and birth cohort. Each group would consist of all respondents who are identical in their levels of all the selected factors, i.e., all the respondents in the sample who share the same race, income level, birth cohort, etc. After being imported from an SPSS data set, these variables are implemented as R factors.

My dependent variables are measures of ideology and party affiliation; the variables that identify the groups are factors known to be correlated with political ideology, for which I wish to control; and my independent variables focus on sources of news and information. My hypothesis is that the change in ideology we have observed over the period for which I have data can be explained in part by changes in how these groups get their information. I'm not sure whether the ideology change should respond to the level of my independent variable or to the change in that level. I intend to test both.

I was about to try to write this from scratch, but it occurred to me that this is a variety of problem for which a nice package probably already exists, and I could probably find it if I knew the right terminology. I am not enough of a statistician to know the conventional name for the procedure of using subgroupings of cross-sections repeated over time as if they were panels.
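The grouping described above can be sketched in base R. This is only a minimal illustration on simulated data, not a recommended estimator: the data frame `svy` and the columns `year`, `race`, `income`, `ideology`, and `news` are hypothetical stand-ins for the real survey variables, and the differencing step assumes every cell is observed in every year.

```r
# Toy data standing in for the survey; every column name here is hypothetical.
set.seed(1)
n <- 600
svy <- data.frame(
  year     = rep(2005:2010, each = n / 6),
  race     = factor(sample(c("a", "b"), n, replace = TRUE)),
  income   = factor(sample(c("low", "high"), n, replace = TRUE)),
  ideology = rnorm(n),  # dependent variable
  news     = rnorm(n)   # explanatory variable
)

# One row per cell (race x income) and year: the cell means.
cells <- aggregate(cbind(ideology, news) ~ race + income + year,
                   data = svy, FUN = mean)

# Cell sizes, kept because means from small cells are noisier.
sizes <- aggregate(ideology ~ race + income + year, data = svy, FUN = length)
names(sizes)[names(sizes) == "ideology"] <- "n_cell"
cells <- merge(cells, sizes, by = c("race", "income", "year"))

# Year-to-year change in each cell mean (NA for a cell's first year).
# Assumes each cell appears in every year, so rows within a cell are
# consecutive years once sorted.
cells <- cells[order(cells$race, cells$income, cells$year), ]
cell_id <- interaction(cells$race, cells$income, drop = TRUE)
lagdiff <- function(x) c(NA, diff(x))
cells$d_ideology <- ave(cells$ideology, cell_id, FUN = lagdiff)
cells$d_news     <- ave(cells$news, cell_id, FUN = lagdiff)

head(cells)
```

With many classifying factors the number of cells grows quickly and some cells become small or empty, so in practice the factor levels may need to be coarsened first.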
Moreover, I suspect that my procedure of dividing a population into groups based on each combination of the classifying variables has a conventional name, that looking at differences or ratios of the group means of a dependent variable and how they respond to the group mean of an independent variable also has a name, and that each has one or more good implementations in R.

Finally, I was thinking of simply regressing changes in the group means of my dependent variable on the group means, or changes in the group means, of my independent variable. But this throws away information that I know is relevant, though I am not sure how best to use it: for example, the groups are of different sizes, so the mean differences or ratios will differ in their variances. I could assume they are normal and do a correction for heteroskedasticity, but if there is a better approach, I'd rather use it.

My apologies if this question is unduly basic. I did two semesters of graduate econometrics once, but that was more than a decade ago, and I fear that, like many with a superficial knowledge of econometrics, I tend to see every research question in terms of OLS or GLM, even if that is not the right model for the problem. Any help or suggestions would be greatly appreciated.

Sincerely, andrewH

-- View this message in context: http://r.789695.n4.nabble.com/Package-or-procedure-recommendations-for-analysis-of-repeated-cross-sections-tp3694587p3694587.html Sent from the R help mailing list archive at Nabble.com.
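One way to use the group sizes mentioned above is weighted least squares on the cell-level changes. The sketch below is an assumption-laden illustration, not an established pseudo-panel estimator: it simulates cell-level data directly, all column names (`n_now`, `n_prev`, `news`, `d_news`, `d_ideology`, `w`) are hypothetical, and it assumes a common within-cell variance. Because the yearly samples are independent, the variance of a difference of two cell means is then proportional to 1/n_now + 1/n_prev, and the weights are the inverse of that quantity.

```r
# Toy cell-level data; names and values are hypothetical.
set.seed(2)
k <- 40
cells <- data.frame(
  n_now  = sample(20:200, k, replace = TRUE),  # cell size in year t
  n_prev = sample(20:200, k, replace = TRUE),  # cell size in year t - 1
  news   = rnorm(k),                           # cell mean of explanatory variable
  d_news = rnorm(k)                            # change in that cell mean
)

# Sampling variance of the change in a cell mean, assuming independent
# yearly samples and a common within-cell variance of 1.
v <- 1 / cells$n_now + 1 / cells$n_prev

# Simulated outcome: change in mean ideology, noisier for small cells.
cells$d_ideology <- 0.5 * cells$d_news + rnorm(k, sd = sqrt(v))

# Inverse-variance weights: large cells count for more.
cells$w <- 1 / v

fit_ols <- lm(d_ideology ~ news + d_news, data = cells)
fit_wls <- lm(d_ideology ~ news + d_news, data = cells, weights = w)

summary(fit_wls)$coefficients
```

Comparing `fit_ols` with `fit_wls` shows how much the weighting matters for a given data set. If the within-cell variance differs across cells, these weights are only approximate.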
andrewH
2011-Jul-26 19:53 UTC
[R] Package or procedure recommendations for analysis of repeated cross-sections?
OK, I've done more research, and I think that what I am looking for is "repeated cross section" or "pseudo-panel" estimators. Does anyone know if these have been implemented in any R package?