Sotiris Adamakis
2010-Dec-21 18:52 UTC
[R] "variable lengths differ (found for '(weights)')" error in Zelig library
Dear R users,

I am trying to estimate the average treatment effect on the treated (ATT), first using the MatchIt software to weight the data set and then the Zelig software, as shown in Ho et al. (2007). See here for an explanation of how to apply this technique in R: http://imai.princeton.edu/research/files/matchit.pdf

I encounter a slight problem when I apply the weights that are produced at the preprocessing stage. The idea is to use MatchIt to preprocess the data and then use Zelig to generate the distribution of the ATT. I believe that the main reason for preprocessing the data is to create weights (depending on the matching technique you use) so that balance is achieved on the matching variables between the treatment and control groups. You then use these weights in the regressions that follow in Zelig.

The authors of the MatchIt article linked above say: "If one chooses options that allow matching with replacement, or any solution that has different numbers of controls (or treateds) within each subclass or strata (such as full matching), then the parametric analysis following matching must accommodate these procedures, such as by using fixed effects or weights, as appropriate.
(Similar procedures can also be used to estimate various other quantities of interest such as the average treatment effect by computing it for all observations, but then one must be aware that the quantity of interest may change during the matching procedure as some control units may be dropped.)"

The following code uses the "lalonde" data set; I get an error message at the end:

> library(Zelig)
> library(MatchIt)
> data(lalonde)
> m.out1 = matchit(treat ~ age + educ + black + hispan + nodegree + married + re74 + re75, method = "subclass", subclass = 6, data = lalonde)
> z.out1 = zelig(re78 ~ age + educ + black + hispan + nodegree + married + re74 + re75, data = match.data(m.out1, "control"), model = "ls", weights = "weights")
> x.out1 = setx(z.out1, data = match.data(m.out1, "treat"), cond = TRUE)
> s.out1 = sim(z.out1, x = x.out1)
Error in model.frame.default(formula = re78 ~ age + educ + black + hispan + :
  variable lengths differ (found for '(weights)')

I was wondering if somebody could tell me how to get around this problem?

Also, I have seen people add the propensity score (the "distance" variable) to the regression fitted with Zelig, i.e.

> z.out1 = zelig(re78 ~ age + educ + black + hispan + nodegree + married + re74 + re75 + distance, data = match.data(m.out1, "control"), model = "ls", weights = "weights")

Does anyone know why this is done?

Kind regards,
Sotiris
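
For readers who hit the same message: model.frame() raises this error whenever the weights vector it receives does not have one entry per row of the data it is building a frame from. The sketch below reproduces the message in base R only, on made-up toy data (the names d, w_bad and fit are hypothetical and not part of the code above). It does not confirm the cause in the Zelig workflow above; one possibility, stated here as an assumption, is that weights belonging to the control subset end up paired with the treated subset.

```r
# Toy data with 10 rows (hypothetical; this is not the lalonde data)
set.seed(1)
d <- data.frame(x = rnorm(10))
d$y <- 2 * d$x + rnorm(10)

# A weights vector of the wrong length: 7 entries for 10 rows
w_bad <- rep(1, 7)

# lm() passes the weights on to model.frame(), which checks the lengths;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(
  lm(y ~ x, data = d, weights = w_bad),
  error = function(e) conditionMessage(e)
)
print(res)

# With one weight per row of the data, the same call goes through
fit <- lm(y ~ x, data = d, weights = rep(1, nrow(d)))
print(coef(fit))
```

A quick check along the same lines is to compare the number of rows of the data set being passed in with the length of the weights vector that accompanies it.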