Renzo Giudice
2017-May-09 21:09 UTC
[R] Estimating cluster standard errors in Diff-in-Diff panel models with plm
Hi,

I want to estimate cluster-robust standard errors for a differences-in-differences panel model with 100 groups, 6,156 individuals and 15 years. Some of the individuals are repeated (4,201 are unique) because they are part of a matched sample obtained with a one-to-one matching method with replacement.

I have been using plm to estimate the model coefficients, after transforming my matched sample into a pdata.frame with individuals and years as the indexes. I have also been able to estimate standard errors clustered at the individual level with the vcovHC function. However, these individuals are nested within the groups, so I want to cluster at this higher level of aggregation rather than at the individual level.

Unfortunately, it is not clear to me how to proceed. If I replace individuals with groups in the index, I get repeated row.names and then I can't estimate the panel model with plm. I get the following error message:

Error in `row.names<-.data.frame`(`*tmp*`, value = c("1-1", "1-1", "1-1", :
  duplicate 'row.names' are not allowed

For simplicity, I illustrate my case with the following example (copied from: http://www.richard-bluhm.com/clustered-ses-in-r-and-stata-2/):

# load packages
require(plm)
require(lmtest)

# get data and load as pdata.frame
url <- "http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.txt"
p.df <- read.table(url)
names(p.df) <- c("firmid", "year", "x", "y")

# introduce group (State) id
p.df$State <- rep(1:100, each = 50)
p.df2 <- pdata.frame(p.df, index = c("State", "year"), drop.index = FALSE, row.names = TRUE)

# fit model with plm
pm1 <- plm(y ~ x, data = p.df2, model = "within")  # this is where the error occurs

So is there any way I could cluster the standard errors at the group level using plm?

Any other comments would be highly appreciated.

Thanks in advance!

Renzo
Center for Development Research
University of Bonn