Renzo Giudice
2017-May-09 21:09 UTC
[R] Estimating cluster standard errors in Diff-in-Diff panel models with plm
Hi,
I want to estimate the cluster SE of a differences-in-differences
panel model with 100 groups, 6,156 individuals and 15 years. Some of
the individuals are repeated (4,201 unique) because they are part of a
matched sample obtained with a one-to-one, with replacement, matching
method.
I have been using plm to estimate the model coefficients, after
transforming my matched sample into a pdata.frame with individuals
and years as indexes. I have also been able to estimate the clustered
standard errors at the individual level with the vcovHC function.
However, these individuals are clustered within the groups, and
therefore I want to cluster at this higher level of aggregation rather
than at the individual level. Unfortunately, it is not clear to me how
to proceed. Of course, if I replace individuals with groups in the
index, I get repeated row.names and then I can't estimate the panel
model with plm. I get the following error message:
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1-1", "1-1", "1-1", :
  duplicate 'row.names' are not allowed
For simplicity, I make my case using the following example (copied
from: http://www.richard-bluhm.com/clustered-ses-in-r-and-stata-2/):
# load packages
require(plm)
require(lmtest)
# get data and load as pdata.frame
url <- "http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.txt"
p.df <- read.table(url)
names(p.df) <- c("firmid", "year", "x", "y")
# introduce group (State) id
p.df$State <- rep(1:100, each=50)
p.df2 <- pdata.frame(p.df, index = c("State", "year"),
                     drop.index = FALSE, row.names = TRUE)
# fit model with plm
pm1 <- plm(y ~ x, data = p.df2, model = "within")  # this is where the error occurs
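The duplicate index can be seen without plm. The following is only a minimal sketch in base R: the toy data frame is an assumption that mimics the layout of the example data (500 firms observed over 10 years, sorted by firm), and the names toy, n_dup_firm, n_dup_state and rows_per_cell are made up for illustration.

```r
# Toy panel (assumed layout, not the real data): 500 firms x 10 years
toy <- data.frame(
  firmid = rep(1:500, each = 10),
  year   = rep(1:10, times = 500)
)

# Same construction as in the example: 50 consecutive rows (5 firms) per State
toy$State <- rep(1:100, each = 50)

# (firmid, year) identifies every row uniquely ...
n_dup_firm <- sum(duplicated(toy[, c("firmid", "year")]))

# ... but (State, year) does not, because several firms share a State
n_dup_state <- sum(duplicated(toy[, c("State", "year")]))

# Number of rows in each State-year cell
rows_per_cell <- table(toy$State, toy$year)

print(n_dup_firm)
print(n_dup_state)
print(range(rows_per_cell))
```

In this toy layout every State-year cell holds five rows, so a State-year index cannot label the rows uniquely, which seems to be what the row.names error is complaining about.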
So is there any way I could cluster SE at the group level using plm?
Any other comments would be highly appreciated.
Thanks in advance!
Renzo
Center for Development Research
University of Bonn