Christian Schoder
2010-Oct-01  21:53 UTC
[R] plm: lag() and diff() do not (always) recognize a gap in the time dimension
Hello!

I came across a strange behavior of the plm package. When using an unbalanced panel with years as the time index, lag() and diff() do not recognize a break in the time dimension. They do, however, if there is only one more year after the break. This is strange, right? Consider the following example:

> library(plm)
Loading required package: kinship
Loading required package: survival
Loading required package: splines
Loading required package: nlme
Loading required package: lattice
[1] "kinship is loaded"
Loading required package: Formula
Loading required package: MASS
Loading required package: sandwich
Loading required package: zoo
> dat.raw <- data.frame(id=c("b","b","b", "c","c","c", "d","d","d","d","d","d","d"),
+                       t=c(1980,1981,1983,1982,1983,1985,1984,1985,1988,1989,1993,1994,1995),
+                       y=c(1,2,3,2,4,5,6,5,6,7,-2,1,3))
> dat.raw <- pdata.frame(dat.raw, index=c("id", "t"), drop.index=FALSE)
> dat.raw$l1 <- lag(dat.raw$y, 1)
> dat.raw$l2 <- diff(dat.raw$y, 1)
> dat.raw
       id    t  y l1 l2
b-1980  b 1980  1 NA NA
b-1981  b 1981  2  1  1
b-1983  b 1983  3 NA NA
c-1982  c 1982  2 NA NA
c-1983  c 1983  4  2  2
c-1985  c 1985  5 NA NA
d-1984  d 1984  6 NA NA
d-1985  d 1985  5  6 -1
d-1988  d 1988  6  5  1
d-1989  d 1989  7  6  1
d-1993  d 1993 -2  7 -9
d-1994  d 1994  1 -2  3
d-1995  d 1995  3  1  2

Why are l1 and l2 NA in the third and sixth rows, while row 11 does not seem to recognize the gap?

Would you have a solution to this problem for me? My dataset is huge and I cannot edit it manually.

Thank you!
Christian
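
For reference, the result I would expect (NA wherever the previous year is missing for that id) can be sketched in base R without any plm calls. This is only a minimal illustration under my own assumptions: the names key, key_prev, prev_row and d1 are made up for this example, and it assumes t is an integer-valued year with at most one row per id and year.

```r
# Gap-aware lag and first difference in base R (no plm functions used).
dat <- data.frame(
  id = c("b","b","b", "c","c","c", "d","d","d","d","d","d","d"),
  t  = c(1980,1981,1983,1982,1983,1985,1984,1985,1988,1989,1993,1994,1995),
  y  = c(1,2,3,2,4,5,6,5,6,7,-2,1,3)
)

# For each row, find the row of the same id observed exactly one year earlier.
key      <- paste(dat$id, dat$t)       # identifies (id, year) of each row
key_prev <- paste(dat$id, dat$t - 1)   # identifies (id, year - 1)
prev_row <- match(key_prev, key)       # NA when year - 1 is absent for that id

dat$l1 <- dat$y[prev_row]              # lag: NA across a gap
dat$d1 <- dat$y - dat$l1               # first difference: NA across a gap
print(dat)
```

With this, the rows for d-1988 and d-1993 get NA in both columns, as do b-1983 and c-1985, since none of them has an observation for the preceding year.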