Cecilia Carmo
2011-Sep-27 15:01 UTC
[R] Keep consecutive year observations (remove gaps) in panel data (dataframes). Difficulties in using lag(). Package plm.
Hi everyone.

I have two questions. I've found some other questions and answers similar to these, but they didn't solve my problem.

I'm working with a panel of firm/year observations (see my reproducible example). I'm using the plm package. My panel is not only unbalanced but also has some gaps in years.

#reproducible example
data1<-data.frame(year=c(2001,2002,2003,2004,2005,2001,2002,2004,2005,2001,2002,2003,2005),
firm=c(1,1,1,1,1,2,2,2,2,3,3,3,3),
x=c(11,22,32,25,26,47,85,98,101,14,87,56,14))
data1

#load package plm and format data
data2<-plm.data(data1,index=c("firm","year"))

First, I want to keep for each firm the longest series of consecutive years. So I want a dataframe like this (keeping years 2001 and 2002 in firm 2):

   year firm   x
1  2001    1  11
2  2002    1  22
3  2003    1  32
4  2004    1  25
5  2005    1  26
6  2001    2  47
7  2002    2  85
8  2001    3  14
9  2002    3  87
10 2003    3  56

Or like this (keeping years 2004 and 2005 in firm 2):

   year firm   x
1  2001    1  11
2  2002    1  22
3  2003    1  32
4  2004    1  25
5  2005    1  26
6  2004    2  98
7  2005    2 101
8  2001    3  14
9  2002    3  87
10 2003    3  56

Second, I need to create a new variable that is the lagged value of x. I've done

newdata1<-transform(data1,y=lag(x,1))

But it doesn't work.

I also need to create a new variable that is the opposite of lag(). I've done

newdata2<-transform(data1,z=lag(x,-1))

But, of course, it doesn't work either.

Thank you for all your help.

Cecília Carmo
(Universidade de Aveiro – Portugal)
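One possible way to approach both questions is sketched below in base R only, without plm. It is an illustration under stated assumptions, not the method of any reply in this thread. The helper keep_longest_run and the column names x_lag and x_lead are hypothetical names introduced here. Ties between equally long runs are assumed to go to the earliest run, which reproduces the first of the two desired tables (firm 2 keeps 2001 and 2002). The lag and lead are matched on the calendar year, so a missing year yields NA rather than the value from the previous row. As background, stats::lag() applied to a plain numeric vector only shifts the time attribute of the resulting series and leaves the values where they are, which would explain why the transform() calls return x unchanged.

```r
# Reproducible data from the post
data1 <- data.frame(
  year = c(2001, 2002, 2003, 2004, 2005, 2001, 2002, 2004, 2005,
           2001, 2002, 2003, 2005),
  firm = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  x    = c(11, 22, 32, 25, 26, 47, 85, 98, 101, 14, 87, 56, 14)
)

# Sort by firm and year so that consecutive years sit in adjacent rows
data1 <- data1[order(data1$firm, data1$year), ]

# Hypothetical helper: keep the longest run of consecutive years for one firm.
# Assumption: ties go to the earliest run (which.max returns the first maximum).
keep_longest_run <- function(d) {
  run_id  <- cumsum(c(TRUE, diff(d$year) != 1))  # new run whenever the step is not 1
  longest <- which.max(tabulate(run_id))         # id of the longest run
  d[run_id == longest, , drop = FALSE]
}

# Question 1: apply the helper firm by firm and stack the results
consecutive <- do.call(rbind, lapply(split(data1, data1$firm), keep_longest_run))
rownames(consecutive) <- NULL
consecutive

# Question 2: lag and lead of x within each firm, matched on year.
# A firm/year combination that does not exist gives NA.
key <- paste(data1$firm, data1$year)
data1$x_lag  <- data1$x[match(paste(data1$firm, data1$year - 1), key)]
data1$x_lead <- data1$x[match(paste(data1$firm, data1$year + 1), key)]
data1
```

To prefer the latest run on ties instead (firm 2 keeping 2004 and 2005, as in the second desired table), one option would be to pick the last maximum, for example with max(which(tabulate(run_id) == max(tabulate(run_id)))) in place of which.max(tabulate(run_id)).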