Cecilia Carmo
2011-Sep-27 15:01 UTC
[R] Keep consecutive year observations (remove gap's) in panel data (dataframes). Difficulties in using lag(). Package plm.
Hi everyone.
I have two questions. I’ve found some other questions and answers similar to
these but they didn’t solve my problem.
I’m working with a panel of firm/year observations (see my reproducible
example). I’m using the plm package.
My panel is not only unbalanced but also has some gaps in years.
#reproducible example
data1<-data.frame(year=c(2001,2002,2003,2004,2005,2001,2002,2004,2005,2001,2002,2003,2005),
firm=c(1,1,1,1,1,2,2,2,2,3,3,3,3),
x=c(11,22,32,25,26,47,85,98,101,14,87,56,14))
data1
#load package plm and format data
library(plm)
data2<-plm.data(data1,index=c("firm","year"))
First, I want to keep for each firm the longest series of consecutive years.
So I want a dataframe like this (keeping years 2001 and 2002 in firm 2)
year firm x
1 2001 1 11
2 2002 1 22
3 2003 1 32
4 2004 1 25
5 2005 1 26
6 2001 2 47
7 2002 2 85
8 2001 3 14
9 2002 3 87
10 2003 3 56
Or like this (keeping years 2004 and 2005 in firm 2)
year firm x
1 2001 1 11
2 2002 1 22
3 2003 1 32
4 2004 1 25
5 2005 1 26
6 2004 2 98
7 2005 2 101
8 2001 3 14
9 2002 3 87
10 2003 3 56
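
One possible way to get the first of these two results, sketched in base R only. The helper name keep_longest_run() is hypothetical (it is not part of plm), and the sketch assumes each firm has at most one row per year. A new run is started whenever a year does not directly follow the previous one; on ties the earliest run is kept, which for firm 2 means 2001 and 2002.

```r
data1 <- data.frame(
  year = c(2001, 2002, 2003, 2004, 2005, 2001, 2002, 2004, 2005,
           2001, 2002, 2003, 2005),
  firm = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  x    = c(11, 22, 32, 25, 26, 47, 85, 98, 101, 14, 87, 56, 14)
)

# Hypothetical helper: keep the longest run of consecutive years
# within the rows of a single firm.
keep_longest_run <- function(d) {
  d <- d[order(d$year), ]
  # A new run starts whenever the year is not previous year + 1.
  run_id <- cumsum(c(TRUE, diff(d$year) != 1))
  run_len <- table(run_id)
  # which.max() returns the first maximum, so ties go to the earliest run.
  best <- as.integer(names(run_len)[which.max(run_len)])
  d[run_id == best, ]
}

# Apply the helper firm by firm and stack the pieces again.
longest <- do.call(rbind, lapply(split(data1, data1$firm), keep_longest_run))
rownames(longest) <- NULL
longest
```

To prefer the latest run on ties instead (2004 and 2005 for firm 2), the which.max() line could be replaced by a lookup of the last maximum, for example with max(which(run_len == max(run_len))).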
Second, I need to create a new variable that is the lagged value of x.
I've done
newdata1<-transform(data1,y=lag(x,1))
But it doesn't work.
I also need to create a new variable that is the opposite of lag().
I've done
newdata2<-transform(data1,z=lag(x,-1))
But, of course, it doesn't work either.
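
A likely reason both attempts fail: on a plain vector, base R's lag() only shifts the time-series attribute and leaves the values where they are, so y and z end up equal to x. A minimal base R sketch of a per-firm lag and lead follows. The helper shift_by_year() is hypothetical, and it assumes that a missing neighbouring year should give NA rather than the value from across the gap.

```r
data1 <- data.frame(
  year = c(2001, 2002, 2003, 2004, 2005, 2001, 2002, 2004, 2005,
           2001, 2002, 2003, 2005),
  firm = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  x    = c(11, 22, 32, 25, 26, 47, 85, 98, 101, 14, 87, 56, 14)
)

# Hypothetical helper: for each row, fetch x of the same firm in year + k.
# Returns NA when that firm/year combination is not in the data.
shift_by_year <- function(d, k) {
  key    <- paste(d$firm, d$year)      # identifies each existing row
  wanted <- paste(d$firm, d$year + k)  # the row whose x we want
  d$x[match(wanted, key)]
}

data1$y <- shift_by_year(data1, -1)  # x from the previous year (lag)
data1$z <- shift_by_year(data1,  1)  # x from the next year (lead)
data1
```

Matching on firm and year, instead of shifting rows by position, keeps values from leaking across firms or across gaps: for firm 2 in 2004, y is NA because 2003 is missing.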
Thank you for all your help.
Cecília Carmo
(Universidade de Aveiro – Portugal)