I have a dataframe with many firm-year observations and many variables.
Not all firms have information for all the years.
I want another dataframe with only those firms that have information for all years.
That is, I want balanced panel data, but with the maximum number of years.
In my reproducible example I want to keep firms 1, 2 and 3 (period 2000 to 2004).
I need your help to write code for this.
Thank you very much,
Cecília Carmo
(Universidade de Aveiro)
#My reproducible example:
firm<-sort(rep(1:3,5),decreasing=FALSE)
year<-rep(2000:2004,3)
X<-rnorm(15)
data1<-data.frame(firm,year,X)
data1
firm<-sort(rep(4:6,3),decreasing=FALSE)
year<-rep(2001:2003,3)
X<-rnorm(9)
data2<-data.frame(firm,year,X)
data2
finaldata<-rbind(data1,data2)
finaldata
# If you know how many years are needed, you could do this:
makenewtable <- function(x, years) {
  # Split the data into one data frame per firm
  xlist <- split(x, x$firm)
  # TRUE for firms that have exactly `years` distinct years
  keep <- sapply(xlist, function(z) length(unique(z$year)) == years)
  # Stack the complete firms back into a single data frame
  dat_ <- do.call(rbind, xlist[keep])
  return(dat_)
}
makenewtable(finaldata, 5)
Scott
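If the number of years is not known in advance, it can be derived from the data. The following is only a sketch, not part of the reply above: balance_panel is a hypothetical helper name, and it assumes that "the maximum number of years" means every distinct year that occurs anywhere in the data. If no firm covers all of those years, it returns an empty data frame.

```r
# Rebuild the example data from the original post
finaldata <- rbind(
  data.frame(firm = rep(1:3, each = 5), year = rep(2000:2004, 3), X = rnorm(15)),
  data.frame(firm = rep(4:6, each = 3), year = rep(2001:2003, 3), X = rnorm(9))
)

# Hypothetical helper (not from the thread): keep only the firms that are
# observed in every year occurring anywhere in the data
balance_panel <- function(x) {
  all_years <- unique(x$year)
  # Number of distinct years per firm; the names are the firm ids
  n_years <- tapply(x$year, x$firm, function(y) length(unique(y)))
  complete <- names(n_years)[n_years == length(all_years)]
  x[as.character(x$firm) %in% complete, , drop = FALSE]
}

balanced <- balance_panel(finaldata)
table(balanced$firm)  # number of rows per retained firm
```

Because each firm's years are a subset of all years in the data, comparing the counts is enough to confirm full coverage.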
On Thursday, May 19, 2011 at 6:24 AM, Cecilia Carmo wrote:
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
It works!
Thank you.
Cecília
From: Scott Chamberlain [mailto:scttchamberlain4@gmail.com]
Sent: Thursday, May 19, 2011 13:40
To: Cecilia Carmo
Cc: r-help@r-project.org
Subject: Re: [R] balanced panel data