Hi everyone,

my data is in a dataframe similar to this, but with more firms, more industries, more years, and variables that correspond to financial information:

> firm<-c(rep(1,4),rep(2,4),rep(3,4),rep(4,4))
> year<-c(rep(2000:2003,4))
> industry<-c(rep(10,4),rep(20,4),rep(10,4),rep(30,4))
> X1<-c(10,14,18,16,20,45,23,54,24,67,98,58,76,34,23,89)
> X2<-c(11,46,89,36,72,78,55,44,22,78,53,25,12,45,87,23)
> Y<-c(12,45,32,69,87,54,33,22,89,66,35,23,15,54,67,87)
> data<-data.frame(firm, industry,year,Y,X1,X2)
> data
 firm industry year  Y X1 X2
    1       10 2000 12 10 11
    1       10 2001 45 14 46
    1       10 2002 32 18 89
    1       10 2003 69 16 36
    2       20 2000 87 20 72
    2       20 2001 54 45 78
    2       20 2002 33 23 55
    2       20 2003 22 54 44
    3       10 2000 89 24 22
    3       10 2001 66 67 78
    3       10 2002 35 98 53
    3       10 2003 23 58 25
    4       30 2000 15 76 12
    4       30 2001 54 34 45
    4       30 2002 67 23 87
    4       30 2003 87 89 23

I need to obtain the coefficients and the statistics by year and by industry for the model Y = b1 + b2*X1 + b3*X2, which as an lm formula is:

> ff<-Y~X1+X2

What I've done so far: I subset the dataframe by year, so I have one dataframe per year (dataframe2000, dataframe2001, ...), and then I applied a function that I found in the R-help archives:

> coef2000<-as.data.frame(t(sapply(split(dataframe2000,dataframe2000$industry),function(x){coef(lm(ff,data=x))})))

I need help in two ways:
First: I'd like the dataframe of coefficients to also contain the statistics that help me judge the significance of the coefficients; and
Second: Is it possible to obtain this output for all years at once?

Thanks in advance,
Cecília Carmo (Universidade de Aveiro - Portugal)
Gabor Grothendieck
2009-Jun-15 13:34 UTC
[R] how to obtain lm statistics for multiple subsets
Check out: http://tolstoy.newcastle.edu.au/R/e2/help/07/05/17714.html

On Mon, Jun 15, 2009 at 9:26 AM, Cecilia Carmo <cecilia.carmo at ua.pt> wrote:
> I need help in two ways:
> First: I'd like to have in the dataframe of the coefficients more statistics
> information that helps me to understand the significance of the
> coefficients; and
> Second: Is it possible to obtain this output for all years at once?
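
For reference, here is a minimal base-R sketch of one way to get both things asked for in the question: the full coefficient table (estimate, standard error, t value, p-value) taken from summary(lm(...))$coefficients, and all groups in one pass by splitting on several columns at once. It is not necessarily the approach described in the linked message. The helper name lm_stats_by and the data frame name dat are made up for illustration, and the formula Y ~ X1 + X2 is an assumed reading of the model in the post. Groups with no residual degrees of freedom are skipped, because standard errors cannot be estimated for them. In the toy data that applies to every year-by-industry cell (at most two firms each), so the demonstration call groups by industry only.

```r
# Toy data copied from the original post (named `dat` here for illustration).
firm     <- rep(1:4, each = 4)
year     <- rep(2000:2003, times = 4)
industry <- rep(c(10, 20, 10, 30), each = 4)
X1 <- c(10, 14, 18, 16, 20, 45, 23, 54, 24, 67, 98, 58, 76, 34, 23, 89)
X2 <- c(11, 46, 89, 36, 72, 78, 55, 44, 22, 78, 53, 25, 12, 45, 87, 23)
Y  <- c(12, 45, 32, 69, 87, 54, 33, 22, 89, 66, 35, 23, 15, 54, 67, 87)
dat <- data.frame(firm, industry, year, Y, X1, X2)

# Assumption: the model in the post is Y on X1 and X2 (intercept implied).
ff <- Y ~ X1 + X2

# Hypothetical helper: fit `formula` separately within each combination of
# the columns named in `by`, and stack the coefficient tables row-wise.
lm_stats_by <- function(data, formula, by) {
  n_coef <- ncol(model.matrix(formula, data))   # coefficients to estimate
  groups <- split(data, interaction(data[by], drop = TRUE))
  tabs <- lapply(groups, function(d) {
    # Skip groups with no residual degrees of freedom.
    if (nrow(d) <= n_coef) return(NULL)
    ct <- summary(lm(formula, data = d))$coefficients
    # Repeat the group keys once per coefficient row.
    data.frame(d[rep(1, nrow(ct)), by, drop = FALSE],
               term = rownames(ct), ct,
               check.names = FALSE, row.names = NULL)
  })
  tabs <- Filter(Negate(is.null), tabs)
  if (length(tabs) == 0) return(NULL)           # nothing could be fitted
  do.call(rbind, unname(tabs))
}

# One row per industry and coefficient, with Estimate, Std. Error,
# t value and Pr(>|t|) columns.
by_industry <- lm_stats_by(dat, ff, "industry")
print(by_industry)

# With the full data set, the same call with two keys covers all years at once.
# On the toy data this returns NULL because every cell is too small.
by_year_industry <- lm_stats_by(dat, ff, c("year", "industry"))
```

With enough firms per year and industry, the second call returns one block of rows per (year, industry) pair, so no per-year dataframes are needed. Additional statistics such as R-squared could be pulled from the same summary object in the same way.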