Janka Vanschoenwinkel
2015-Jan-23 09:46 UTC
[R] Testing for significant differences between groups in multiple linear regression
Dear R-colleagues,

I am looking for a way to test whether one regression has significantly different coefficients and overall results for 10 groups (grouping variable is "irr").

*What I have*

The regression is:

Depend = temp + temp^2 + perc + perc^2 + conti   (split up for multiple groups of irr)

*Dataset = Alldata (real dataset has over 50000 IDs; irr is the grouping variable)*

ID  irr  temp  perc  conti  Depend    w
 1    1    10    34     26       8   23
 2    1    11    36     27       6   58
 3    1    26    57     45       3   76
 4    2    23    68     24       2    4
 5    2     6    26      8       1  323
 6    2     3    17     56       6   45
 7    3    17    39     17       5   57

I can obtain the regression coefficients for the different groups with the following code (other code is possible as well).

    # One weighted fit per level of irr.
    # I() is needed so that ^2 means squaring inside a formula.
    datairrigation <- split(Alldata, Alldata$irr)
    model.per.irrigation <- lapply(datairrigation, function(x) {
      lm(Depend ~ temp + I(temp^2) + perc + I(perc^2) + conti,
         weights = w, data = x)
    })

OR I can do it manually by splitting all the data into subsets (and then I also receive the R^2).

*What I don't have*

However, I don't know how to compare those regressions to test whether they differ significantly across the groups. (Preferably, I would like to test the coefficients individually (temp(group 1) = temp(group 2)) and the regression as a whole between the groups.)

*Note*

I know that one way to test for significant differences between groups is to include dummy variables for the group in the regression. Yet this is not an option for my model, because it only allows exogenous variables in the regression (and irrigation is an endogenous variable, because the farmer can decide himself whether he irrigates or not).

Thank you very much in advance! I really appreciate your help!

Janka

[[alternative HTML version deleted]]
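The two comparisons asked for above can be sketched directly from the per-group fits. This is a minimal illustration, not an answer given in the thread. It uses simulated data in place of Alldata, a hypothetical helper named compare_coef, a large-sample Wald z-test for a single coefficient, and a Chow-type F-test built from residual sums of squares. Both tests assume independent groups, and the F-test assumes a common error variance. The F statistic is numerically the one a model with full group interactions would give, so it does not address the endogeneity concern raised in the note.

```r
# Simulated stand-in for Alldata (hypothetical values; column names follow the post).
set.seed(1)
n <- 300
Alldata <- data.frame(
  irr   = rep(1:3, each = n / 3),
  temp  = runif(n, 0, 30),
  perc  = runif(n, 10, 70),
  conti = runif(n, 5, 60),
  w     = runif(n, 1, 100)
)
# Group 3 gets a different temp slope so that there is something to detect.
slope <- c(0.5, 0.5, 1.5)[Alldata$irr]
Alldata$Depend <- 2 + slope * Alldata$temp - 0.01 * Alldata$temp^2 +
  0.1 * Alldata$perc - 0.001 * Alldata$perc^2 + 0.05 * Alldata$conti +
  rnorm(n) / sqrt(Alldata$w)   # error variance proportional to 1/w

form <- Depend ~ temp + I(temp^2) + perc + I(perc^2) + conti

# Per-group weighted fits, as in the post.
datairrigation <- split(Alldata, Alldata$irr)
model.per.irrigation <- lapply(datairrigation, function(x) {
  lm(form, weights = w, data = x)
})

# Wald z-test for equality of one coefficient between two independent fits.
# (compare_coef is a hypothetical helper, not a function from any package.)
compare_coef <- function(m1, m2, term) {
  c1 <- coef(summary(m1))[term, ]
  c2 <- coef(summary(m2))[term, ]
  d  <- c1[["Estimate"]] - c2[["Estimate"]]
  se <- sqrt(c1[["Std. Error"]]^2 + c2[["Std. Error"]]^2)
  z  <- d / se
  c(diff = d, se = se, z = z, p = 2 * pnorm(-abs(z)))
}

# temp(group 1) = temp(group 3)?
res13 <- compare_coef(model.per.irrigation[["1"]],
                      model.per.irrigation[["3"]], "temp")
print(res13)

# Chow-type F-test: one pooled fit against the set of separate fits.
pooled     <- lm(form, weights = w, data = Alldata)
rss_pooled <- deviance(pooled)                            # weighted RSS
rss_split  <- sum(sapply(model.per.irrigation, deviance))
k   <- length(coef(pooled))          # coefficients per model, incl. intercept
G   <- length(model.per.irrigation)  # number of groups
df1 <- (G - 1) * k
df2 <- nrow(Alldata) - G * k
F_stat    <- ((rss_pooled - rss_split) / df1) / (rss_split / df2)
p_overall <- pf(F_stat, df1, df2, lower.tail = FALSE)
print(c(F = F_stat, df1 = df1, df2 = df2, p = p_overall))
```

With 10 groups there are 45 pairwise comparisons per coefficient, so the p-values from compare_coef would need a multiplicity adjustment, for example with p.adjust.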
Bert Gunter
2015-Jan-23 17:43 UTC
[R] Testing for significant differences between groups in multiple linear regression
Look no further! The answer is yes.

However, if you are interested in why your query is probably nonsense and why overall tests of significance are a **really bad idea** in most scientific contexts (imho, anyway), then I suggest you post to a statistical list like stats.stackexchange.com.

... oh, and while you're at it, please read the posting guide for this list (see link below) and, in particular, DO NOT POST IN HTML, which, as you can see here, often becomes a mess on this **plain text** mailing list.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
Clifford Stoll

On Fri, Jan 23, 2015 at 1:46 AM, Janka Vanschoenwinkel <janka.vanschoenwinkel at uhasselt.be> wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.