Matthew Carroll
2010-Apr-12 09:15 UTC
[R] Interpreting factor*numeric interaction coefficients
Dear all,

I am a relative novice with R, so please forgive any terrible errors...

I am working with a GLM that describes a response variable as a function of a categorical variable with three levels and a continuous variable. These two predictor variables are believed to interact. An example of such a model follows at the bottom of this message, but here is a section of its summary table:

            Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.220186   0.539475   2.262   0.0237 *
var1         0.028182   0.050850   0.554   0.5794
cat2        -0.112454   0.781137  -0.144   0.8855
cat3         0.339589   0.672828   0.505   0.6138
var1:cat2    0.007091   0.068072   0.104   0.9170
var1:cat3   -0.027248   0.064468  -0.423   0.6725

I am having trouble interpreting this output. I think I understand that:

# the 'var1' value refers to the slope of the relationship within the first factor level

# the 'cat2' and 'cat3' values refer to the difference in intercept from 'cat1'

# the interaction terms describe the difference in slope between the relationship in 'cat1' and that in 'cat2' and 'cat3' respectively

Therefore, if I wanted a single value to describe the slope in either cat2 or cat3, I would sum the interaction value with that of var1.

However, if I wanted to report a standard error for the slope in 'cat2', how would I go about doing this? Is the reported standard error that for the overall slope for that factor level, or is the actual standard error a function of the standard error of var1 and that of the interaction?

Any help with this would be much appreciated,

Matthew Carroll

### example code

resp <- rpois(30, 5)
cat <- factor(rep(c(1:3), 10))
var1 <- rnorm(30, 10, 3)

mod <- glm(resp ~ var1 * cat, family="poisson")
summary(mod)

Call:
glm(formula = resp ~ var1 * cat, family = "poisson")

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.80269  -0.54107  -0.06169   0.51819   1.58169

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.220186   0.539475   2.262   0.0237 *
var1         0.028182   0.050850   0.554   0.5794
cat2        -0.112454   0.781137  -0.144   0.8855
cat3         0.339589   0.672828   0.505   0.6138
var1:cat2    0.007091   0.068072   0.104   0.9170
var1:cat3   -0.027248   0.064468  -0.423   0.6725
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 23.222  on 29  degrees of freedom
Residual deviance: 22.192  on 24  degrees of freedom
AIC: 133.75

Number of Fisher Scoring iterations: 5

--
Matthew Carroll
E-mail: mjc510 at york.ac.uk
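The summing step described in the question can be written out as a small sketch. It uses only the estimates from the summary table above; the object names `b` and `slopes` are illustrative, and the slopes are on the log scale because the model is a Poisson GLM with its default log link.

```r
# Estimates copied from the summary table in the post
b <- c("var1" = 0.028182, "var1:cat2" = 0.007091, "var1:cat3" = -0.027248)

# Slope of var1 within each level of cat:
# the reference level uses var1 alone, the others add their interaction term
slopes <- c(cat1 = b[["var1"]],
            cat2 = b[["var1"]] + b[["var1:cat2"]],
            cat3 = b[["var1"]] + b[["var1:cat3"]])
print(slopes)
```

This gives the point estimates only. The standard errors printed next to `var1:cat2` and `var1:cat3` belong to the differences in slope, not to the summed slopes.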
ONKELINX, Thierry
2010-Apr-12 09:47 UTC
[R] Interpreting factor*numeric interaction coefficients
Dear Matthew,

The easiest way to get the estimates (and their standard errors) for the different slopes is to reparametrise your model. Use

resp ~ var1 : cat + 0

instead of

resp ~ var1 * cat

HTH,

Thierry

ir. Thierry Onkelinx
Research Institute for Nature and Forest (Instituut voor natuur- en bosonderzoek, INBO)
team Biometrics & Quality Assurance
Thierry.Onkelinx at inbo.be
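A minimal sketch of the reparametrisation, assuming simulated data like that in the original post. The seed, the data frame `d` and the model names are illustrative and not from the thread. One caveat, as far as I can tell: `var1:cat + 0` on its own drops the per-level intercepts as well, so it is not the same fit as `var1 * cat`. The third model below is an assumed variant that adds `cat` back so each level keeps its own intercept.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
d <- data.frame(resp = rpois(30, 5),
                cat  = factor(rep(1:3, 10)),
                var1 = rnorm(30, 10, 3))

# Original parametrisation: reference slope plus slope differences
mod_full   <- glm(resp ~ var1 * cat, family = "poisson", data = d)

# Formula as given in the reply: three slopes, no intercepts at all
mod_noint  <- glm(resp ~ var1:cat + 0, family = "poisson", data = d)

# Assumed variant: one intercept and one slope per level of cat
mod_slopes <- glm(resp ~ cat + var1:cat + 0, family = "poisson", data = d)

length(coef(mod_noint))   # number of coefficients without intercepts
length(coef(mod_slopes))  # number of coefficients with per-level intercepts

# Same fit as the original model, only expressed differently
all.equal(deviance(mod_full), deviance(mod_slopes))

# Each var1 row is now a per-level slope with its own standard error
coef(summary(mod_slopes))
```

In the `mod_slopes` table the rows containing `var1` can be reported directly as the slope and standard error for each factor level.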
Peter Ehlers
2010-Apr-12 10:37 UTC
[R] Interpreting factor*numeric interaction coefficients
On 2010-04-12 3:15, Matthew Carroll wrote:
> However, if I wanted to report a standard error for the slope in 'cat2', how
> would I go about doing this? Is the reported standard error that for the
> overall slope for that factor level, or is the actual standard error a
> function of the standard error of var1 and that of the interaction?

You can relevel your factor variable:

  mod <- glm(resp ~ var1 * relevel(cat, ref=2), family="poisson")

Or, to do this for all levels, you can specify the model as:

  mod <- glm(resp ~ cat/var1 + 0, family="poisson")

which will give the regressions resp ~ var1 within each level of 'cat'.
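Both suggestions can be tried side by side. This is a sketch under the assumption of simulated data like that in the original post; the seed, the data frame `d` and the names `mod_ref2` and `mod_nested` are illustrative.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
d <- data.frame(resp = rpois(30, 5),
                cat  = factor(rep(1:3, 10)),
                var1 = rnorm(30, 10, 3))

# Option 1: make level 2 the reference level,
# so the 'var1' row is the slope (and SE) within cat2
mod_ref2 <- glm(resp ~ var1 * relevel(cat, ref = 2),
                family = "poisson", data = d)
coef(summary(mod_ref2))["var1", c("Estimate", "Std. Error")]

# Option 2: nested formula, one intercept and one slope per level of cat
mod_nested <- glm(resp ~ cat/var1 + 0, family = "poisson", data = d)
coef(summary(mod_nested))
```

The `cat2:var1` row of the nested fit should agree with the `var1` row of the releveled fit, up to the convergence tolerance of the fitting algorithm.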
Or you can calculate the SE from the covariance matrix given by summary(mod)$cov.unscaled, using the formula for the variance of a linear combination of random variables.

-Peter Ehlers

--
Peter Ehlers
University of Calgary
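The covariance-matrix route can be sketched as follows, again assuming simulated data like that in the original post (the seed, `d` and the object names are illustrative). The sketch uses `vcov(mod)`. For a Poisson fit the dispersion is fixed at 1, so this matches `summary(mod)$cov.unscaled`. For a family with an estimated dispersion, the scaled matrix from `vcov()` would be the one to use.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
d <- data.frame(resp = rpois(30, 5),
                cat  = factor(rep(1:3, 10)),
                var1 = rnorm(30, 10, 3))

mod <- glm(resp ~ var1 * cat, family = "poisson", data = d)
V <- vcov(mod)  # covariance matrix of the coefficient estimates

# Slope within cat2 is the sum of two coefficients
slope_cat2 <- coef(mod)[["var1"]] + coef(mod)[["var1:cat2"]]

# Var(b1 + b2) = Var(b1) + Var(b2) + 2 * Cov(b1, b2)
se_cat2 <- sqrt(V["var1", "var1"] +
                V["var1:cat2", "var1:cat2"] +
                2 * V["var1", "var1:cat2"])

c(estimate = slope_cat2, se = se_cat2)
```

So the standard error for the cat2 slope is a function of both reported standard errors and of the covariance between the two estimates. Adding the two printed standard errors would not give the right value.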