Joao Azevedo
2012-Jul-27 11:32 UTC
[R] Understanding the intercept value in a multiple linear regression with categorical values
Hi!

I'm failing to understand the value of the intercept in a multiple linear regression with categorical values. Taking the "warpbreaks" data set as an example, when I do:

> lm(breaks ~ wool, data=warpbreaks)

Call:
lm(formula = breaks ~ wool, data = warpbreaks)

Coefficients:
(Intercept)        woolB
     31.037       -5.778

I understand that the intercept is the mean value of breaks when wool equals "A", and that adding the "woolB" coefficient to the intercept gives the mean value of breaks when wool equals "B". However, if I also include the tension variable in the model, I'm unable to figure out the meaning of the intercept:

> lm(breaks ~ wool + tension, data=warpbreaks)

Call:
lm(formula = breaks ~ wool + tension, data = warpbreaks)

Coefficients:
(Intercept)        woolB     tensionM     tensionH
     39.278       -5.778      -10.000      -14.722

I thought it would be the mean value of breaks when either wool equals "A" or tension equals "L", but that isn't true for this dataset.

Any clues on interpreting the value of the intercept?

Thanks!

--
Joao.
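The one-factor reading described in the post can be checked directly against the group means. This is only a small sketch, assuming R's default treatment contrasts (so that "A" is the reference level of wool); the names `fit_wool` and `group_means` are made up for illustration.

```r
# One-factor model: with treatment contrasts the intercept is the mean of
# the reference level ("A"), and intercept + woolB is the mean of "B".
fit_wool <- lm(breaks ~ wool, data = warpbreaks)

# Observed mean of breaks for each wool type
group_means <- tapply(warpbreaks$breaks, warpbreaks$wool, mean)

# Values reconstructed from the coefficients
intercept_wool <- unname(coef(fit_wool)["(Intercept)"])
mean_B_from_fit <- intercept_wool + unname(coef(fit_wool)["woolB"])

# Both comparisons should be TRUE
print(all.equal(intercept_wool, unname(group_means["A"])))
print(all.equal(mean_B_from_fit, unname(group_means["B"])))
```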
Jean V Adams
2012-Jul-27 12:04 UTC
[R] Understanding the intercept value in a multiple linear regression with categorical values
Joao,

There's a very thorough explanation at http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm

Jean
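As a rough illustration of what treatment (dummy) coding implies for the question in this thread, the sketch below compares the additive-model intercept with a few candidate quantities. It assumes R's default contrasts (contr.treatment), under which the intercept is the model's fitted value at the reference levels of all factors at once (wool "A" and tension "L"), not the mean over "A" or "L". Because the additive model has no interaction term, that fitted value need not equal the observed mean of the A/L cell. All object names here are made up for the example.

```r
# Additive model from the thread
fit_add <- lm(breaks ~ wool + tension, data = warpbreaks)
intercept_add <- unname(coef(fit_add)["(Intercept)"])

# Fitted value at the reference levels: wool = "A" and tension = "L"
ref <- data.frame(
  wool    = factor("A", levels = levels(warpbreaks$wool)),
  tension = factor("L", levels = levels(warpbreaks$tension))
)
pred_ref <- unname(predict(fit_add, newdata = ref))

# warpbreaks is balanced (9 observations per wool/tension cell), so the
# additive fit for a cell is: wool mean + tension mean - grand mean
mean_A <- mean(warpbreaks$breaks[warpbreaks$wool == "A"])
mean_L <- mean(warpbreaks$breaks[warpbreaks$tension == "L"])
grand  <- mean(warpbreaks$breaks)
additive_AL <- mean_A + mean_L - grand

# Observed mean of the A/L cell; this is the intercept only once the
# interaction is included in the model
cell_AL <- mean(warpbreaks$breaks[warpbreaks$wool == "A" &
                                  warpbreaks$tension == "L"])
fit_int <- lm(breaks ~ wool * tension, data = warpbreaks)
intercept_int <- unname(coef(fit_int)["(Intercept)"])

print(all.equal(intercept_add, pred_ref))     # intercept = fit at A/L
print(all.equal(intercept_add, additive_AL))  # balanced-design identity
print(all.equal(intercept_int, cell_AL))      # interaction model: cell mean
print(c(additive = intercept_add, cell_mean = cell_AL))
```

Refitting with `breaks ~ wool * tension` is one way to make the intercept coincide with the A/L cell mean; whether the interaction belongs in the model is a separate question.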