Dear Johan,
On 2020-09-17 9:07 a.m., Johan Lassen wrote:
> Dear R-users,
>
> I am using the R-function "linearHypothesis" to test if the sum of all
> parameters, but the intercept, in a multiple linear regression is different
> from zero.
> I wonder if it is statistically valid to use the linearHypothesis-function
> for this?
Yes, assuming of course that the hypothesis makes sense.
> Below is a reproducible example in R. A multiple regression: y >
beta0*t0+beta1*t1+beta2*t2+beta3*t3+beta4*t4
>
> It seems to me that the linearHypothesis function does the calculation as
> an F-test on the extra residuals when going from the starting model to a
> 'subset' model, although all variables in the 'subset' model differ from
> the variables in the starting model.
> I normally think of a subset model as a model built on the same input data
> as the starting model, but with one variable left out.
>
> Hence, is this a valid calculation?
First, linearHypothesis() doesn't literally fit alternative models, but
rather tests the linear hypothesis directly from the coefficient
estimates and their covariance matrix. The test is standard -- look at
the references in ?linearHypothesis or most texts on linear models.
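In outline it is the usual Wald-type F test. A rough sketch (untested,
assuming the coefficient order t0, t1, t2, t3, t4 from your model below)
that reproduces it by hand:

b <- coef(model)              # estimated coefficients
V <- vcov(model)              # their estimated covariance matrix
L <- rbind(c(0, 1, 1, 1, 1))  # one restriction: t1 + t2 + t3 + t4 = 0
q <- nrow(L)                  # number of restrictions
Fstat <- as.numeric(t(L %*% b) %*% solve(L %*% V %*% t(L)) %*% (L %*% b)) / q
pf(Fstat, q, df.residual(model), lower.tail=FALSE)  # p-value of the F test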
Second, formulating the hypothesis using alternative models is also
legitimate, since the second model is a restricted version of the first.
>
> Thanks in advance,
> Johan
>
> # R-code:
> y <- c(101133190, 96663050, 106866486, 97678429, 83212348, 75719714,
>        77861937, 74018478, 82181104, 68667176, 64599495, 62414401,
>        63534709, 58571865, 65222727, 60139788, 63355011, 57790610,
>        55214971, 55535484, 55759192, 49450719, 48834699, 51383864,
>        51250871, 50629835, 52154608, 54636478, 54942637)
>
> data <- data.frame(y, "t0"=1, "t1"=1990:2018, "t2"=c(rep(0,12),1:17),
>                    "t3"=c(rep(0,17),1:12), "t4"=c(rep(0,23),1:6))
>
> model <- lm(y~t0+t1+t2+t3+t4+0,data=data)
You need not supply the constant regressor t0 explicitly and suppress
the intercept -- you'd get the same test from linearHypothesis() for
lm(y~t1+t2+t3+t4,data=data).
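For example, something like this (a quick check, not run here) should give
the same F statistic and p-value:

library(car)
model2 <- lm(y ~ t1 + t2 + t3 + t4, data=data)
linearHypothesis(model2, "t1+t2+t3+t4=0")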
>
> linearHypothesis(model,"t1+t2+t3+t4=0",test=c("F"))
test = "F" is the default.
>
> # Reproduce the result from linearHypothesis:
> # beta1+beta2+beta3+beta4=0 -> beta4=-(beta1+beta2+beta3) ->
> # y=beta0+beta1*t1+beta2*t2+beta3*t3-(beta1+beta2+beta3)*t4
> # y = beta0'+beta1'*(t1-t4)+beta2'*(t2-t4)+beta3'*(t3-t4)
>
> data$t1 <- data$t1-data$t4
> data$t2 <- data$t2-data$t4
> data$t3 <- data$t3-data$t4
>
> model_reduced <- lm(y~t0+t1+t2+t3+0,data=data)
>
> anova(model_reduced,model)
Yes, this is equivalent to the test performed by linearHypothesis()
using the coefficients and their covariances from the original model.
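If you want to see the equivalence numerically, the same F statistic also
follows directly from the two residual sums of squares (a small sketch using
the objects from your code, untested):

rss_full <- sum(residuals(model)^2)
rss_restricted <- sum(residuals(model_reduced)^2)
q <- 1   # one linear restriction
Fstat <- ((rss_restricted - rss_full)/q) / (rss_full/df.residual(model))
Fstat    # should match the F from anova() and linearHypothesis()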
I hope this helps,
John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/