thr3ads.net - R help - [R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM [Feb 2018]

If this information is useful, please help other people find it:
Share via:

Rolf Turner

2018-Feb-16 04:02 UTC

[R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

On 16/02/18 15:28, Bert Gunter wrote:
> This is really a statistical issue. What do you think the Intercept term
> represents? See ?contrasts.
> 
> Cheers,
> Bert
> 
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip
)
It's a *little* bit R-ish in that the results depend on the 
parametrisation of the model which by default in R is formed using the 
so-called "treatment" contrasts.

To wander from R into statistics (sorry Bert) the problem arises because
the "usual" parametrisation of the model is the
"over-parametrised" form:

Y_ij = mu + beta_i + E_ij (i = 1, ..., I, j = 1, ..., J_i)

where Y_ij is the j-th observation corresponding to the i-th
"treatment"
or group.  (Things get a bit more complicated in "multi-way" models; 
let's not go there.)

The parameter "mu" is the "grand mean" and the
"beta_i" are the
"treatment" effects.

The trouble with this parametrisation is that the parameters are 
meaningless!  (The usual jargon is to say that they are "not
estimable",
but "meaningless"  is a more accurate description.)

In order to ascribe unique values to the parameters, one must apply a 
"constraint".  With the "treatment contrasts" the constraint
is that
beta_1 = 0.

As a result the mean for the first treatment is mu, that for the second
treatment is mu + beta_2, and so on.

Consequently the SE corresponding to "(Intercept)" is the SE of 
estimated mean for treatment 1.  The SE corresponding to beta_2 is the 
SE of the estimated *difference* between the mean for treatment 2 and 
that for treatment 1, and so on.

Frequently the constraint beta_1 + ...+ beta_I = 0 is used.  This
sort of treats all of the beta_i equally. At this point it gets R-ish 
again. You can impose the foregoing constraint by using either "sum"
or
"Helmert" contrasts.  You can get the "more natural",
"fully" (rather
than "over") parametrised model via the syntax:

     g <- glm(a ~ 0 + b, data=df)

or

     g <- glm(a ~ b - 1, data=df)

This syntax actually imposes the constraint that mu = 0.

(Here "contrasts" don't get involved --- a bit counterintuitive,
that.)

The foregoing expressions will give you estimates labelled "b0",
"b1",
"b2" (nothing labelled "(Intercept)") and these estimate are
of the
treatment means and the SEs are straightforward to interpret.

There *is* a rationale for the use of the over-parametrised model, but I 
won't try to explain it here.  I've raved on long enough.

Marc:  I hope this helps.

cheers,

Rolf

P.S. It is bad form to call a data frame "df" since this is the name
of
the density function for the family of F-distributions.  There are 
circumstances in which such usage can lead to error messages which are 
impossible to interpret, e.g. "object of type 'closure' is not 
subsettable". (!!!)

R.
> 
> On Thu, Feb 15, 2018 at 5:27 PM, Marc Girondot via R-help <
> r-help at r-project.org> wrote:
> 
>> Dear R-er,
>>
>> I try to get the standard error of fitted parameters for factors with a
>> glm, even the reference one:
>>
>> a <- runif(100)
>> b <- sample(x=c("0", "1", "2"),
size=100, replace = TRUE)
>> df <- data.frame(A=a, B=b, stringsAsFactors = FALSE)
>>
>> g <- glm(a ~ b, data=df)
>> summary(g)$coefficients
>>
>> # I don't get SE for the reference factor, here 0:
>>
>>                Estimate Std. Error    t value     Pr(>|t|)
>> (Intercept)  0.50384827 0.05616631  8.9706490 2.236684e-14
>> b1          -0.03598386 0.07496151 -0.4800311 6.322860e-01
>> b2           0.03208039 0.07063113  0.4541962 6.507023e-01
>>
>> # Then I try to change the order of factors, for example:
>>
>> df$B[df$B=="0"] <- "3"
>> g <- glm(a ~ b, data=df)
>> summary(g)$coefficients
>>
>> By I get the same...
>>
>> Any idea ?
>>
>> Thanks
>>
>> Marc

Peter Dalgaard

2018-Feb-16 09:24 UTC

head link

[R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

To give a short answer to the original question:
> On 16 Feb 2018, at 05:02 , Rolf Turner <r.turner at auckland.ac.nz>
wrote:
> 
> In order to ascribe unique values to the parameters, one must apply a
"constraint".  With the "treatment contrasts" the constraint
is that
> beta_1 = 0.
...and consequently, being a constant, has an s.e. of 0.  


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Marc Girondot

2018-Feb-16 09:32 UTC

head link

[R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

Le 16/02/2018 ? 10:24, Peter Dalgaard a ?crit?:> To give a short answer to the original question:
>
>> On 16 Feb 2018, at 05:02 , Rolf Turner <r.turner at
auckland.ac.nz> wrote:
>>
>> In order to ascribe unique values to the parameters, one must apply a
"constraint".  With the "treatment contrasts" the constraint
is that
>> beta_1 = 0.
> ...and consequently, being a constant, has an s.e. of 0.
>
>It makes sense !

Thanks for the precision !

Marc

Maybe Matching Threads

Search for more apparently analagous threads

R help - Feb 2018 - [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

[R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

[R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

[R] [FORGED] Re: SE for all levels (including reference) of a factor atfer a GLM

Maybe Matching Threads