thr3ads.net - R help - [R] nested factors different with/out brackets

If this information is useful, please help other people find it:
Share via:

Jonathan Williams

2010-May-06 15:39 UTC

[R] nested factors different with/out brackets - is this a design feature?

Dear R Helpers,

I came across discrepancies in estimates for nested factors in linear models
computed with brackets and also when a crossing factor is defined as a factor or
as a dummy (0/1) vector. These are probably designed features that I do not
understand. I wonder if someone would be so kind as to explain them to me.

First, if I do not bracket the nested pair of factors A/C, then (1) I obtain
coefficients for all 3 levels (0/1/2) of the crossing factor B (compare models
m0 and m1, below). Moreover, the values of some corresponding coefficients in
the 2 models are not the same. So, in m1 B0:A1:D1 = -0.13112 and in m2, A1:C1
(the corresponding term in a model with no B0 coefficients) = -0.13112. But, the
coefficients for B1:A1:C1 in m1 and m2 are 0.08909 and 0.22021. Why do they
differ, here? Also, (2) if I bracket the nested pair of factors
"(A/C)", then I obtain the B:C interaction - which I did not expect. I
thought that if I bracket (A/C) - then this would constrain the model to
generate only A, A:B and A:B:C as in m1 (compare models m1 and m2).

Second, if I use restrict the levels of the crossing factor B to be 0 or 1 (via
subset=B!=2) and compare this model with a dummy vector b with identical 0/1
values, then I again obtain different outputs - compare m3 and m4, below. Again,
there is no simple correspondence between m3 and m4, even though
all(dat$B==dat$b) returns TRUE.

So, I do not understand what is happening here, either when comparing m1 with m2
or when comparing m3 with m4.

With many thanks, in anticipation of your help in explaining this,

Jonathan Williams

Here is a simple sample code to generate the discrepancies:-

set.seed(1)
A=factor(rep(c(1:5),600))
B=factor(rep(c(0:2),each=1000))
b=as.numeric(as.character(B))
C=factor(rbinom(3000,1,0.5))
set.seed(2)
y=rnorm(3000)
dat=data.frame(A,B,b,C,y)
m0=lm(y~B*A); summary(m0)
m1=lm(y~B*A/C,dat); summary(m1)
m2=lm(y~B*(A/C),dat); summary(m2)

m3=lm(y~B*A/C,dat,subset=B!=2); summary(m3)
m4=lm(y~b*A/C,dat,subset=B!=2); summary(m4)

RICHARD M. HEIBERGER

2010-May-06 15:58 UTC

head link

[R] nested factors different with/out brackets - is this a design feature?

> attr(terms(~B*A), "term.labels")
[1] "B"   "A"   "B:A"> attr(terms(~A/C), "term.labels")
[1] "A"   "A:C"> attr(terms(~B*A/C), "term.labels")[1] "B"     "A"     "B:A"  
"B:A:C"> attr(terms(~(B*A)/C), "term.labels")[1] "B"     "A"     "B:A"  
"B:A:C"> attr(terms(~B*(A/C)), "term.labels")[1] "B"     "A"     "A:C"   "B:A"  
"B:A:C">
The issue is precedence of operations.  "*" has higher precedence than
"/".

This is exactly the same issue you see in ordinary
arithmetic> 1+2*3
[1] 7> (1+2)*3
[1] 9>
Rich

	[[alternative HTML version deleted]]

Apparently Analagous Threads

Search for more seemingly similar threads

R help - May 2010 - nested factors different with/out brackets - is this a design feature?

[R] nested factors different with/out brackets - is this a design feature?

[R] nested factors different with/out brackets - is this a design feature?

Apparently Analagous Threads