Since I have not seen a reply to this post, I will attempt a brief
comment: I won't comment on the specifics, but your general approach
seems appropriate. You already seem to know that a factor with k levels
is converted into (k-1) separate numerical variables, and a separate
regression coefficient is estimated for each one.
To see in more detail what you were estimating, I looked at the
following:
model.matrix(y~xma)
fit2 <- aov(y~xma)
attributes(fit2)
From the latter, I identified "contrasts" as something that might
be
interesting to examine, as follows:
fit2$contrasts
For clarity, I think I might reduce the number of observations
substantially, limiting myself to only 2 or 3 schools and paramterize
the problem manually, but preserving imbalance. Then I'd use
"lm",
specifying the terms in different orders. With imbalance, the answer
depends on the order unfortunately. When in doubt, I often experiment
with changing the order: If the changes do not affect the conclusions,
I pick the simplest case to present to my audience. If the changes do
affect the conclusions, I know I need to worry about which answer seems
most correct, and I also know something about the limits of the
conclusions.
I know this doesn't answer your question, but I hope it helped with a
solution methodology.
Best Wishes,
Spencer Graves
Scot W McNary wrote:
> Hi,
>
> I have a problem in which I have test score data on students from a number
> of schools. In each school I have a measure of whether or not they
> received special programming. I am interested in the interaction between
> school and attendance to the programming, but in a very select set of
> comparisons. I'd like to cast the test as one in which students in
each
> school who attend are compared with students who don't across all
schools.
> So, I would be comparing school 1 attenders with school 1 non-attenders,
> school 2 attenders with school 2 non-attenders, etc. The reason for the
> custom contrast is that the between school comparisons (e.g., school 1
> attenders vs. school 2 non-attenders) are of less interest.
>
> This seems to require a custom contrast statement for the interaction
> term. I have a toy example that seems to work as it should, but wonder if
> I've correctly created the contrast needed.
>
> Here is a toy example (code put together from bits taken from MASS ch 6,
> and various R-help postings, (e.g.,
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/49077.html)):
>
> # toy interaction contrast example, 10 schools, 100 kids, 5 attenders (1)
> # and 5 non-attenders (2) in each school
>
> # make the data
> school <- gl(10, 10)
> attend <- gl(2, 5, 100)
> # creates an interaction with schools 6 and 7
> y <- c(sample(seq(450, 650, 1), 50), rep(c(rep(650, 5), rep(450, 5)),
2),
> sample(seq(450, 650, 1), 30))
>
> # anova
> summary(aov(y ~ school * attend))
>
> # graphically
> Means <- tapply(y, list(school, attend), mean)
>
> plot(Means[,1], col="red", type = "l", ylim =
c(400,700))
>
> points(Means[,2], col="blue", type = "l")
>
> # create contrasts for hypothesis of interest
> # school i attend j - school i attend j'
> # for all schools
> sxa <- interaction(school, attend)
> sxam <- as.matrix(rbind(diag(1,10), diag(1,10) * -1))
> contrasts(sxa) <- sxam
>
> summary(aov(y ~ sxa), split=list(sxa=1:10), expand.split = T)
>
> The actual problem has a few more schools, other covariates, considerably
> more students, and is somewhat unbalanced.
>
> Thanks,
>
> Scot
>
>
> --
> Scot W. McNary email:smcnary at charm.net
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915