Hello!
I'm reading through a logistic regression book and using R to replicate
the results. Although my question is not directly related to this, it's
the context I discovered it in, so here we go.
Consider these data:
interco <- structure(list(white = c(1, 1, 0, 0), male = c(1, 0, 1, 0),
    yes = c(43, 26, 29, 22), no = c(134, 149, 23, 36),
    total = c(177, 175, 52, 58)),
    .Names = c("white", "male", "yes", "no", "total"),
    row.names = c(NA, -4L), class = "data.frame")
We can use logistic regression to analyze this table, using glm's syntax
for successes/failures described on the top of page 191 in MASS 4th
edition.
summary(glm(as.matrix(interco[c("yes", "no")]) ~ white + male,
    data = interco, family = binomial))
The output prints out, no problem!
Now, another data set. Note that its distinguishing feature is that it
contains a column with the same name as the object itself (i.e.,
"working"):
working <- structure(list(france = c(1, 1, 1, 1, 0, 0, 0, 0),
    manual = c(1, 1, 0, 0, 1, 1, 0, 0),
    famanual = c(1, 0, 1, 0, 1, 0, 1, 0),
    total = c(107, 65, 66, 171, 87, 65, 85, 148),
    working = c(85, 44, 24, 17, 24, 22, 1, 6),
    no = c(22, 21, 42, 154, 63, 43, 84, 142)),
    .Names = c("france", "manual", "famanual", "total", "working", "no"),
    row.names = c(NA, -8L), class = "data.frame")
summary(glm(as.matrix(working[c("working", "no")]) ~ france + manual +
    famanual, data = working, family = binomial))
Error in model.frame.default(formula = as.matrix(working[c("working",  :
  variable lengths differ (found for 'france')
Well, this error goes away simply by renaming the "working" variable in
the data.frame "working" to something else. I found the "eval" line in
model.frame that's throwing the error, but I'm still confused as to why.
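For reference, here is a minimal sketch of the renaming workaround. The new column name "yes" is an arbitrary choice for illustration, and the data frame is rebuilt with data.frame() only to keep the snippet self-contained:

```r
# Same values as the "working" data set, rebuilt compactly.
working <- data.frame(
  france   = c(1, 1, 1, 1, 0, 0, 0, 0),
  manual   = c(1, 1, 0, 0, 1, 1, 0, 0),
  famanual = c(1, 0, 1, 0, 1, 0, 1, 0),
  total    = c(107, 65, 66, 171, 87, 65, 85, 148),
  working  = c(85, 44, 24, 17, 24, 22, 1, 6),
  no       = c(22, 21, 42, 154, 63, 43, 84, 142)
)

# Rename the column that shares its name with the data frame.
# "yes" is a hypothetical replacement name; any non-clashing name should do.
names(working)[names(working) == "working"] <- "yes"

# With no column called "working" left, the same style of call fits without error.
fit <- glm(as.matrix(working[c("yes", "no")]) ~ france + manual + famanual,
           data = working, family = binomial)
summary(fit)
```

After the rename, the symbol "working" inside the formula can only refer to the data frame, so the response is again an 8-row, two-column matrix.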
I'm sure it's not a bug, but could someone point to a thread or offer
some gentle advice on what's happening? I think it's related to:
test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)
eval(expression(test[c("name1", "name2")]))
eval(expression(test[c("name1", "test")]))
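A minimal sketch of what I suspect is going on, assuming that model.frame evaluates the formula's variables with the data frame itself as the evaluation environment, so that a column masks any object of the same name. The names in_global and in_data are just illustrative:

```r
test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)

# Evaluated in the calling environment: `test` is the data frame,
# so this selects two columns of five rows each.
in_global <- eval(expression(test[c("name1", "name2")]))

# Evaluated with the data frame as the environment: `test` now resolves to
# the column 11:15, an unnamed vector, so indexing by name yields two NAs.
in_data <- eval(expression(test[c("name1", "name2")]), test)

dim(in_global)
length(in_data)
```

If the same thing happens inside glm, the response built from "working" would have 2 rows while "france" has 8, which would be consistent with the "variable lengths differ" message.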
Thanks!
--Erik