Dear all, I am trying to fit a GLM to the following data (taken from
Agresti):
Dat <- matrix(c(24, 1355, 35, 603, 21, 192, 30, 224), 4, byrow = TRUE)
Here the first column holds the number of successes and the second column
the number of failures. The 4 rows represent the 4 states of an explanatory
variable; let us say those states are:
Scores <- c(0, 2, 4, 5)
My goal is to estimate the success probability for each state. Therefore,
I use a simple GLM with an identity link:
p(x) = alpha + beta * x
*** My first approach
Here I expand the sample into individual Bernoulli observations and fit the
GLM:
YY <- c(rep(1, 24), rep(0, 1355), rep(1, 35), rep(0, 603),
        rep(1, 21), rep(0, 192), rep(1, 30), rep(0, 224))
XX <- c(rep(0, 24 + 1355), rep(2, 35 + 603), rep(4, 21 + 192),
        rep(5, 30 + 224))
summary(glm(YY~XX, binomial(link = "identity")))
*** My second approach
Here I work with the grouped sample as it is, assuming a Binomial
distribution, as follows:
Proportion <- apply(Dat, 1, function(x) return(x[1]/(x[1]+x[2])))
summary(glm(Proportion~c(0,2,4,5), binomial(link = "identity")))
I was expecting the 2 approaches to give exactly the same result (i.e. the
same estimates and the same SE), which is not the case. Can somebody point
out what I am missing here?
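One thing I have not ruled out is that the second fit never sees the group
sizes, so each proportion may be treated as if it came from a single trial.
Below is a minimal sketch of a grouped fit that does supply them. It assumes
the row totals can be passed through the weights argument of glm(), or
equivalently that a two-column (success, failure) matrix can be used as the
response. The names n, fit_w and fit_c are mine, for illustration only. As
far as I understand, this should agree with the Bernoulli fit in estimates
and SE, though not in deviance:

```r
# Grouped data: successes and failures for the four states
Dat <- matrix(c(24, 1355, 35, 603, 21, 192, 30, 224), 4, byrow = TRUE)
Scores <- c(0, 2, 4, 5)

n <- rowSums(Dat)            # group sizes (number of trials per state)
Proportion <- Dat[, 1] / n   # observed success proportions

# Assumption: the group sizes go in via `weights`, so glm() knows each
# proportion is based on n trials rather than on a single trial.
fit_w <- glm(Proportion ~ Scores, family = binomial(link = "identity"),
             weights = n)

# Alternative formulation: two-column (success, failure) response
fit_c <- glm(Dat ~ Scores, family = binomial(link = "identity"))

coef(summary(fit_w))
coef(summary(fit_c))
```

Both calls describe the same grouped Binomial likelihood, so their
coefficient tables should match each other.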
Thanks and regards,