thr3ads.net - R help - [R] Different results in glm() probit model using vector vs. two-column matrix response [Dec 2010]

If this information is useful, please help other people find it:
Share via:

Lensing, Shelly Y

2010-Dec-30 19:49 UTC

[R] Different results in glm() probit model using vector vs. two-column matrix response

Hi - I am fitting a probit model using glm(), and the deviance and residual
degrees of freedom are different depending on whether I use a binary response
vector of length 80 or a two-column matrix response (10 rows) with the number of
success and failures in each column. I would think that these would be just two
different ways of specifying the same model, but this does not appear to be the
case.

Binary response vector gives:
Residual deviance:  43.209  on 77  degrees of freedom

Two-column matrix response gives:
Residual deviance:  4.9204  on 7  degrees of freedom

I'd like to understand why the two-column response format gives a residual
degrees of freedom of 7, and why the weights for one is nearly, but not exactly,
a multiple of the other. I need the deviance, df, and weights for another
formula, which is why I'm focused on these. My code is below. Thank you in
advance for any assistance! Shelly

****

# 10 record set-up
group <- gl(2, 5, 10, labels=c("U","M"))    
dose  <- rep(c(7, 8, 9, 10, 11), 2)
ldose <- log10(dose)  
n     <- c(8,8,8,8,8,8,8,8,8,8)
r     <- c(0,1,3,8,8,0,0,0,4,5)
p     <- r/n
d     <- data.frame(group, dose, ldose, n, r, p)
SF <- cbind(success=d$r, failure=d$n - d$r)

#80 record set-up
dose2<-c(7,8,9,10,11)
doserep<-sort(rep(dose2,8))
x<-c(doserep,doserep)
log10x<-log10(x)
y_U<-c(rep(0,8), 1, rep(0, 7), 1, 1, 1, rep(0,5), rep(1, 16))
y_M<-c(rep(0,24), rep(1,4), rep(0,4), rep(1,5), rep(0,3))
y<-c(y_U, y_M)
trt<-c(rep(1, 40), rep(0, 40))

# print x & y's for both
SF
y
ldose
log10x

# analysis with 10 records and 80 records
f1 <- glm(SF ~ group + ldose, family=binomial(link="probit"))
f3 <- glm(SF ~         ldose, family=binomial(link="probit"))
f180 <- glm(y ~ trt + log10x, family=binomial(link="probit"))
f380 <- glm(y ~       log10x, family=binomial(link="probit"))

summary(f1)
summary(f180)

f1$weights
f180$weights
# check weights divided by 8 to see if match -- match several decimal places, 
# but not exactly
f1$weights/8

****

Shelly Lensing
Biostatistics / University of Arkansas for Medical Sciences

Confidentiality Notice: This e-mail message, including a...{{dropped:7}}

Lensing, Shelly Y

2010-Dec-30 21:01 UTC

head link

[R] Different results in glm() probit model using vector vs. two-column matrix response

Hi - I am fitting a probit model using glm(), and the deviance and residual
degrees of freedom are different depending on whether I use a binary response
vector of length 80 or a two-column matrix response (10 rows) with the number of
success and failures in each column. I would think that these would be just two
different ways of specifying the same model, but this does not appear to be the
case.

Binary response vector gives:
Residual deviance:  43.209  on 77  degrees of freedom

Two-column matrix response gives:
Residual deviance:  4.9204  on 7  degrees of freedom

I'd like to understand why the two-column response format gives a residual
degrees of freedom of 7, and why the weights for one is nearly, but not exactly,
a multiple of the other. I need the deviance, df, and weights for another
formula, which is why I'm focused on these. My code is below. Thank you in
advance for any assistance! Shelly

****

# 10 record set-up
group <- gl(2, 5, 10, labels=c("U","M"))    
dose  <- rep(c(7, 8, 9, 10, 11), 2)
ldose <- log10(dose)  
n     <- c(8,8,8,8,8,8,8,8,8,8)
r     <- c(0,1,3,8,8,0,0,0,4,5)
p     <- r/n
d     <- data.frame(group, dose, ldose, n, r, p)
SF <- cbind(success=d$r, failure=d$n - d$r)

#80 record set-up
dose2<-c(7,8,9,10,11)
doserep<-sort(rep(dose2,8))
x<-c(doserep,doserep)
log10x<-log10(x)
y_U<-c(rep(0,8), 1, rep(0, 7), 1, 1, 1, rep(0,5), rep(1, 16))
y_M<-c(rep(0,24), rep(1,4), rep(0,4), rep(1,5), rep(0,3))
y<-c(y_U, y_M)
trt<-c(rep(1, 40), rep(0, 40))

# print x & y's for both
SF
y
ldose
log10x

# analysis with 10 records and 80 records
f1 <- glm(SF ~ group + ldose, family=binomial(link="probit"))
f3 <- glm(SF ~         ldose, family=binomial(link="probit"))
f180 <- glm(y ~ trt + log10x, family=binomial(link="probit"))
f380 <- glm(y ~       log10x, family=binomial(link="probit"))

summary(f1)
summary(f180)

f1$weights
f180$weights
# check weights divided by 8 to see if match -- match several decimal places, 
# but not exactly
f1$weights/8

****

Shelly Lensing
Biostatistics / University of Arkansas for Medical Sciences

Confidentiality Notice: This e-mail message, including a...{{dropped:7}}

Seemingly Similar Threads

Search for more apparently analagous threads

R help - Dec 2010 - Different results in glm() probit model using vector vs. two-column matrix response

[R] Different results in glm() probit model using vector vs. two-column matrix response

[R] Different results in glm() probit model using vector vs. two-column matrix response

Seemingly Similar Threads