Christian.Lajaunie at ensmp.fr
2007-Dec-05 17:40 UTC
[Rd] confint for coefficients from lm model (PR#10496)
Full_Name: Christian Lajaunie Version: 2.5.1 OS: Fedora fc6 Submission from: (NULL) (193.251.63.39) confint() does not use the appropriate variance term when the design matrix contains a zero column (which of course should not happen). Example: A 10x2 matrix with trivial column 1:> junk <- data.frame(x=rep(0,10), u=factor(sample(c("Y", "N"), 10, replace=T)))The response:> ans <- as.integer(junk$u) + rnorm(10)and the model:> junk.model <- lm(ans ~ junk$x + junk$u)3 coefficients:> coefficients(junk.model)(Intercept) junk$x junk$uY 0.6808802 NA 1.5912192 and a 2x2 variance (X^tX)^-1:> vcov(junk.model)(Intercept) junk$uY (Intercept) 0.09905378 -0.09905378 junk$uY -0.09905378 0.19810756 result in no confidence interval for the third term:> confint(junk.model)2.5 % 97.5 % (Intercept) -0.04488412 1.406644 junk$x NA NA junk$uY NA NA confint() seems to be looking for diag(vcov(junk.model))[3] instead of diag(vcov(junk.model))[2]
Peter Dalgaard
2007-Dec-05 21:32 UTC
[Rd] confint for coefficients from lm model (PR#10496)
Christian.Lajaunie at ensmp.fr wrote:> Full_Name: Christian Lajaunie > Version: 2.5.1 > OS: Fedora fc6 > Submission from: (NULL) (193.251.63.39) > > > confint() does not use the appropriate variance term when the design > matrix contains a zero column (which of course should not happen). > Example: > > A 10x2 matrix with trivial column 1: > > >> junk <- data.frame(x=rep(0,10), u=factor(sample(c("Y", "N"), 10, replace=T))) >> > > The response: > >> ans <- as.integer(junk$u) + rnorm(10) >> > and the model: > >> junk.model <- lm(ans ~ junk$x + junk$u) >> > > 3 coefficients: > > >> coefficients(junk.model) >> > (Intercept) junk$x junk$uY > 0.6808802 NA 1.5912192 > > and a 2x2 variance (X^tX)^-1: > > >> vcov(junk.model) >> > (Intercept) junk$uY > (Intercept) 0.09905378 -0.09905378 > junk$uY -0.09905378 0.19810756 > > result in no confidence interval for the third term: > > >> confint(junk.model) >> > 2.5 % 97.5 % > (Intercept) -0.04488412 1.406644 > junk$x NA NA > junk$uY NA NA > > confint() seems to be looking for diag(vcov(junk.model))[3] > instead of diag(vcov(junk.model))[2] >(You should upgrade, but this is the same in 2.6.1) Yes. And confint.glm and confint.default are bad too. The glm method must be a different issue, but the other two share the vcov issue. I'm a bit unsure what is the right fix, though. Is vcov really returning the wrong thing? Should we rather have > vcov(junk.model) [,1] [,2] [,3] [1,] 0.5525259 NA -0.5525259 [2,] NA NA NA [3,] -0.5525259 NA 0.6906574 which is not massively hard to achieve. Alternatively we could just skip the aliased coefficients. For GLMs we definitely do not want to profile them...> ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >-- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907