A multinomial logit model can be specified as a conditional logit model after restructuring the data. Doing so gives flexibility in imposing restrictions on the dependent variable. One application is to specify a loglinear model for square tables, e.g. quasi-symmetry or quasi-independence, as a multinomial logit model with covariates. Further details on this technique and examples with several packages are on my homepage at http://www.xs4all.nl/~jhckx/mcl/ I've been trying to implement an MCL model in R and have been mostly succesful. However I'm still stuck on a few points and I'm hoping someone can point out how to do fix them. To estimate an MCL model, the data must be restructured into a person-choice file. * Each case must be duplicated "ncat" times ("ncat" is the number of categories of the dependent variable) * Each case must be identified by a strata variable (id) * Each duplicate must be identified by a variable indexing the choices (newy) * A dichotomous variable must indicate which duplicate corresponds with the respondent's choice (didep) I've done this as follows: mclgen <- function (datamat,catvar) { ncat <- nlevels(catvar) id<-1:length(catvar) datamat<-cbind(id,datamat) perschoice <-NULL for (i in 1:ncat) { perschoice<-rbind(perschoice,datamat) } perschoice<-perschoice[sort.list(perschoice$id),] perschoice$newy<-gl(ncat,1,length=nrow(perschoice)) dep<-parse(text=paste("perschoice$",substitute(catvar),sep="")) perschoice$depvar<-as.numeric(eval(dep)==perschoice$newy) perschoice } This works but I wonder if it's the most efficient method. I tried using "rep" rather than "rbind" in a loop but that replicated the dataset horizontally, not vertically. Is this the best solution? I also finally figure out how to include the argument "catvar" in a transformation involving another dataset but this solution seems very complicated (eval+parse+substitute). Is there a simpler solution? What I haven't figured out is how to use "catvar" once again but on the righthand side of the transformation. In the second to last line, I'd like to do "perschoice$catvar<-perschoice$newy". I've tried several possibities but I can't figure out how to let R substitute the value of "catvar" in that expression. "eval(dep)" doesn't work as it does for the lefthand side but how should it be done? My solution for now is to do it afterward. My program continues as follows: pc<-mclgen(logan,occ) pc$occ<-factor(pc$newy) library(survival) cl.lr<-clogit(depvar~occ+occ:educ+occ:black+strata(id),data=pc) summary(cl.lr) The dataset "logan" is available from the website. "occ" is the dependent variable, "educ" and "black" are independents. Once "mclgen" has transformed the data into a person-choice file, a multinomial logit model can be estimated using the dichotomous dependent variable, the main effects of "occ" for the intercept term and interactions of "educ" and "black" with "occ" for the effects of these variables. This works more or less. What's unexpected though is that the reference category of "occ" isn't dropped in the interactions with "educ" and "black". The highest category is dropped due to collineairity but that's not quite what I want. Could someone explain why this happens and how to let R drop the first category? A final problem is that the results of clogit are only approximate (2 digits) whereas in other packages the cl model produces exactly the same estimates as an mnl model. Is this because terms are dropped due to collinearity or are there options I should examine. clogit issues the following warnings which seem to be related to the collinearity but perhaps point to something else: Warning messages: 1: Loglik converged before variable 10 ; beta may be infinite. in: fitter(X, Y, strats, offset, init, control, weights = weights, 2: X matrix deemed to be singular; variable 9 14 in: coxph(formula Surv(rep(1, 4190), depvar) ~ occ + occ:educ + Apologies for the length of this post. Hopefully someone will have time to read it through and help me out. As always, advance thanks for any assistance. John Hendrickx
Hi, | I've been trying to implement an MCL model in R and have been mostly | succesful. However I'm still stuck on a few points and I'm hoping | someone can point out how to do fix them. | | To estimate an MCL model, the data must be restructured into a | person-choice file. | | * Each case must be duplicated "ncat" times ("ncat" is the number of | categories of the dependent variable) | * Each case must be identified by a strata variable (id) | * Each duplicate must be identified by a variable indexing the | choices (newy) | * A dichotomous variable must indicate which duplicate corresponds | with the respondent's choice (didep) | | I've done this as follows: | | mclgen <- function (datamat, catvar) { | ncat <- nlevels(catvar) | id <- 1:length(catvar) | datamat <- cbind(id,datamat) | perschoice <-NULL | for (i in 1:ncat) { | perschoice<-rbind(perschoice,datamat) | } | perschoice <- perschoice[sort.list(perschoice$id),] | perschoice$newy <- gl(ncat, 1, length=nrow(perschoice)) | dep <- parse(text=paste("perschoice$", substitute(catvar), sep="")) | perschoice$depvar <- as.numeric(eval(dep) == perschoice$newy) | perschoice | } | | This works but I wonder if it's the most efficient method. I tried | using "rep" rather than "rbind" in a loop but that replicated the | dataset horizontally, not vertically. Is this the best solution? Well, in general it is not efficient to add something to a matrix in a cycle. You should in either make the datamat in required size and fill the required parts of it, or in this case you may consider e.g.> datamat <- matrix(1:4, 2, 2) > matrix(t(datamat), 4, 2, byrow=T)[,1] [,2] [1,] 1 3 [2,] 2 4 [3,] 1 3 [4,] 2 4 | I also finally figure out how to include the argument "catvar" in a | transformation involving another dataset but this solution seems very | complicated (eval+parse+substitute). Is there a simpler solution? Unfortunately, I was not able to understand quite what are you doing here. Where is the dataset logan? I did not find it on http://www.xs4all.nl/~jhckx. If I could get a simple example running, I can try further. best wishes, Ott
Friedrich.Leisch@univie.ac.at
2003-Jan-30 12:12 UTC
[R] multinomial conditional logit models
>>>>> On Wed, 29 Jan 2003 11:57:20 -0800 (PST), >>>>> John Hendrickx (JH) wrote:> A multinomial logit model can be specified as a conditional logit > model after restructuring the data. Doing so gives flexibility in > imposing restrictions on the dependent variable. One application is > to specify a loglinear model for square tables, e.g. quasi-symmetry > or quasi-independence, as a multinomial logit model with covariates. > Further details on this technique and examples with several packages > are on my homepage at http://www.xs4all.nl/~jhckx/mcl/ > I've been trying to implement an MCL model in R and have been mostly > succesful. However I'm still stuck on a few points and I'm hoping > someone can point out how to do fix them. > To estimate an MCL model, the data must be restructured into a > person-choice file. > * Each case must be duplicated "ncat" times ("ncat" is the number of > categories of the dependent variable) > * Each case must be identified by a strata variable (id) > * Each duplicate must be identified by a variable indexing the > choices (newy) > * A dichotomous variable must indicate which duplicate corresponds > with the respondent's choice (didep) I'm not completely sure if I understand your target format completely, but it sounds like reshape() could do all you want: R> df3 id age dose1 dose2 dose4 1 1 40 1 2 3 2 2 50 2 1 3 3 3 60 1 2 3 4 4 50 2 1 3 R> reshape(df3, direction = "long", idvar=1, varying=list(dose=c("dose1", "dose2", "dose4"))) id age time dose1 1.1 1 40 1 1 2.1 2 50 1 2 3.1 3 60 1 1 4.1 4 50 1 2 1.2 1 40 2 2 2.2 2 50 2 1 3.2 3 60 2 2 4.2 4 50 2 1 1.3 1 40 3 3 2.3 2 50 3 3 3.3 3 60 3 3 4.3 4 50 3 3 which is a (modified) example from help(reshape) ... there you find much more examples. Hope this helps, Fritz -- ------------------------------------------------------------------- Friedrich Leisch Institut f?r Statistik & DSS Tel: (+43 1) 4277 38613 Universit?t Wien Fax: (+43 1) 4277 38639 Universit?tsstra?e 5 Friedrich.Leisch at univie.ac.at A-1010 Wien, Austria http://www.ci.tuwien.ac.at/~leisch -------------------------------------------------------------------
A little while ago, I asked for some help in estimating "multinomial conditional logit models". If you restructure your data, you can estimate a multinomial logit as a conditional logit model. This gives flexibility in imposing constraints on your dependent variable. One application is to estimate a loglinear model for square tables such as quasi-independence or quasi-symmetry with covariates. Anyhow, I think I've sorted out most of the problems and I've posted a sample program at http://www.xs4all.nl/~jhckx/mcl/R/ There's also a sample program there on how to estimate models for square tables, aka mobility models. Comments and suggestions on how to do things more efficiently or elegantly in R are most welcome. One problem that remains is that clogit in R doesn't produce the same estimates as other programs like Stata. I've estimated this sample model in several packages and they all have the same estimates to 3 decimal places accuracy. I've tried different convergence settings in clogit but to no effect. I'd appreciate it if anyone could clarify this. Best regards, John Hendrickx