Dear all R-users,

I am a new user of R and I am trying to build a discrete choice model (with more than two alternatives A, B, C and D) using logistic regression. I have data that describes the observed choice probabilities and some background information. The example below describes the data:

Sex  Age  pr(A)  pr(B)  pr(C)  pr(D)  ...
1    11   0.5    0.5    0      0
1    40   1      0      0      0
0    34   0      0      0      1
0    64   0.1    0.5    0.2    0.2
...

I have been able to model a case with only two alternatives, "A" and "not A", by using glm().

I do not know what functions are available to estimate such a model with more than two alternatives. multinom() is one possibility, but it seems to allow only binary 0/1 data instead of observed probabilities. Did I understand this correctly?

Additionally, I would like to use different independent variables for the different alternatives in the model. Formally, I mean that:

Pr(A) = exp(uA) / (exp(uA) + exp(uB) + exp(uC) + exp(uD))
Pr(B) = exp(uB) / (exp(uA) + exp(uB) + exp(uC) + exp(uD))
...

where uA, uB, uC and uD are linear functions with different independent variables, e.g. uA = alpha_A1*Age, uB = alpha_B1*Sex.

Do you know how to estimate this type of model in R?

Best regards,
Ville Koskinen
Hi Koskinen,

For a response variable with multiple categories, you may try polr() in the MASS package, which implements a proportional odds model (note that this assumes the categories are ordered). You may also search the R archives; several earlier threads have discussed this problem.

Wuming

On 6/15/05, Ville Koskinen <ville.koskinen at matrex.fi> wrote:
> I do not know what functions are available to estimate such a model with
> more than two alternatives. [...]
>
> Do you know how to estimate this type of models in R?
>I am a new user of R and I am trying to build a discrete choice model (with
>more than two alternatives A, B, C and D) using logistic regression. I have
>data that describes the observed choice probabilities and some background
>information. [...]

You can use multinom(). Here is an example. Suppose this matrix is to be analyzed:

male  female  aborted  factor
10    12      1        1.2
14    14      4        1.3
15    12      3        1.4

The data are entered in a text file in long format, one row per outcome and factor level, with the count in column n:

output  factor  n
m       1.2     10
f       1.2     12
a       1.2     1
m       1.3     14
f       1.3     14
a       1.3     4
m       1.4     15
f       1.4     12
a       1.4     3

    library(nnet)  # multinom() is provided by nnet, not MASS
    # the file name below is a placeholder for wherever the table above is saved
    dt <- read.table("data.txt", header = TRUE, stringsAsFactors = TRUE)
    # the counts enter as case weights
    dt.plr <- multinom(output ~ factor, data = dt, weights = n, maxit = 1000)
    dt.pr1 <- predict(dt.plr, type = "probs")
    dt.pr1

>Additionally, I am willing to use different independent variables for the
>different alternatives in the model. [...]
>
>Do you know how to estimate this type of models in R?

I don't think it is possible (at least not simply, without writing the whole script yourself)!

Note that I don't understand where the residual deviance reported by multinom() comes from. I can't find the logic.
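As a sketch of the same fit with the counts kept in wide form: the nnet documentation says a matrix response is interpreted as counts for each class, so multinom() is not limited to 0/1 data. The object names (wide, fit, p_hat) are made up for this illustration. The last lines check the reported residual deviance against a hand computation; as far as I can tell it is simply minus twice the maximised log-likelihood (without the multinomial coefficient), not a deviance relative to a saturated model.

```r
library(nnet)  # multinom() is provided by nnet

# Wide layout: one row per value of the covariate, one count column per outcome.
wide <- data.frame(
  factor  = c(1.2, 1.3, 1.4),
  male    = c(10, 14, 15),
  female  = c(12, 14, 12),
  aborted = c(1, 4, 3)
)

# A matrix response is interpreted as counts for each class.
fit <- multinom(cbind(male, female, aborted) ~ factor, data = wide,
                maxit = 1000, trace = FALSE)

# Fitted probabilities: one row per row of 'wide', one column per outcome.
p_hat <- predict(fit, type = "probs")

# Hand computation of -2 * log-likelihood kernel: -2 * sum(count * log(p)).
counts <- as.matrix(wide[, c("male", "female", "aborted")])
dev_by_hand <- -2 * sum(counts * log(p_hat))

c(reported = fit$deviance, by_hand = dev_by_hand)
```

The long-format call with weights = n should describe the same model; the wide form just avoids reshaping the table first.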
Marc

--
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, ENGREF et Université Paris-Sud 11, UMR 8079
Bâtiment 362
91405 Orsay Cedex, France
Tel: 33 1 (0)1.69.15.72.30
Fax: 33 1 (0)1 69 15 56 96
e-mail: marc.girondot at ese.u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
Fax in US: 1-425-732-6934
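On the model from the original question, where each alternative has its own linear utility: one way to "write the whole script" is to code the log-likelihood directly and pass it to optim(). The following is only a minimal sketch under several assumptions that are not in the thread: the data are simulated, uC is taken to be a constant and uD is fixed at 0 as the reference alternative (some such restriction is needed for identification), and the helper names (utilities, log_probs, negloglik) are invented here. Observed proportions enter the likelihood as fractional weights, so this is a cross-entropy style fit rather than a full treatment of the sampling variability.

```r
set.seed(1)

# Simulated stand-in data (not from the thread): 200 respondents, each with
# observed choice proportions over A-D based on 20 simulated choices.
n   <- 200
Sex <- rbinom(n, 1, 0.5)
Age <- runif(n, 10, 70)

# uA and uB follow the question. uC (a constant) and uD = 0 (reference
# alternative, for identification) are assumptions made for this sketch.
utilities <- function(par, Sex, Age) {
  cbind(A = par[1] * Age, B = par[2] * Sex, C = par[3], D = 0)
}

# Row-wise log of the multinomial logit probabilities (log-sum-exp form).
log_probs <- function(U) {
  m <- apply(U, 1, max)
  U - (m + log(rowSums(exp(U - m))))
}

# Generate observed proportions from known parameter values.
true_par <- c(-0.03, 0.8, -0.5)
P_true   <- exp(log_probs(utilities(true_par, Sex, Age)))
Y <- t(apply(P_true, 1, function(p) rmultinom(1, size = 20, prob = p))) / 20

# Negative log-likelihood kernel: -sum over rows and alternatives of y * log(p).
negloglik <- function(par) -sum(Y * log_probs(utilities(par, Sex, Age)))

# parscale tells optim() that the Age coefficient lives on a much smaller scale.
fit <- optim(c(0, 0, 0), negloglik, method = "BFGS",
             control = list(parscale = c(0.01, 1, 1)))
est <- setNames(fit$par, c("alpha_A1", "alpha_B1", "const_C"))
est
```

With real data, Y would be the matrix of pr(A) to pr(D) columns and the body of utilities() would list whichever covariates belong to each alternative.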