Jose Marcos Ferraro
2016-Mar-31 21:51 UTC
[R] reduced set of alternatives in package mlogit
I'm trying to estimate a multinomial logit model, but in some choice situations only a subset of all possible alternatives can be chosen. At the moment I get around it by creating a "dummy" variable that marks an alternative as unavailable and letting the model estimate its coefficient as highly negative. Is there a better way to do it?
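The difference between the intended model and the dummy-variable workaround described above can be written out explicitly. The notation is introduced here only for illustration and is not from the original post: C_n is the set of alternatives available in choice situation n, V_nj is the systematic utility of alternative j, d_nj equals 1 when j is unavailable in situation n (0 otherwise), and delta is the dummy's coefficient.

```latex
% Intended model: the denominator sums over available alternatives only
P_n(j) = \frac{e^{V_{nj}}}{\sum_{k \in C_n} e^{V_{nk}}}, \qquad j \in C_n

% Workaround: all J alternatives enter, unavailable ones are penalised by \delta
\tilde{P}_n(j) = \frac{e^{V_{nj} + \delta d_{nj}}}{\sum_{k=1}^{J} e^{V_{nk} + \delta d_{nk}}}
              = \frac{e^{V_{nj}}}{\sum_{k \in C_n} e^{V_{nk}} + e^{\delta} \sum_{k \notin C_n} e^{V_{nk}}},
              \qquad j \in C_n

% Limit: the workaround recovers the intended model only as \delta \to -\infty
\lim_{\delta \to -\infty} \tilde{P}_n(j) = P_n(j)
```

Because the chosen alternative always lies in C_n, every term of the workaround's log-likelihood increases as delta decreases, so there is no finite maximiser for delta: the estimate simply drifts toward large negative values until the optimiser stops.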
code? example data? We can only guess based on your vague post.

"PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code."

Moreover, this sounds like a statistical question, not a question about R programming, and so might be more appropriate for a statistical list like stats.stackexchange.com .

Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
Jose Marcos Ferraro
2016-Apr-01 13:40 UTC
[R] reduced set of alternatives in package mlogit
On Thursday, 31 March 2016 at 20:22, Bert Gunter wrote:
> code? example data? We can only guess based on your vague post.

Sorry if I was not clear enough, but there is hardly any code to show. The problem is that a parameter or function is lacking (or, most likely, I can't find it), so in some sense the problem itself is that there is no code to show.

In what follows, "choice situations", "alternatives", "wide" and "variables" have the same meaning as in the mlogit documentation. All variables are alternative-specific.

1) I want to estimate a multinomial logit model using the mlogit package.
2) I have a dataset made up of choice situations.
3) There is a set of alternatives.
4) In some choice situations not all alternatives were available, only a subset of them. So there are no variables for the unavailable alternatives, and the chosen alternative evidently belongs to the set of available ones.
5) I use mlogit.data to prepare the dataset from a "wide" data frame. There is no option to specify only a subset of alternatives, so the resulting object contains them all: there is a row for every alternative in every choice situation, even if in reality some of them were not available. The variables for these alternatives do not exist, so they must be filled with 0s or some other made-up value.
6) If one estimates a model from this data, it will be wrong.
7) It is possible to get an "almost right" model by using a dummy variable that marks which alternatives are unavailable. Because it is nonzero only for alternatives that are never chosen, its coefficient becomes negative with a large absolute value, in practice giving those alternatives a probability of almost 0%.
8) This is a workaround, because it forces the model to estimate a number that should be -infinity, and this is known in advance. So it is ugly, and it is difficult to know what the numerical consequences are, since the coefficient can never converge. In fact, for these reasons I don't use it the way I described, preferring a more complex but almost equivalent formulation. The important point is that I want a clean solution, not a workaround.
9) I am simply asking whether the mlogit package has such functionality.
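One possible direction, offered as a hedged sketch and not as a confirmed answer from the package author: start from "long" data instead of "wide" data, with one row per available alternative, so that rows for unavailable alternatives are simply absent. This assumes mlogit accepts such unbalanced choice sets when both alt.var and chid.var are supplied, which should be checked against the documentation of the installed version. The data frame `long`, its column names (chid, alt, cost, choice), and the simulated values are all hypothetical.

```r
library(mlogit)

# Hypothetical toy data in long format: one row per AVAILABLE alternative.
set.seed(1)
alts <- c("bus", "car", "train")
n <- 200

rows <- lapply(seq_len(n), function(i) {
  # Even-numbered situations offer all three alternatives, odd ones only two.
  avail <- if (i %% 2 == 0) alts else sample(alts, 2)
  cost <- runif(length(avail), 1, 10)
  # Simulate a logit choice among the available alternatives only
  # (true cost coefficient -0.5, Gumbel-distributed errors).
  u <- -0.5 * cost - log(-log(runif(length(avail))))
  data.frame(chid = i, alt = avail, cost = cost, choice = u == max(u))
})
long <- do.call(rbind, rows)

# Assumption: rows for unavailable alternatives may simply be missing
# when the data are declared as shape = "long".
md <- mlogit.data(long, choice = "choice", shape = "long",
                  alt.var = "alt", chid.var = "chid")

# One generic coefficient on cost, no alternative-specific intercepts.
fit <- mlogit(choice ~ cost | 0, data = md)
summary(fit)
```

If this runs as assumed, no made-up values and no availability dummy are needed, because each choice probability is computed over the alternatives actually present in that choice situation.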