Franckx Laurent
2014-Jan-29 15:33 UTC
[R] error message "system is computationally singular" under mlogit
Dear all,

I am trying to estimate a multinomial logit model with mlogit.

The data I use for the estimation have the following format (in the full data set, there are many more explanatory variables, but I omit them here for the sake of simplicity):

> head(sample)
  choice cl_vint_com  gezinsid   pr_tot
1      0           1 411060112 2176.015
2      0           2 411060112 2240.531
3      0           3 411060112 3649.945
4      0           4 411060112 3255.782
5      0           5 411060112 5391.076
6      0           6 411060112 3740.085

"choice" is 1 if the alternative is chosen and 0 otherwise
"cl_vint_com" is the ID of the alternative
"gezinsid" is the ID of the individuals
"pr_tot" is the price of each alternative.

I use the following steps for the estimation:

two_carmodel_data <- mlogit.data(sample, choice = "choice", shape = "long", alt.var = "cl_vint_com", chid.var = "gezinsid")
formula_2cars <- mFormula(choice ~ pr_tot)
mod_res <- mlogit(formula_2cars, two_carmodel_data)

This leads to the following error message:

Error in solve.default(H, g[!fixed]) : system is computationally singular: reciprocal condition number = 5.17802e-25

From the documentation of the mlogit package, I do not see any mistake in my formulation of mFormula() (pr_tot is an alternative-specific variable).

I have found some internet discussions of users facing a similar problem, and I understand that in some cases, this may be due to some alternatives that are never chosen in the sample (leading to separation problems).

However, I have eliminated all alternatives that are never chosen, and the problem persists. The output of the following table confirms that no quasi-separation occurs:

table(sample$pr_tot, sample$choice)

I have probably made a very trivial mistake leading to collinearity, but I do not see where (after all, mlogit takes care of setting the alternative-specific constant for the 1st alternative to zero, doesn't it?)

Please advise.

Laurent Franckx, PhD
Senior researcher sustainable mobility
VITO NV | Boeretang 200 | 2400 Mol
Tel. ++ 32 14 33 58 22 | mob. +32 479 25 59 07 | Skype: laurent.franckx | laurent.franckx at vito.be | Twitter @LaurentFranckx

# verify whether all cl_vint are chosen to avoid strict separation
test_for_separ <- table(input_for_NLOGIT_all_alt_joined$cl_vint_com, input_for_NLOGIT_all_alt_joined$choice)
# keep the rows (alternatives) whose count in the "1" (chosen) column is zero
test_for_separ_sel <- test_for_separ[test_for_separ[, "1"] == 0, , drop = FALSE]
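
For readers unfamiliar with the message itself, here is a minimal base-R sketch of when solve() raises it. The matrix H and vector g below are made up for illustration and are not taken from the mlogit fit above; the sketch only shows what the error means and does not diagnose the model in this thread.

```r
# Hypothetical 2x2 "Hessian" whose diagonal entries differ by 20 orders of
# magnitude; it is invertible in exact arithmetic but very badly conditioned.
H <- diag(c(1e10, 1e-10))
g <- c(1, 1)

# Estimated reciprocal condition number of H (values near 0 mean
# "numerically close to singular").
rc <- rcond(H)

# solve() refuses to proceed when the reciprocal condition number falls
# below its tolerance (by default .Machine$double.eps).
msg <- tryCatch(
  {
    solve(H, g)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(rc)
print(msg)
```

A reciprocal condition number as small as the one reported above means the matrix passed to solve() is, for numerical purposes, not invertible.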
Bert Gunter
2014-Jan-29 15:42 UTC
[R] error message "system is computationally singular" under mlogit
Advice: I would guess that you are overfitting. Simplify your model. Drop some of the variables. You probably have (near) linear dependencies in your design matrix.

--
Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch
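
One way to look for the kind of linear dependency mentioned above is to compare the rank of the design matrix with its number of columns. The following is a minimal sketch using base R only; the data frame toy and its columns x1, x2, x3 are hypothetical and stand in for the real explanatory variables, and this is not how mlogit builds its model matrix internally.

```r
# Hypothetical data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
n <- 50
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$x3 <- toy$x1 + toy$x2

# Design matrix with an intercept and the three regressors.
X <- model.matrix(~ x1 + x2 + x3, data = toy)

# QR decomposition; $rank is the number of columns judged linearly independent.
qr_X <- qr(X)
rank_X <- qr_X$rank

# Columns pivoted past the rank are the ones flagged as (near) dependent.
dependent <- if (rank_X < ncol(X)) {
  colnames(X)[qr_X$pivot[seq(rank_X + 1, ncol(X))]]
} else {
  character(0)
}

print(c(columns = ncol(X), rank = rank_X))
print(dependent)
```

If the rank is smaller than the number of columns, dropping the flagged variables (or the variables they duplicate) is one way to follow the advice above.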