I am trying to estimate multinomial logit models from a .csv table in IDCASE IDALT (long) format, where I have

  ROWS HHID PERID CASE ALTNUM NUMALTS CHOSEN  IVTT  OVTT  TVTT   COST  DIST WKZONE HMZONE RSPOPDEN RSEMPDEN WKPOPDEN ....
1    1    2     1    1      1       5      1 13.38  2.00 15.38  70.63  7.69    664    726    15.52     9.96    37.26
2    2    2     1    1      2       5      0 18.38  2.00 20.38  35.32  7.69    664    726    15.52     9.96    37.26
3    3    2     1    1      3       5      0 20.38  2.00 22.38  20.18  7.69    664    726    15.52     9.96    37.26
4    4    2     1    1      4       5      0 25.90 15.20 41.10 115.64  7.69    664    726    15.52     9.96    37.26
5    5    2     1    1      5       5      0 40.50  2.00 42.50   0.00  7.69    664    726    15.52     9.96    37.26
6    6    3     1    2      1       5      0 29.92 10.00 39.92 390.81 11.62    738      9    35.81    53.33    32.91
7    7    3     1    2      2       5      0 34.92 10.00 44.92 195.40 11.62    738      9    35.81    53.33    32.91
8    8    3     1    2      3       5      0 21.92 10.00 31.92  97.97 11.62    738      9    35.81    53.33    32.91
9    9    3     1    2      4       5      1 22.96 14.20 37.16 185.00 11.62    738      9    35.81    53.33    32.91

To bring the data.frame into R, I use

> hbwtrips<-read.csv("workdata.csv",header=TRUE, sep=",",dec=".",row.names=NULL)

Everything is fine at this point: I can generate descriptive statistics, estimate regression models, and do everything else a good data.frame should allow. But when I try to put the data into mlogit.data with

> hbwmode<-mlogit.data(hbwtrips,varying=c(8:11),shape="long",choice="CHOSEN", alt.var="ALTNUM")

I get

Error in `row.names<-.data.frame`(`*tmp*`, value = c("1.1", "1.2", "1.3", :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1.1’, ‘10.1’, ‘10.4’, ‘100.4’, ‘1000.2’, ‘1001.1’, ‘1001.3’, ‘1002.2’, ‘1002.3’, ‘1003.4’, ‘1004.1’, ‘1004.2’, ‘1005.1’, ‘1005.3’, ‘1007.2’, ‘1008.3’, ‘1008.4’, ‘1009.1’, ‘1009.2’, ‘1009.3’, ‘101.1’, ‘1010.4’, ‘1011.1’, ‘1012.2’, ‘1013.3’, ‘1013.4’, [..., truncated]

It seems as though the mlogit.data command tries to reassign my row.names and doesn't do it right. Is this accurate? How do I move forward?
--
Gregory Macfarlane, EIT
Graduate Research Assistant
School of Civil and Environmental Engineering
Georgia Institute of Technology
gregmacfarlane@gmail.com
801.616.9822
On Feb 28, 2011; 10:33pm Gregory Macfarlane wrote:
>> It seems as though the mlogit.data command tries to reassign my row.names,
>> and doesn't do it right. Is this accurate? How do I move forward?

Take the time to do as the posting guide asks you to do (and maybe consider the possibility that you have got it wrong). On that point, "Z" gave some further pointedly sound advice.

Hope that helps,
Mark.
--
View this message in context: http://r.789695.n4.nabble.com/mlogit-data-tp3328739p3328865.html
Sent from the R help mailing list archive at Nabble.com.
workdata.csv: http://r.789695.n4.nabble.com/file/n3329821/workdata.csv

The code I posted is exactly what I am running. What you need is this data. Here is the code again.

> hbwmode<-mlogit.data("worktrips.csv", shape="long", choice="CHOSEN", alt.var="ALTNUM")
> hbwmode<-mlogit.data(hbwtrips, shape="long", choice="CHOSEN", alt.var="ALTNUM")
My previous posting seems to have got mangled. This reposts it.

On Mar 01, 2011; 03:32pm gmacfarlane wrote:
>> workdata.csv
>> The code I posted is exactly what I am running. What you need is this
>> data. Here is the code again.
> hbwmode<-mlogit.data("worktrips.csv", shape="long", choice="CHOSEN", alt.var="ALTNUM")
> hbwmode<-mlogit.data(hbwtrips, shape="long", choice="CHOSEN", alt.var="ALTNUM")

You still have not done what the posting guide asks for, but have expected me (or someone else) to scrutinize a large unknown data set (22003 rows). Fortunately there are other routes. Had you studied Yves Croissant's examples (?mlogit.data), which do work, you would have seen that your input or "raw" data have to have a particular format for mlogit.data to work. In particular, the "alt.var" ("mode" in the TravelMode data set and "ALTNUM" in your data set) has to go through all its levels in sequence. Yours doesn't (your variable has 6 levels but sometimes runs from 1 to 5, sometimes from 2 to 6, and so on). Within each run there must be only one choice.
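The two conditions described above can be checked in base R before calling mlogit.data. What follows is only a minimal sketch: it assumes a long-format data frame with columns named CASE, ALTNUM and CHOSEN as in the posted data, the helper name check_long_format is hypothetical, and the toy rows copy only the CASE, ALTNUM and CHOSEN values of cases 1, 3 and 5 from the posted file.

```r
# Hypothetical helper (not part of mlogit): one summary row per choice situation.
check_long_format <- function(df, chid = "CASE", alt = "ALTNUM", choice = "CHOSEN") {
  all_alts <- sort(unique(df[[alt]]))   # every alternative seen anywhere
  groups <- split(df, df[[chid]])       # one data frame per choice situation
  data.frame(
    chid = names(groups),
    # number of rows (alternatives) in this choice situation
    n_alts = vapply(groups, nrow, integer(1), USE.NAMES = FALSE),
    # TRUE only if the situation lists every alternative exactly once
    full_set = vapply(groups, function(g) {
      nrow(g) == length(all_alts) && setequal(g[[alt]], all_alts)
    }, logical(1), USE.NAMES = FALSE),
    # how many rows are flagged as chosen (should be 1)
    n_chosen = vapply(groups, function(g) as.numeric(sum(g[[choice]])),
                      numeric(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Toy rows: cases 1, 3 and 5 of the posted data (CASE, ALTNUM, CHOSEN only).
toy <- data.frame(
  CASE   = c(rep(1, 5), rep(3, 4), rep(5, 4)),
  ALTNUM = c(1:5, 1:4, 2:5),
  CHOSEN = c(1, 0, 0, 0, 0,  1, 0, 0, 0,  1, 0, 0, 0)
)

report <- check_long_format(toy)
print(report)
```

Rows of the report with full_set equal to FALSE, or n_chosen different from 1, point at the choice situations that break the layout described above.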
##
> library(mlogit)
> data("TravelMode", package = "AER")
> head(TravelMode, n= 20)
   individual  mode choice wait vcost travel gcost income size
 1          1   air     no   69    59    100    70     35    1
 2          1 train     no   34    31    372    71     35    1
 3          1   bus     no   35    25    417    70     35    1
 4          1   car    yes    0    10    180    30     35    1
 5          2   air     no   64    58     68    68     30    2
 6          2 train     no   44    31    354    84     30    2
 7          2   bus     no   53    25    399    85     30    2
 8          2   car    yes    0    11    255    50     30    2
 9          3   air     no   69   115    125   129     40    1
10          3 train     no   34    98    892   195     40    1
11          3   bus     no   35    53    882   149     40    1
12          3   car    yes    0    23    720   101     40    1
13          4   air     no   64    49     68    59     70    3
14          4 train     no   44    26    354    79     70    3
15          4   bus     no   53    21    399    81     70    3
16          4   car    yes    0     5    180    32     70    3
17          5   air     no   64    60    144    82     45    2
18          5 train     no   44    32    404    93     45    2
19          5   bus     no   53    26    449    94     45    2
20          5   car    yes    0     8    600    99     45    2

When we look at just the relevant part of your data we have the following:

> hbwtrips<-read.csv("E:/Downloads/workdata.csv", header=TRUE, sep=",", dec=".", row.names=NULL)
> head(hbwtrips[, c(2:11)], n=25)
   HHID PERID CASE ALTNUM NUMALTS CHOSEN  IVTT  OVTT  TVTT   COST
 1    2     1    1      1       5      1 13.38  2.00 15.38  70.63
 2    2     1    1      2       5      0 18.38  2.00 20.38  35.32
 3    2     1    1      3       5      0 20.38  2.00 22.38  20.18
 4    2     1    1      4       5      0 25.90 15.20 41.10 115.64
 5    2     1    1      5       5      0 40.50  2.00 42.50   0.00
 6    3     1    2      1       5      0 29.92 10.00 39.92 390.81
 7    3     1    2      2       5      0 34.92 10.00 44.92 195.40
 8    3     1    2      3       5      0 21.92 10.00 31.92  97.97
 9    3     1    2      4       5      1 22.96 14.20 37.16 185.00
10    3     1    2      5       5      0 58.95 10.00 68.95   0.00
11    5     1    3      1       4      1  8.60  6.00 14.60  37.76
12    5     1    3      2       4      0 13.60  6.00 19.60  18.88
13    5     1    3      3       4      0 15.60  6.00 21.60  10.79
14    5     1    3      4       4      0 16.87 21.40 38.27 105.00
15    6     1    4      1       4      0 30.60  8.50 39.10 417.32
16    6     1    4      2       4      0 35.70  8.50 44.20 208.66
17    6     1    4      3       4      0 22.70  8.50 31.20 105.54
18    6     1    4      4       4      1 24.27  9.00 33.27 193.49
19    8     2    5      2       4      1 23.04  3.00 26.04  29.95
20    8     2    5      3       4      0 25.04  3.00 28.04  17.12
21    8     2    5      4       4      0 25.04 23.50 48.54 100.00
22    8     2    5      5       4      0 34.35  3.00 37.35   0.00
23    8     3    6      2       5      0 11.14  3.50 14.64  14.00
24    8     3    6      3       5      0 13.14  3.50 16.64   8.00
25    8     3    6      4       5      1  3.95 16.24 20.19 100.00

To show you that this is so, we will mock up two variables that have the characteristics described above and use them to execute the function.

##
# Mock choice: within every block of 11 rows only the last row is chosen
hbwtrips$CHOICEN <- rep(c(rep(0,10),1), 2003)
# Mock alternative: 11 levels (A to K) cycling in order
hbwtrips$ALTNUMTest <- gl(11,1,22033, labels=LETTERS[1:11])
hbwtrips[1:30, c(1:11,44,45)]
hbwmode <- mlogit.data(hbwtrips, varying=c(8:11), shape="long", choice="CHOICEN", alt.var="ALTNUMTest")

Hope that helps,
Regards,
Mark.
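The row names in the error message have the form <choice id>.<alternative>. One way such names can collide is if the choice id is guessed from row position, in blocks as long as the number of alternative levels, instead of being read from a column. The sketch below reproduces that effect in base R. The positional rule is an assumption made for illustration, not a confirmed description of what mlogit.data does internally; the toy rows reuse the CASE and ALTNUM values of cases 3, 4 and 5 from the listing above.

```r
# Toy long-format rows: cases 3 and 4 offer alternatives 1-4, case 5 offers 2-5.
toy <- data.frame(
  CASE   = rep(c(3, 4, 5), each = 4),
  ALTNUM = c(1:4, 1:4, 2:5)
)

# ASSUMPTION (illustrative only): the choice id is inferred from row position,
# in blocks of J rows, where J is the number of distinct alternatives overall.
J <- length(unique(toy$ALTNUM))
positional_id <- ceiling(seq_len(nrow(toy)) / J)

# Candidate row names built as "<choice id>.<alternative>"
positional_names <- paste(positional_id, toy$ALTNUM, sep = ".")
explicit_names   <- paste(toy$CASE, toy$ALTNUM, sep = ".")

# Names that occur more than once under each scheme
dup_positional <- unique(positional_names[duplicated(positional_names)])
dup_explicit   <- unique(explicit_names[duplicated(explicit_names)])

print(dup_positional)
print(dup_explicit)
```

With ids taken from the CASE column every name is unique, whereas the positional ids repeat some names as soon as a choice situation has fewer rows than there are alternative levels. mlogit.data also takes a chid.var argument naming the choice-situation column (CASE here); the thread does not confirm whether supplying it resolves the error for this data.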
By the same token, is it possible to run mlogit on a data set in which the number of alternatives (levels of the "alt.var") differs between individuals? Would this data set be valid with the addition of the last record below (row 9, bike)?

  individual  mode choice wait vcost travel gcost income size
1          1   air     no   69    59    100    70     35    1
2          1 train     no   34    31    372    71     35    1
3          1   bus     no   35    25    417    70     35    1
4          1   car    yes    0    10    180    30     35    1
5          2   air     no   64    58     68    68     30    2
6          2 train     no   44    31    354    84     30    2
7          2   bus     no   53    25    399    85     30    2
8          2   car    yes    0    11    255    50     30    2
9          2  bike     no    0    14    322    33     30    2

If not, what would be the best solution for this?

Thanks in advance.
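Whether mlogit.data accepts such a data set is not settled in this thread. As a first step, a base-R cross-tabulation shows which individual has which alternatives and which combinations are missing. This is only a sketch: the data frame name travel is hypothetical and it holds just the individual, mode and choice columns of the nine records above.

```r
# The nine records above, reduced to the columns needed for the check.
travel <- data.frame(
  individual = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  mode   = c("air", "train", "bus", "car", "air", "train", "bus", "car", "bike"),
  choice = c("no", "no", "no", "yes", "no", "no", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Rows = individuals, columns = modes, cells = number of records
counts <- table(travel$individual, travel$mode)

# Alternatives available to each individual
alts_per_individual <- rowSums(counts)

# TRUE only if every individual has the same number of alternatives
balanced <- length(unique(alts_per_individual)) == 1

# Individual/mode combinations with no record at all
cells <- as.data.frame(counts, stringsAsFactors = FALSE)
missing <- cells[cells$Freq == 0, c("Var1", "Var2")]
names(missing) <- c("individual", "mode")

print(alts_per_individual)
print(balanced)
print(missing)
```

Here the only missing combination is individual 1 with bike, so the choice sets are unbalanced: either the data would need a row for that combination, or the estimation routine would have to support choice sets that differ between individuals.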