David Winsemius
2020-Sep-22 15:15 UTC
[R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"
You were told two things about your code:

1) mlogit.data is deprecated by the package authors, so use dfidx.

2) dfidx does not allow duplicate ids in the first two columns.

Which one of those are you asserting is not accurate?

--
David.

On 9/21/20 10:20 PM, Rahul Chakraborty wrote:
> Hello David and everyone,
>
> I am really sorry for not abiding by the specific guidelines in my
> prior communications. I tried to convert the present email to plain
> text format (at least it is showing me so in my gmail client). I have
> also converted the xlsx file into csv format with a .txt extension.
>
> So, my problem is that I need to run a panel mixed logit regression for
> a choice model. There are 3 alternatives, 9 questions for each
> individual and 516 individuals in the data. I have created a csv file in
> long format from the survey questionnaire. Apart from the alternative
> specific variables I have many individual specific variables and most
> of these are dummies (dummy coded). I will use subsets of these in my
> alternative model specifications. So, in my data I have 100 columns
> with 13932 rows (3*9*516). After reading the csv file and creating a
> dataframe 'mydata' I used the following command for mlogit.
>
> mldata1<- mlogit.data(mydata, shape = "long", alt.var = "Alt_name",
> choice = "Choice_binary", id.var = "IND")
>
> It gives me the same error message- Error in 1:nchid : result would be
> too long a vector.
>
> The attached file (csv file with .txt extension) is an example of 2
> individuals each with 3 questions. I have also reduced the number of
> columns to 57. Now, there are 18 rows. But still if I use the same
> command on my new data I get the same error message. Can anyone please
> help me out with this? Because of this error I am stuck at the
> dataframe level.
>
> Thanks in advance.
>
> Regards,
> Rahul Chakraborty
>
> On Tue, Sep 22, 2020 at 4:50 AM David Winsemius <dwinsemius at comcast.net> wrote:
>> @Rahul;
>>
>> You need to learn to post in plain text and attachments may not be xls
>> or xlsx. They need to be text files. And even if they are comma
>> separated files and text, they still need to be named with a txt extension.
>>
>> I'm the only one who got the xlsx file. I got the error regardless of
>> how many columns I omitted, so my guess was possibly incorrect. But I did
>> RTFM. See ?mlogit.data. The mlogit.data function is deprecated and you
>> are told to use the dfidx function. Trying that you now get an error
>> saying: "the two indexes don't define unique observations".
>>
>> > sum(duplicated( dfrm[,1:2]))
>> [1] 12
>> > length(dfrm[,1])
>> [1] 18
>>
>> So of your 18 lines in the example file, most of them appear to be
>> duplicated in their first two columns and apparently that is not
>> allowed by dfidx.
>>
>> Caveat: I'm not a user of the mlogit package so I'm just reading the
>> manual and possibly coming up with informed speculation.
>>
>> Please read the Posting Guide. You have been warned. Repeated violations
>> of the policies laid down in that hallowed document will possibly result
>> in postings being ignored.
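The duplicate count quoted in this message can be reproduced without the attachment. The sketch below is only an illustration: it builds a hypothetical 18-row long table (2 individuals, 3 questions, 3 alternatives) whose values are made up; only the column names IND, QES and ALT_name are taken from the thread. It shows that the first two columns alone repeat, while all three columns together do not.

```r
# Hypothetical stand-in for the 18-row example file (values are invented).
# expand.grid varies its first argument fastest, so rows come out as
# individual 1 / question 1 / alternatives A, B, C, and so on.
toy <- expand.grid(ALT_name = c("A", "B", "C"),
                   QES = 1:3,
                   IND = 1:2,
                   stringsAsFactors = FALSE)
toy <- toy[, c("IND", "QES", "ALT_name")]

n_rows        <- nrow(toy)                    # 18
n_dup_pairs   <- sum(duplicated(toy[, 1:2]))  # 12: (IND, QES) repeats per alternative
n_dup_triples <- sum(duplicated(toy[, 1:3]))  # 0: (IND, QES, ALT_name) is unique

print(c(rows = n_rows, dup_pairs = n_dup_pairs, dup_triples = n_dup_triples))
```

Each (IND, QES) pair occurs once per alternative, so 18 - 6 = 12 rows are flagged. That is what properly arranged long data looks like; it does not by itself mean the file contains bad rows.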
Rui Barradas
2020-Sep-22 17:30 UTC
[R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"
Hello,

I apologize if the rest of the quotes prior to David's email are missing;
for some reason today my mail client is not including them.

As for the question, there are two other problems:

1) Alt_name is misspelled, it should be ALT_name;

2) the data is in wide, not long, format.

A 3rd problem is that ?dfidx says

alt.var
the name of the variable that contains the alternative index (for a long
data.frame only) or the name under which the alternative index will be
stored (the default name is alt)

So if shape = "wide", alt.var is not needed.
But I am not a user of package mlogit, I'm just guessing.

The following seems to fix it (it doesn't throw errors).

mldata1 <- dfidx(mydata, shape = "wide",
                 #alt.var = "ALT_name",
                 choice = "Choice_binary",
                 id.var = "IND")

Hope this helps,

Rui Barradas

At 16:15 on 22/09/20, David Winsemius wrote:
> You were told two things about your code:
>
> 1) mlogit.data is deprecated by the package authors, so use dfidx.
>
> 2) dfidx does not allow duplicate ids in the first two columns.
>
> Which one of those are you asserting is not accurate?
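The misspelling noted in this message (Alt_name versus ALT_name) is the kind of thing a quick check of the column names catches before any mlogit or dfidx call. A minimal base R sketch, assuming a one-row stand-in for 'mydata' (the values are invented; the column names are the ones mentioned in the thread):

```r
# Hypothetical one-row stand-in for 'mydata'; only the names matter here.
mydata <- data.frame(IND = 1, QES = 1, ALT_name = "A", Choice_binary = 0)

# Names as typed in the original call, including the misspelled "Alt_name".
wanted <- c("IND", "Alt_name", "Choice_binary")

# Which requested names do not exist exactly as typed?
missing_cols <- setdiff(wanted, names(mydata))

# Which existing columns differ from a missing name only by letter case?
near_matches <- names(mydata)[tolower(names(mydata)) %in% tolower(missing_cols)]

print(missing_cols)
print(near_matches)
```

R column names are case sensitive, so a name that differs only in case refers to a column that does not exist.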
Rahul Chakraborty
2020-Sep-22 17:51 UTC
[R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"
David,

My apologies for the first one. I was checking different tutorials on
mlogit where they were using mlogit.data, so I ended up using it.

I do not understand what you mean by the "duplicates in first two
columns". See, my first column is IND which identifies my individuals,
the second column is QES which identifies the question number each
individual faces, and the 3rd column is a stratification code that can be
ignored. Columns 6-13 are alternative specific variables and the rest are
individual specific. So the 1st 3 rows indicate the 1st question faced by
the 1st individual, containing 3 alternatives, and so on. So, I have
already arranged the data in long format. Here, I could not work out what
the "duplicate in first two columns" means.

And I am really sorry that there was an error in my code, as Rui has
pointed out. The correct code is

mldata1 <- dfidx(mydata, shape = "long", alt.var = "ALT_name",
                 choice = "Choice_binary", id.var = "IND")

It still shows the error- "the two indexes don't define unique observations"

It would be really helpful if you could kindly help.

Regards,

--
Rahul Chakraborty
Research Fellow
National Institute of Public Finance and Policy
New Delhi- 110067
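The layout described in this message (IND, then QES, then three alternative rows per question) suggests one way to read the "two indexes" error: a row is identified by individual, question and alternative together, so two columns are unique only if one of them identifies the choice situation. The sketch below is an assumption-laden illustration in base R, not a tested fix for the poster's data. The column name chid and the toy values are hypothetical.

```r
# Hypothetical long table: 2 individuals x 3 questions x 3 alternatives.
toy <- expand.grid(ALT_name = c("A", "B", "C"),
                   QES = 1:3,
                   IND = 1:2,
                   stringsAsFactors = FALSE)

# Build a choice-situation id from individual and question.
# 'chid' is an illustrative name, not one used in the poster's file.
toy$chid <- paste(toy$IND, toy$QES, sep = "_")

# (IND, QES) repeats within each question, but (chid, ALT_name) does not.
dup_ind_qes  <- anyDuplicated(toy[, c("IND", "QES")])        # > 0: duplicates exist
dup_chid_alt <- anyDuplicated(toy[, c("chid", "ALT_name")])  # 0: unique

n_situations    <- length(unique(toy$chid))                  # 6 choice situations
alts_per_choice <- as.vector(table(toy$chid))                # 3 rows in each

# Assumption: a pair of columns such as chid and ALT_name could then be
# named as the indexes through the 'idx' argument described in ?dfidx.
# That call is not shown because it has not been run on the real data.
```

Whether dfidx then accepts the full data set is something only a run against the real file can confirm.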
Rui Barradas
2020-Sep-22 21:53 UTC
[R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"
Hello,

Please keep this on the list so that others can give their contribution.

If you have reshaped your data, can you post the code you ran to reshape
it? Right now we only have the original attachment, in wide format, not
the long format data.

Rui Barradas

At 21:55 on 22/09/20, Rahul Chakraborty wrote:
> Hi,
>
> Thank you so much for your reply.
> Yes, thank you for pointing that out, I apologise for that error in
> the variable name. However, my data is in long format.
>
> See, my first column is IND which identifies my individuals,
> second column is QES which identifies the question number each
> individual faces, 3rd column is a stratification code that can be
> ignored. Columns 6-13 are alternative specific variables and rest are
> individual specific. So 1st 3 rows indicate 1st question faced by 1st
> individual containing 3 alternatives, and so on. So, I have already
> arranged the data in long format.
>
> With that in mind if I use shape="long" it still gives me an error.
>
> Best regards,
Rahul Chakraborty
2020-Oct-01 06:24 UTC
[R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"
Hello Rui,

Thanks a lot for your response. But I would still say that the data I
attached is in long format, as it has 18 rows (3 alternatives * 3
questions * 2 individuals). Had it been wide format data it would have
had 6 rows (3 questions * 2 individuals). But, anyway, thanks.

Best,
Rahul

--
Rahul Chakraborty
Research Fellow
National Institute of Public Finance and Policy
New Delhi- 110067
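The row-count argument in this message (18 rows in long format, 6 in wide) can be checked with base R's reshape() on a toy table. This is only a sketch: the 'cost' variable, the 'chid' column and all values are hypothetical, and only the dimensions (2 individuals, 3 questions, 3 alternatives; 516 individuals and 9 questions in the full data) come from the thread.

```r
# Hypothetical long table with one alternative-specific variable.
long <- expand.grid(ALT_name = c("A", "B", "C"),
                    QES = 1:3,
                    IND = 1:2,
                    stringsAsFactors = FALSE)
long$cost <- seq_len(nrow(long))                  # invented values
long$chid <- paste(long$IND, long$QES, sep = "_") # one id per choice situation

# Long -> wide: one row per choice situation, one 'cost' column per alternative.
wide <- reshape(long,
                direction = "wide",
                idvar     = "chid",
                timevar   = "ALT_name",
                v.names   = "cost")

n_long <- nrow(long)          # 3 alternatives * 3 questions * 2 individuals
n_wide <- nrow(wide)          # 3 questions * 2 individuals
full_rows <- 3 * 9 * 516      # row count reported for the full data set

print(c(long = n_long, wide = n_wide, full = full_rows))
```

In the wide result each alternative gets its own column (cost.A, cost.B, cost.C), which is the visible difference between the two shapes.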