I am trying to read in a pipe-delimited file that has rows with a varying number of columns. Here is my sample data:

A|B|C|D
A|B|C|D|E|F
A|B|C|D|E
A|B|C|D|E|F|G|H|I
A|B|C|D
A|B|C|D|E|F|G|H|I|J

You can see line 6 has 10 columns. Yet I can't explain why R does this:

> test <- read.delim("mypaths4.txt", sep="|", quote=NULL, header=F, colClasses="character")
> test
  V1 V2 V3 V4 V5 V6 V7 V8 V9
1  A  B  C  D
2  A  B  C  D  E  F
3  A  B  C  D  E
4  A  B  C  D  E  F  G  H  I
5  A  B  C  D
6  A  B  C  D  E  F  G  H  I
7  J

You can see it moved "J" to row 7. I don't understand why it is not left in position 6,10.

Stranger still, if I remove line 1, so that my data file contains:

A|B|C|D|E|F
A|B|C|D|E
A|B|C|D|E|F|G|H|I
A|B|C|D
A|B|C|D|E|F|G|H|I|J

I get a totally different result:

> test <- read.delim("mypaths5.txt", sep="|", quote=NULL, header=F, colClasses="character")
> test
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1  A  B  C  D  E  F
2  A  B  C  D  E
3  A  B  C  D  E  F  G  H  I
4  A  B  C  D
5  A  B  C  D  E  F  G  H  I   J

What is it that I am doing that is changing the fate of that final "J"? This is just a basic ASCII text file, pipe-delimited as shown. I have been racking my brain on this for a day!

Brian
Duncan Murdoch
2012-Nov-17 21:27 UTC
[R] Strange problem with reading a pipe delimited file
On 12-11-17 4:18 PM, Brian Feeny wrote:
> I am trying to read in a pipe-delimited file that has rows with a varying number of columns. Here is my sample data:
>
> A|B|C|D
> A|B|C|D|E|F
> A|B|C|D|E
> A|B|C|D|E|F|G|H|I
> A|B|C|D
> A|B|C|D|E|F|G|H|I|J
>
> You can see line 6 has 10 columns. Yet I can't explain why R does this:
>
>> test <- read.delim("mypaths4.txt", sep="|", quote=NULL, header=F, colClasses="character")
>> test
>   V1 V2 V3 V4 V5 V6 V7 V8 V9
> 1  A  B  C  D
> 2  A  B  C  D  E  F
> 3  A  B  C  D  E
> 4  A  B  C  D  E  F  G  H  I
> 5  A  B  C  D
> 6  A  B  C  D  E  F  G  H  I
> 7  J
>
> You can see it moved "J" to row 7. I don't understand why it is not left in position 6,10.
>
> Stranger still, if I remove line 1, so that my data file contains:
>
> A|B|C|D|E|F
> A|B|C|D|E
> A|B|C|D|E|F|G|H|I
> A|B|C|D
> A|B|C|D|E|F|G|H|I|J
>
> I get a totally different result:
>
>> test <- read.delim("mypaths5.txt", sep="|", quote=NULL, header=F, colClasses="character")
>> test
>   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
> 1  A  B  C  D  E  F
> 2  A  B  C  D  E
> 3  A  B  C  D  E  F  G  H  I
> 4  A  B  C  D
> 5  A  B  C  D  E  F  G  H  I   J
>
> What is it that I am doing that is changing the fate of that final "J"? This is just a basic ASCII text file, pipe-delimited as shown.

I would suggest reading the help file: read.delim only looks at the first 5 lines to determine the number of columns if you don't specify col.names.

Duncan Murdoch
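
To make the explanation concrete: in the first file the 10-field line is line 6, outside the five lines that are inspected, so the data frame gets 9 columns and the tenth field wraps onto a new row. In the second file the same line is line 5, so 10 columns are detected. One possible workaround, sketched below, is to count the fields on every line with count.fields() and pass a col.names vector of that length. The temporary file, the variable names (path, n_cols) and the V1..V10 column names are assumptions made for illustration, and quote="" is used here in place of the original quote=NULL.

```r
# Recreate the six-line sample file from the question in a temporary location.
path <- tempfile(fileext = ".txt")
writeLines(c("A|B|C|D",
             "A|B|C|D|E|F",
             "A|B|C|D|E",
             "A|B|C|D|E|F|G|H|I",
             "A|B|C|D",
             "A|B|C|D|E|F|G|H|I|J"), path)

# Count the fields on every line of the file, not only the first five.
n_cols <- max(count.fields(path, sep = "|", quote = ""))

# Supplying col.names of that length fixes the number of columns up front,
# so the last line no longer spills onto an extra row.
test <- read.delim(path, sep = "|", quote = "", header = FALSE,
                   colClasses = "character",
                   col.names = paste0("V", seq_len(n_cols)))

dim(test)      # 6 rows, 10 columns
test[6, 10]    # "J"
```

Short rows are padded with empty strings because read.delim uses fill=TRUE by default. For a large file, count.fields() means an extra pass over the data; if the maximum width is known in advance, col.names can be given directly.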