I have some textfiles with about the following format
--
Lorem ipsum|And so on|And on|And on
5|3|4|77|2|3|5
More lorem|And more ipsum|And so forth
--
Just several more lines :)

What would be the easiest way to parse these files? Just iterate line by line and regex-split on "|"?
There are also some of the files that might have the number 5 on line 3 (for example) which means that there will be 5 "blocks" of information (spanning say 3 lines per block), starting with line 4..

Any suggestions?

Christian...

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
Hi Christian,

So you have 2 formats to parse: a single-line format and a multi-line format. It sounds like you could just use String#split for the single-line format:

    data = line.chomp.split('|')

after which data will be an array with one string element for each column. For line 2 of your example, the array would be ["5", "3", "4", "77", "2", "3", "5"]; call to_i on the elements if you need integers. See here for more info on split: http://www.ruby-doc.org/core/classes/String.html#M000818.

For the second format, how do you know the number of lines each block will span?

-Dan

On 9/12/07, Christian Wattengård wrote:
> What would be the easiest way to parse these files? Just iterate line
> by line and regex-split on "|"?
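For the single-line format, a minimal sketch of the split approach might look like the following. The sample line is taken from the example in the original question; the variable names and the "convert only if every field is numeric" rule are assumptions made for illustration, not something the thread specifies.

```ruby
line = "5|3|4|77|2|3|5\n"

# chomp drops the trailing newline so the last field stays clean;
# split returns an array of strings, one per column
fields = line.chomp.split('|')

# Assumption: a row is treated as numeric only when every field is all digits
numbers = fields.all? { |f| f =~ /\A\d+\z/ } ? fields.map { |f| f.to_i } : fields

# By default split drops trailing empty fields; a negative limit keeps them
trimmed = "a||c|".split('|')      # => ["a", "", "c"]
padded  = "a||c|".split('|', -1)  # => ["a", "", "c", ""]
```

The negative limit probably matters here only if a column at the end of a row can be empty and its position still needs to be preserved.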
Yes, I thought the simplest ones would be simple ;)

The number of lines per block is constant. If there is no info in a line, there is just a newline there.
So a 4-line block could look like this:
--
Line1
Line2

Line4
--

Christian...

On Sep 12, 1:56 pm, "Dan Falcone" wrote:
> For the second format, how do you know the number of lines each block will
> span?
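Because every block has the same number of lines and an empty entry still occupies its own line, the lines can be grouped by position alone. A rough sketch, assuming the block lines have already been read into an array (the helper name blocks_of and the constant LINES_PER_BLOCK are made up for this example):

```ruby
LINES_PER_BLOCK = 4 # assumption: the constant block size for this file type

# Groups consecutive lines into fixed-size blocks.
# Blank lines are kept as empty strings so positions inside a block stay stable.
def blocks_of(lines, size = LINES_PER_BLOCK)
  lines.map { |l| l.chomp }.each_slice(size).to_a
end

text = "Line1\nLine2\n\nLine4\n"
blocks = blocks_of(text.lines)
first = blocks.first # => ["Line1", "Line2", "", "Line4"]
```

Note that the lines must not be filtered for blanks before slicing, otherwise the third entry above would disappear and Line4 would shift into its place.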
From your original email:
> There are also some of the files that might have the number 5 on line
> 3 (for example) which means that there will be 5 "blocks" of
> information (spanning say 3 lines per block), starting with line 4.

Does the 5 mean 5 "blocks" of information, or 5 lines of information? I'll assume 5 lines for now; let me know if that's incorrect. This isn't the prettiest code, but it should work:

    in_block = false
    count = 0
    max = nil

    File.foreach(file) do |line|
      line = line.chomp

      if in_block
        # do something with this line of the block (add it to an array, etc.)

        # update the block counter and leave the block after max lines;
        # the finished block could be processed at this point
        count += 1
        in_block = false if count == max
        next
      end

      if line.strip =~ /\A\d+\z/
        # a line holding only a number starts a block of that many lines
        count = 0
        max = line.strip.to_i
        in_block = max > 0
      else
        # parse a regular line
        data = line.split('|')
        # do something with the line data here...
      end
    end

-Dan
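If the number instead means a count of blocks, each spanning a fixed number of lines (which is how the original question reads), the loop could be adapted roughly as below. This is only a sketch under assumptions: parse_lines is a hypothetical helper, lines_per_block defaults to the 3 used in the question's example, and any line made up solely of digits outside a block is taken to be a block count.

```ruby
# Hypothetical helper: turns an array of raw lines into a list of records.
# Regular lines become { type: :row, fields: [...] };
# a count line plus its blocks become { type: :blocks, blocks: [[...], ...] }.
def parse_lines(lines, lines_per_block = 3)
  rows = lines.map { |l| l.chomp }
  records = []
  i = 0

  while i < rows.length
    row = rows[i]

    if row.strip =~ /\A\d+\z/
      # count line: read that many blocks of lines_per_block lines each
      block_count = row.strip.to_i
      i += 1
      blocks = []
      block_count.times do
        # a truncated file yields a short (or empty) final block
        blocks << (rows[i, lines_per_block] || [])
        i += lines_per_block
      end
      records << { type: :blocks, blocks: blocks }
    else
      # regular pipe-separated line; -1 keeps trailing empty columns
      records << { type: :row, fields: row.split('|', -1) }
      i += 1
    end
  end

  records
end

sample = [
  "Lorem ipsum|And so on\n",
  "5|3|4\n",
  "2\n",
  "a1\n", "\n", "a3\n",
  "b1\n", "b2\n", "b3\n",
  "More|lorem\n"
]
records = parse_lines(sample, 3)
```

Reading the whole file into an array first (for example with File.readlines) keeps the index arithmetic simple; for very large files the line-by-line loop with counters would be the more memory-friendly choice.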