Hi,

I have a string:

my_string="blablablabla<coordinates>substring</coordinates>blabla"

I need to extract the text between "<coordinates>" and "</coordinates>".

How can I do that?

Thanks for your help,
JF
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
my_string="blablablabla<coordinates>substring</coordinates>blabla"
# The parentheses below define a capture group inside the overall regex pattern
sub_string = /.*<coordinates>(.*)<\/coordinates>.*/.match(my_string)
# Index 0 is the whole match; index 1 is the first capture group
puts sub_string[1]

A regex is usually the quickest and most effective option for one-off text parsing. Another good option is why the lucky stiff's Hpricot:
http://code.whytheluckystiff.net/hpricot/

Hank
On Sep 21, 2008, at 5:03 PM, blasterpal wrote:
> sub_string = /.*<coordinates>(.*)<\/coordinates>.*/.match(my_string)

You probably want the regexp to be:

/<coordinates>(.*)<\/coordinates>/

so there's less backtracking when the leading .* first tries to gobble everything.

You might also need something like:

/<coordinates\b[^>]*>(.*)<\/coordinates>/

if there can be any attributes on the coordinates tag.

Of course, if you really do have XML in my_string, a true parser like Hpricot or REXML will be more reliable than regular expressions. For example, if you had to match against:

"blahblah<coordinates>first one</coordinates>yadayadayada<coordinates>oops! another one</coordinates>yakyakyak"

would you want the substring to be:

"first one</coordinates>yadayadayada<coordinates>oops! another one"

(yeah, I didn't think so ;-)

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob-xa9cJyRlE0mWcWVYNo9pwxS2lgjeYSpx@public.gmane.org
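To make Rob's two-tag example concrete, here is a minimal sketch comparing the greedy capture with a non-greedy one. The non-greedy `(.*?)` quantifier and the use of `String#scan` are not suggested anywhere in the thread; they are shown here only as one possible way to stay with regular expressions, and they still assume the tags are never nested and the content contains no newlines (`.` does not match a newline without the `/m` flag).

```ruby
# Rob's example string with two <coordinates> elements.
text = "blahblah<coordinates>first one</coordinates>yadayadayada" \
       "<coordinates>oops! another one</coordinates>yakyakyak"

# Greedy: (.*) extends to the LAST closing tag in the string.
greedy = text[/<coordinates>(.*)<\/coordinates>/, 1]

# Non-greedy (an assumption, not from the thread): (.*?) stops at the
# FIRST closing tag it can reach.
lazy = text[/<coordinates>(.*?)<\/coordinates>/, 1]

# scan returns one array of captures per match; flatten gives a flat list.
# The \b[^>]* part is Rob's allowance for attributes on the opening tag.
all_values = text.scan(/<coordinates\b[^>]*>(.*?)<\/coordinates>/).flatten

puts greedy
puts lazy
p all_values
```

The second argument to `String#[]` selects the capture group, so no intermediate MatchData variable is needed. For anything beyond flat, single-line tags, Rob's advice to use a real parser still applies.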
On Sun, Sep 21, 2008 at 1:28 PM, jef <jfferriere-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I have a string:
> my_string="blablablabla<coordinates>substring</coordinates>blabla"
>
> I need to extract the sentence between "<coordinates>" and "</coordinates>"
>
> How can I do that?

Hi,

I would recommend using Hpricot. You can find the documentation here:

http://code.whytheluckystiff.net/doc/hpricot

Good luck,

-Conrad
Hi,

/.*<coordinates>(.*)<\/coordinates>.*/

The regexp you gave works fine. I tested it with Rubular.

The problem is that I can't retrieve the substring; I always get the whole string.

Here is what I did:

irb(main):001:0> st="<Point><coordinates>-0.954850,46.436960,0</coordinates></Point>"
=> "<Point><coordinates>-0.954850,46.436960,0</coordinates></Point>"
irb(main):002:0> sub=/.*<coordinates>(.*)<\/coordinates>.*/.match(st)
=> #<MatchData:0x7f2040045fd0>
irb(main):003:0> sub.inspect
=> "#<MatchData:0x7f2040045fd0>"
irb(main):004:0> sub.to_s
=> "<Point><coordinates>-0.954850,46.436960,0</coordinates></Point>"
irb(main):005:0> sub.string
=> "<Point><coordinates>-0.954850,46.436960,0</coordinates></Point>"
irb(main):006:0> st.match(/.*<coordinates>(.*)<\/coordinates>.*/)
=> #<MatchData:0x7f2040019fc0>
irb(main):007:0> st.match(/.*<coordinates>(.*)<\/coordinates>.*/).to_s
=> "<Point><coordinates>-0.954850,46.436960,0</coordinates></Point>"

Thank you for your help.
On 22 sep, 14:20, jef <jfferri...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> probleme I can retrieve the substring I always get the whole string.

Regexp#match(string) returns a MatchData object, which is not just the matched text. It can be indexed like an Array:

sub[0] returns the entire matched string
sub[1], sub[2], ... return the values of the capture groups (the parts of the pattern between parentheses)

sub[1] is therefore what you want to use. There is no need for to_s, which returns the entire match, the same as sub[0].
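As a small illustration of the indexing described above, here is a sketch using the string from jef's irb session. The variable names `whole`, `coords`, and `parts` are made up for this example, the pattern is shortened to the tag pair with a non-greedy capture (an assumption, not the pattern used in the session), and the final split on commas is only one guess at what might be done with the value next.

```ruby
st = "<Point><coordinates>-0.954850,46.436960,0</coordinates></Point>"

# match returns a MatchData object on success, or nil when nothing matches.
m = /<coordinates>(.*?)<\/coordinates>/.match(st)

if m
  whole  = m[0]               # entire matched text, tags included
  coords = m[1]               # first capture group: the text between the tags
  parts  = coords.split(",")  # hypothetical next step: split into fields
  puts whole
  puts coords
  p parts
  p m.captures                # all capture groups as an Array
else
  puts "no <coordinates> element found"
end
```

Checking for nil before indexing matters because calling `[1]` on a failed match raises a NoMethodError.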
On 22 sep, 14:36, deegee <dirkgro...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> sub[1] is therefore the thing you want to use. No need to use to_s.

Ah, OK. Thank you all for your help.