Hello,

I need some help with a regex to detect first and last names. This is what I currently have:

/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/

It detects a first and last name as two adjacent words that each begin with a capital letter, so it will match:

John Smith
Jane Smith

I run into problems when the name is close to the beginning of a sentence:

Having John Smith over for dinner. --- This will match "Having John"
Getting Jane Smith ready for school. --- This will match "Getting Jane"

Do you know how to write a regex that ignores the first word whenever three capitalized words are next to each other?

Thanks!
-A

--
Posted via http://www.ruby-forum.com/.

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
First you have to check whether there are three capitalized words or two:

if str.match(/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/)
  # Do something
elsif str.match(/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/)
  # Do something
end

I hope this helps.

Thanks,
Brijesh Shah
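To make the "three words first, then two" check concrete, here is a minimal sketch. The helper name extract_name and the two constants are hypothetical (they are not from the thread), and the capture groups are placed so that the leading word of a three-word run is dropped:

```ruby
# Hypothetical helper, not from the original posts.
# THREE_WORDS skips the first capitalized word and captures the next two.
THREE_WORDS = /\b[A-Z][a-zA-Z]* ([A-Z][a-zA-Z]*) ([A-Z][a-zA-Z]*)\b/
# TWO_WORDS captures both words of a plain two-word run.
TWO_WORDS   = /\b([A-Z][a-zA-Z]*) ([A-Z][a-zA-Z]*)\b/

# Returns [first, last] for the first candidate found, or nil if there is none.
def extract_name(str)
  m = str.match(THREE_WORDS) || str.match(TWO_WORDS)
  m && [m[1], m[2]]
end

p extract_name("Having John Smith over for dinner.")
p extract_name("We invited Jane Smith.")
```

This only looks at the first candidate in the string, and it assumes the words are separated by exactly one space, as in the original pattern.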
That's close. You want something like

/\A([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)/

which gives you

irb(main):021:0> x
=> "Having Jane Smith"
irb(main):022:0> x =~ /\A([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)/
=> 0
irb(main):023:0> $1
=> "Having"
irb(main):024:0> $2
=> "Jane"
irb(main):025:0> $3
=> "Smith"

Note that \A anchors the match to the start of the string, so this only applies when the string itself begins with the three capitalized words.

On Mar 5, 12:21 am, Brijesh Shah <li...-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> first you have to check whether there are three capitalized words
> or two..
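Because \A only matches at the very start of the string, a different approach is needed for names that follow a sentence start in the middle of a longer text. One possible sketch, assuming (as the original poster does) that the name is always the last two words of a capitalized run: scan for runs of two or more capitalized words and keep the last two of each. The names CAP, CAP_RUN, and candidate_names are made up for this illustration.

```ruby
# Source for one capitalized word; an assumption carried over from the thread's pattern.
CAP = "[A-Z][a-zA-Z]*"
# A run of two or more capitalized words separated by whitespace.
CAP_RUN = /\b#{CAP}(?:\s+#{CAP})+\b/

# Hypothetical helper: returns the last two words of every capitalized run.
def candidate_names(text)
  text.scan(CAP_RUN).map { |run| run.split.last(2).join(" ") }
end

text = "Having John Smith over for dinner. Getting Jane Smith ready for school."
p candidate_names(text)
```

Punctuation ends a run, so "dinner. Getting" does not glue the two sentences together. This is still only a heuristic and will happily return place names or titles.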
On Thu, Mar 4, 2010 at 10:53 PM, Allan Last <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> I run into problems where the name is close to the beginning of the
> sentence:
> Having John Smith over for dinner. --- This will look at "Having John"
> Getting Jane Smith ready for school. --- This will look at "Getting
> Jane"
>
> Do you know how to do a RegEx where it will ignore the first word
> whenever three capitalized words are next to each other? Thanks!

You know this is not something you're going to solve with regular expressions, though, right? :-)

"San Francisco's Jane Smith, quoted in Broder's Washington Post article, said ..."

You need a lot more heuristics than a simple RegEx to reliably find names in a block of text.

--
Hassan Schroeder ------------------------ hassan.schroeder-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
twitter: @hassan
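To see the point about false positives, the original two-word pattern can be run against the example sentence. This is just a quick illustration; the constant name NAIVE is made up here.

```ruby
# The pattern from the original question, without the outer capture group
# so that String#scan returns whole matches.
NAIVE = /[A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*/

text = "San Francisco's Jane Smith, quoted in Broder's Washington Post article, said ..."
matches = text.scan(NAIVE)
p matches  # => ["San Francisco", "Jane Smith", "Washington Post"]
```

Only one of the three matches is a person's name; the pattern alone cannot tell a place or a newspaper from a person.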
On Fri, Mar 5, 2010 at 12:16 PM, Hassan Schroeder <hassan.schroeder-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> You know this is not something you're going to solve with regular
> expressions, though, right? :-)
>
> "San Francisco's Jane Smith, quoted in Broder's Washington Post
> article, said ..."
>
> You need a lot more heuristics than a simple RegEx to reliably find
> names in a block of text.

Some other cases to consider:

John Philip Sousa (or, if you're a kid at heart, John Jacob Jingleheimer Schmidt), not to mention Spanish names, which can have MANY parts.

Robert De Niro

Jesus Mary and Joseph

Surnames with origins in some languages don't start with a capital:

Michael Henry de Young - Dutch
Wernher von Braun - German

--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale
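As a rough illustration of the surname-particle cases, one option is to let the surname start with an optional particle from a fixed list. Everything named here (PARTICLES, PARTICLE, NAME, split_name) is hypothetical, and the particle list is a deliberately short, incomplete assumption:

```ruby
# Assumed, incomplete list of surname particles; extend as needed.
PARTICLES = %w[de von van der].freeze
# Case-insensitive so that both "de" and "De" are accepted.
PARTICLE = /(?:#{PARTICLES.join("|")})/i
# First name, then a surname that may begin with one or more particles.
NAME = /\b([A-Z][a-zA-Z]*)\s+((?:#{PARTICLE}\s+)*[A-Z][a-zA-Z]*)\b/

# Hypothetical helper: returns [first, last] or nil.
def split_name(str)
  m = str.match(NAME)
  m && m.captures
end

p split_name("Robert De Niro")
p split_name("Wernher von Braun")
p split_name("Michael Henry de Young")
```

The first two come back with the particle attached to the surname. The third returns "Michael" and "Henry", which shows that middle names still defeat this kind of pattern.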
Thanks for the suggestions. I'm going to play around with this.

For the most part, I'm doing detection for scenarios with two names, so names like Robert De Niro will not come up.

-A

Rick Denatale wrote:
> Some other cases to consider
>
> John Phillip Sousa (or if you're a kid a heart John Jacob Jingelheimer
> Smith) not to mention Spanish names which can have MANY parts.
>
> Robert De Niro
On Fri, Mar 5, 2010 at 10:06 PM, Allan Last <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> Thanks for the suggestions. I'm going to play around with this.
>
> On the most part, I'm doing detection for scenarios with two names, so
> names like Robert De Niro will not come up.

I'm pretty sure, though, that the actor would say he HAD two names, and that his first name was "Robert" and his last name was "De Niro".

--
Rick DeNatale