Hi all,

How can I parse an email string like this into an array?

#--------------------------------------------
"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com
#--------------------------------------------

Email addresses from user input seem to come in various formats, and I have to split them into an array. Any ideas?

Regards.

-- Posted via http://www.ruby-forum.com/.

On Tuesday, 14.03.2006, at 04:05 +0100, Joe Black wrote:
> hi all
> how to parse such email string to a array:
> #--------------------------------------------
> "joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
> joe_3@joe_black.com
> #--------------------------------------------
> seems emails from user input include various format. and i have to
> split them to a array.
> any idea?

    string = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com, joe_3@joe_black.com"
    # scan returns one sub-array per match when the pattern contains a group,
    # so flatten the result to get a plain array of address strings
    email_addresses = string.scan(/([a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+)/i).flatten

--
Norman Timmler
http://blog.inlet-media.de

On Tue, Mar 14, 2006 at 10:53:18AM +0100, Norman Timmler wrote:
> string = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com,
> joe_3@joe_black.com"
> email_addresses = string.scan(/([a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+)/i)

That will catch many addresses, but it will exclude other perfectly valid ones.

RFC2822 http://www.faqs.org/rfcs/rfc2822.html is the place to look. The local part (left-hand side) of an address can include:

atext = ALPHA / DIGIT / ; Any character except controls,
        "!" / "#" /     ;  SP, and specials.
        "$" / "%" /     ;  Used for atoms
        "&" / "'" /
        "*" / "+" /
        "-" / "/" /
        "=" / "?" /
        "^" / "_" /
        "`" / "{" /
        "|" / "}" /
        "~"

as well as the "." character.

Hopefully there is some library that parses addresses into canonical form that can be used, rather than a simple regexp?

-jim
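
As a rough sketch of what widening the local part to the atext set quoted above could look like in Ruby: the names ATEXT_SPECIALS, LOCAL_CLASS, ADDRESS and extract_addresses are made up for illustration, and the domain pattern (including the underscore, which the example addresses in this thread use) is an assumption, not something RFC 2822 prescribes. Quoted local parts and comments are not handled.

```ruby
# Special characters RFC 2822 lists under "atext" (besides letters and digits).
# %q(...) is used so that "#$" is not treated as interpolation.
ATEXT_SPECIALS = %q(!#$%&'*+/=?^_`{|}~-)

# Character class for an unquoted local part: letters, digits, "." and atext specials.
LOCAL_CLASS = "[A-Za-z0-9.#{Regexp.escape(ATEXT_SPECIALS)}]"

# Assumed domain shape: two or more dot-separated labels of letters, digits, "_" or "-".
ADDRESS = /#{LOCAL_CLASS}+@[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)+/

# Returns every substring of +text+ that looks like an address.
# The pattern has no capturing groups, so scan yields a flat array of strings.
def extract_addresses(text)
  text.scan(ADDRESS)
end

input = %q("joe black"<joe_1@joe_black.com> joe_2@joe_black.com, joe_3@joe_black.com)
p extract_addresses(input)
```

The trade-off is that a wider character class also picks up more surrounding text: in a string such as "email=joe@example.com" the "email=" prefix becomes part of the match, because "=" is a legal atext character.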

On Tuesday, 14.03.2006, at 21:20 +0000, Jim Cheetham wrote:
> That will catch many addresses, but it will exclude other perfectly
> valid ones.
>
> [...]
>
> Hopefully there is some library that parses addresses into canonical
> form that can be used, rather than a simple regexp?

@Jim
Is there such a library for Ruby? Can you provide a link?

If a simple regexp is not enough for you, you can find a complex one here:

http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

But I think this is a bit disproportionate.

@Joe
The regexp at the top matches most email addresses found in the wild, with nearly no loss. If you expect some of the special characters in the local part of your email addresses, just add them to the regexp. It should fit your needs.

--
Norman Timmler
http://blog.inlet-media.de

On Wed, Mar 15, 2006 at 09:25:02AM +0100, Norman Timmler wrote:
> On Tuesday, 14.03.2006, at 21:20 +0000, Jim Cheetham wrote:
> > Hopefully there is some library that parses addresses into canonical
> > form that can be used, rather than a simple regexp?
>
> @Jim
> Is there such a library for ruby? Can you provide a link?

I have no idea. I would expect there is one, and I hope someone on the list can provide a reference :-)

Perhaps RhizMail.valid_address? would do it. I'm surprised no test is to be found in ActionMailer, but I can't see one in the API docs.

> The regexp at the top matches most email addresses found in the wild
> with nearly no loss. If you expect some of the special characters in the
> local part of your email addresses, just add them to the regexp. It
> should fit your needs.

If you're not going to check the full validity of email addresses, you should document it, accept it, and write some tests that clearly show that failing to validate email addresses to the spec is *expected* behaviour of your app.

There's nothing worse for a user than seeing their perfectly valid email address rejected by a website when there's nothing wrong with it.

-jim
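
One way such "expected behaviour" tests might look, as a minimal sketch: it reuses the simple pattern from earlier in the thread, anchored with \A and \z so it validates a whole string (the anchoring, the names SIMPLE_ADDRESS, simple_valid? and EXPECTATIONS, and the sample addresses are all assumptions for illustration). The entries marked as known gaps are addresses that RFC 2822 allows but this check rejects on purpose.

```ruby
# The simple pattern from this thread, anchored so the whole string must match.
SIMPLE_ADDRESS = /\A[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+\z/i

def simple_valid?(address)
  SIMPLE_ADDRESS.match?(address)
end

# address => whether the simple check is expected to accept it
EXPECTATIONS = {
  "joe_1@joe_black.com"   => true,
  "joe+lists@example.com" => false, # known gap: "+" is valid atext but not in the pattern
  "o'brien@example.com"   => false, # known gap: "'" is valid atext but not in the pattern
  "joe@mail.example.com"  => false, # known gap: the pattern allows only one dot in the domain
  "not an address"        => false
}

# Fail loudly if the check ever behaves differently from what is documented.
EXPECTATIONS.each do |address, expected|
  actual = simple_valid?(address)
  raise "#{address}: expected #{expected}, got #{actual}" unless actual == expected
end
```

Keeping the known gaps in the table means a later change to the pattern has to update the documented behaviour at the same time.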

I remember seeing a PHP script a while back that would actually initiate an SMTP connection to the host to verify that the address was correct. I thought that was a pretty cool trick: it verifies not only that the address is syntactically correct, but also that it is a valid email address. I'll have to see if I can dig it up.

Brandon

On 3/15/06, Jim Cheetham <jim@gonzul.net> wrote:
> [...]
> If you're not going to check the full validity of email addresses, you
> should document it, accept it, and write some tests that clearly show
> that failing to validate email addresses to the spec is *expected*
> behaviour of your app.
>
> There's nothing worse for a user than seeing their perfectly valid email
> address rejected by a website when there's nothing wrong with it.
>
> -jim
> _______________________________________________
> Rails mailing list
> Rails@lists.rubyonrails.org
> http://lists.rubyonrails.org/mailman/listinfo/rails

On Wed, Mar 15, 2006 at 04:35:50PM -0500, Brandon Keepers wrote:
> I remember seeing a PHP script a while back that would actually
> initiate a SMTP connection to the host to verify if the address was
> correct. I thought that was a pretty cool trick to actually verify

It's not possible to verify that an email address "actually exists". There are lots of reasons for this, all to do with SMTP server delivery behaviour, DNS failure and so on.

In any case, the AUP of many hosting services requires email communications with customers to be double-opt-in, because if you're not allowing the user to *confirm* that they want to receive email from your app, it's spam, and the ISP might get blacklisted. Plus, I believe there are laws governing this sort of thing in many jurisdictions.

So, when a user enters an email address into your app that you intend to use for sending messages later, you *should*:

* Send them a message that they need to reply to.
* Decide how hard you will try to deliver if there are problems. (Many people give up on the first failure, which is reasonable.)
* Wait for the reply, and change their status to 'verified'.

This is supposed to help you verify that the user really wants mail at that address, which can eliminate the problem of someone using another person's email address, either by accident or maliciously.

If the only thing you intend to use the email address for is something like lost password announcements, then don't bother checking too hard. Make sure that you give users some other mechanism for recovering account access, like custom answers to questions, or direct contact with the site administrators.

-jim
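
The confirm-before-use steps described above could be modelled roughly like this. It is only an in-memory sketch: the Subscription class, its method names and the :pending / :verified states are hypothetical, and actually sending the confirmation message (and deciding how often to retry delivery) is left out.

```ruby
require "securerandom"

# Tracks one address through a confirm-before-use flow.
class Subscription
  attr_reader :email, :token, :status

  def initialize(email)
    @email  = email
    @token  = SecureRandom.hex(16) # random value to include in the confirmation message
    @status = :pending             # no mail should be sent to a pending address
  end

  # Called when the user's reply comes back; only the right token verifies.
  def confirm(candidate)
    @status = :verified if candidate == @token
    verified?
  end

  def verified?
    @status == :verified
  end
end

sub = Subscription.new("joe_1@joe_black.com")
sub.confirm("wrong-token") # stays :pending
sub.confirm(sub.token)     # becomes :verified
```

In a real app the token and status would presumably live in the user's database record, and only addresses marked as verified would receive further mail.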

http://www.regexlib.com/ is an excellent resource for regular expressions.

Example: http://regexlib.com/Search.aspx?k=rfc%202822

Chris

On 3/15/06, Norman Timmler <lists@inlet-media.de> wrote:
> [...]
> If a simple regexp is not enough for you, you can find a complex one here:
>
> http://ex-parrot.com/~pdw/Mail-RFC822-Address.html
>
> But I think this is a bit disproportionate.
> [...]

Correct me if I'm wrong, but it is possible to check the domain of an email address with an MX record lookup. If the domain is valid, that gets you a lot closer to establishing whether or not the address is. However, I'm not sure how this is implemented in Ruby; it's possible on Unix-based boxes using PHP.

Cheers
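
For what it's worth, Ruby's standard library ships a DNS resolver (resolv), so an MX lookup could be sketched as below. The mx_hosts helper and its resolver: keyword are made up for illustration; the keyword only exists so the lookup can be stubbed in tests. As noted earlier in the thread, finding an MX record says nothing about whether the mailbox itself exists, and a domain without MX records may still accept mail through its address record.

```ruby
require "resolv"

# Returns the mail exchanger host names for the domain part of +address+,
# ordered by MX preference (lowest number first). An empty array means
# no MX records were found for that domain.
def mx_hosts(address, resolver: Resolv::DNS.new)
  domain  = address.split("@").last
  records = resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
  records.sort_by(&:preference).map { |mx| mx.exchange.to_s }
end
```

A call such as mx_hosts("joe_1@joe_black.com") performs a live DNS query, so its result depends on the network and on the domain's current records.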