Folks:

I've posted a really basic e-mail address verifier to

  http://wiki.rubygarden.org/Ruby/page/show/VerifyEmailAddress

I'd appreciate folks who understand DNS and SMTP having a look at it to see if it looks reasonable. You could comment here or, possibly more usefully, comment on the wiki page itself.

Thanks

Dave
Hi Dave,

> I've posted a really basic e-mail address verifier to
>
> http://wiki.rubygarden.org/Ruby/page/show/VerifyEmailAddress
>
> I'd appreciate folks who understand DNS and SMTP having a look at
> it to see if it looks reasonable. You could comment here or,
> possibly more usefully, comment on the wiki page itself.

I think this may fail with email addresses that have quoted local parts containing "@". Here's a more restrictive regexp that handles quoted local parts:

  http://tfletcher.com/lib/rfc822.rb

My only problem with this regexp is that it tests the address against the RFC 822 spec, which is looser than what is allowed in real life. For example, it lets a host with any TLD match, but ICANN issues only a limited number of TLDs.

Also, the part of the code that checks the MX and A records can probably be shortened to something like:

  mx_hosts = dns.getresources(domain, Resolv::DNS::Resource::IN::MX) rescue []
  mx_hosts.sort_by { |mx| mx.preference }.map { |mx| mx.exchange }.push(domain).each do |host|
    a_records = dns.getresources(host.to_s, Resolv::DNS::Resource::IN::A) rescue []
    return false if check_hosts(a_records)
  end

--
Thanks,

Dan
__________________________________________________________________
Dan Kubb
Autopilot Marketing Inc.
Email: dan.kubb@autopilotmarketing.com
Phone: 1 (604) 820-0212
Web: http://autopilotmarketing.com/
vCard: http://autopilotmarketing.com/~dan.kubb/vcard
__________________________________________________________________
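As a rough illustration of the host ordering in Dan's snippet, here is a minimal Ruby sketch. The helper names `candidate_hosts` and `mx_records_for` are hypothetical and do not come from the wiki page; the broad `rescue` simply mirrors the snippet's `rescue []`.

```ruby
require 'resolv'

# Order the hosts to try: MX exchanges by ascending preference,
# followed by the bare domain (as in Dan's snippet).
# `mx_records` only needs objects responding to #preference and #exchange.
def candidate_hosts(mx_records, domain)
  mx_records.sort_by { |mx| mx.preference }.map { |mx| mx.exchange.to_s } + [domain]
end

# Fetch MX records, treating any lookup error as "no records".
# (Hypothetical helper; mirrors the `rescue []` in the snippet.)
def mx_records_for(domain, dns = Resolv::DNS.new)
  dns.getresources(domain, Resolv::DNS::Resource::IN::MX)
rescue StandardError
  []
end
```

Like the snippet, this always appends the bare domain. Under the SMTP specification (RFC 5321) the domain's own address record is only the implicit fallback when no MX records exist, so a stricter variant would append it only when `mx_records` is empty.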
>>>>> "Dave" == Dave Thomas <dave@pragprog.com> writes:

> I'd appreciate folks who understand DNS and SMTP having a look at it
> to see if it looks reasonable.

It has at least one fairly serious flaw, as far as I can see. A proper MTA will queue messages to a domain that it can't get a DNS response for, while your code will reject it. This situation is reasonably common, since it'll occur every time all the DNS servers for a domain become non-responsive.

Also, just seeing if something answers on port 25 seems a bit dodgy to me. I know of plenty of places that have SSH servers on that port, for example. There's also a distinct possibility that while there is a proper SMTP server there, all it's going to tell you is "Bugger off!", since it doesn't like where you connected from, or something. Since you're already connected, running through a HELO/MAIL FROM/RCPT TO/QUIT sequence to see if it'd really accept mail from you would increase the confidence in the answer considerably. In case of failure, you'd also know if it was permanent (5xx reply) or temporary (4xx reply).

To match the way things work, your function really needs to return one of three possible results: "can send to that domain", "can't send to that domain at all" and "can't send to that domain right now".

--
Calle Dybedahl <calle@cyberpomo.com>
http://www.livejournal.com/users/cdybedahl/
"Women. They don't even make sense when you are one." -- babycola
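The three outcomes Calle describes map fairly directly onto SMTP reply classes. A minimal Ruby sketch follows; the function name and the symbols are hypothetical, and treating anything unexpected as "not right now" is an assumption, not something stated in the thread.

```ruby
# Map an SMTP reply code to one of three outcomes:
#   :deliverable   - "can send to that domain"            (2xx)
#   :undeliverable - "can't send to that domain at all"   (5xx, permanent)
#   :try_later     - "can't send to that domain right now" (4xx, temporary)
def classify_reply(code)
  case code.to_i
  when 200..299 then :deliverable
  when 400..499 then :try_later
  when 500..599 then :undeliverable
  else :try_later # assumption: unknown or unparseable replies are not treated as final
  end
end
```

By the same reasoning, a DNS lookup that gets no response at all would fall under `:try_later` rather than `:undeliverable`, since a real MTA would queue the message in that case.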
On May 20, 2006, at 2:36 AM, Calle Dybedahl wrote:

> It has at least one fairly serious flaw, as far as I can see. A proper
> MTA will queue messages to a domain that it can't get a DNS response
> for, while your code will reject it. This situation is reasonably
> common, since it'll occur every time all the DNS servers for a domain
> become non-responsive.
>
> Also, just seeing if something answers on port 25 seems a bit dodgy to
> me. I know of plenty of places that have SSH servers on that port, for
> example. There's also a distinct possibility that while there is a
> proper SMTP server there, all it's going to tell you is "Bugger off!",
> since it doesn't like where you connected from, or something. Since
> you're already connected, running through a HELO/MAIL FROM/RCPT
> TO/QUIT sequence to see if it'd really accept mail from you would
> increase the confidence in the answer considerably. In case of
> failure, you'd also know if it was permanent (5xx reply) or temporary
> (4xx reply).
>
> To match the way things work, your function really needs to return one
> of three possible results: "can send to that domain", "can't send to
> that domain at all" and "can't send to that domain right now"

I should probably explain the context in which it's being used.

About one in 30 people signing up for a PDF from us mistype their email addresses, and about 70% of those mistype the domain. So I figured that a quick sanity check before I accepted the form might be in order.

In this case, I'm thinking I'm OK to pass any error on to the end user: I'm not trying to be bulletproof as much as provide them with a sanity check. Should they happen to have SSHD running on 25, then at least I know there's a domain there.

But... do people think I should move on and do the full RCPT TO sequence? Anyone happen to have the code that does it reliably?

Dave
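For the full RCPT TO sequence Dave asks about, a minimal Ruby sketch over a raw socket might look like the following. All function names, the HELO name and the sender address are placeholders chosen for illustration; this is not the wiki page's code. It only returns a raw reply code and makes no claim to reliability: a server may accept every recipient, or refuse the probe outright.

```ruby
require 'socket'

# Read one SMTP reply, which may span several lines, and return its code.
# Continuation lines look like "250-text"; the last line is "250 text".
def read_reply(io)
  loop do
    line = io.gets
    return nil if line.nil? # peer closed the connection
    return line[0, 3].to_i unless line[3] == '-'
  end
end

# Run HELO / MAIL FROM / RCPT TO / QUIT on an open connection.
# Returns the code of the first non-2xx reply, or the RCPT TO reply code.
def probe_recipient(io, helo_name, sender, recipient)
  code = read_reply(io) # server greeting
  ["HELO #{helo_name}", "MAIL FROM:<#{sender}>", "RCPT TO:<#{recipient}>"].each do |cmd|
    break unless code && (200..299).cover?(code)
    io.write("#{cmd}\r\n")
    code = read_reply(io)
  end
  begin
    io.write("QUIT\r\n")
  rescue IOError, SystemCallError
    # peer already gone; nothing more to do
  end
  code
end

# Hypothetical wrapper: connect to port 25 with a connect timeout.
# Read timeouts are not handled in this sketch.
def probe_host(host, helo_name, sender, recipient, timeout = 5)
  Socket.tcp(host, 25, connect_timeout: timeout) do |sock|
    probe_recipient(sock, helo_name, sender, recipient)
  end
end
```

A call such as `probe_host(mx_host, 'verifier.example.com', 'postmaster@example.com', address)` would then yield a code to interpret as permanent (5xx) or temporary (4xx), as Calle suggests.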
>>>>> "Dave" == Dave Thomas <dave@pragprog.com> writes:

> I should probably explain the context in which it's being used.

> About one in 30 people signing up for a PDF from us mistype their
> email addresses, and about 70% of those mistype the domain. So I
> figured that quick sanity check before I accepted the form might be
> in order.

OK, that makes sense, and for that use the code looks all right (except possibly that DNS server timeouts might be too long).

The problem that the pessimist in me sees is that if you distribute the code, people will use it without reading the documentation and accept its answers as gospel truth. Experience from the Perl world tells me that it would probably be a better thing in the long run to have easily available, easy-to-use and as correct as humanly possible mail address checking code out there.

Which also means that I should probably help write it...

--
Calle Dybedahl <calle@cyberpomo.com>
http://www.livejournal.com/users/cdybedahl/
"Facts are for people with weak opinions." -- Lars Willf?r, I]M
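On the DNS timeout point, Ruby's standard `resolv` library lets a resolver be given shorter timeouts, which may suit an interactive form check. A minimal sketch, in which the helper name `quick_resolver` and the chosen values are assumptions:

```ruby
require 'resolv'

# Build a resolver that gives up sooner than the library defaults.
# `timeouts` may be one number or an array of seconds, one per attempt.
def quick_resolver(timeouts = [1, 2])
  dns = Resolv::DNS.new
  dns.timeouts = timeouts
  dns
end
```

The returned object can be used wherever the verifier currently creates its own `Resolv::DNS` instance.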