Hi,

I ran into a problem wherein my mail client (RoundCube) would not display a message from a Dovecot IMAP server (claiming that the message had no content). The raw source of the message looked fine, but the body structure returned by Dovecot only had the first text/plain part and not the alternative text/html part.

The message looks like:

... headers removed ...
X-Mailer: Lotus Notes Release 6.5.1 January 21, 2004
Message-ID: <...>
From: user at host.domain
Date: Mon, 20 Oct 2008 14:15:55 -0600
Content-Type: multipart/alternative; boundary="=_alternative
 006F3A73872574E8_="

This is a multipart message in MIME format.
--=_alternative 006F3A73872574E8_=
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

blah blah blah
--=_alternative 006F3A73872574E8_=
Content-Transfer-Encoding: 7bit
Content-Type: text/html; charset=us-ascii

<br><font size=2 face="sans-serif">blah blah blah in HTML</font>
--=_alternative 006F3A73872574E8_=--

I did a little bit of tracing through the parsing code (message-header-parser.c:message_parse_header_next()) and it appeared that the boundary in the Content-Type header was not parsed correctly, evidently because the header line was folded in the middle of the boundary string. RFC 822 appears to allow folding in a quoted string like this (§3.3 "quoted-string"), so I'm curious whether the parsing is working correctly.

Thanks for your help!

Here is my Dovecot information:

version: 1.1.4

"dovecot -n" output:

# 1.1.4: /usr/local/etc/dovecot.conf
Warning: fd limit 256 is lower than what Dovecot can use under full load (more than 384). Either grow the limit or change login_max_processes_count and max_mail_processes settings
base_dir: /var/dovecot/
info_log_path: /var/log/dovecot.log
listen: *, [::]
ssl_cert_file: /System/Library/OpenSSL/certs/imapd.pem
ssl_key_file: /System/Library/OpenSSL/certs/privkey.out
login_dir: /var/dovecot/login
login_executable: /usr/local/libexec/dovecot/imap-login
max_mail_processes: 256
mail_location: maildir:%h/Maildir
namespace:
  type: private
  separator: /
  inbox: yes
  list: yes
  subscriptions: yes
namespace:
  type: shared
  separator: /
  prefix: Shared/
  location: maildir:/Users/Shared/Maildir
  list: yes
  subscriptions: yes
auth default:
  passdb:
    driver: pam
    args: imap
  userdb:
    driver: passwd

--
Eric Stadtherr
estadtherr at gmail.com
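
For anyone following along, the reported symptom can be reproduced outside Dovecot with a small Python sketch. This is not Dovecot's parser: `boundary_param` and `count_delimiters` are hypothetical helpers, and the body is a made-up stand-in built from the boundary string in the report. It only illustrates that if the line break of the fold is left inside the quoted boundary value, no delimiter line in the body can match it.

```python
import re


def boundary_param(content_type: str) -> str:
    """Return the quoted boundary parameter exactly as written, or '' if absent."""
    match = re.search(r'boundary="([^"]*)"', content_type, re.IGNORECASE)
    return match.group(1) if match else ""


def count_delimiters(body: str, boundary: str) -> int:
    """Count lines equal to '--' + boundary (part openers, not the closing line)."""
    delimiter = "--" + boundary
    return sum(1 for line in body.splitlines() if line.rstrip() == delimiter)


# Assumption: a simplified body built from the boundary quoted in the report.
BOUNDARY = "=_alternative 006F3A73872574E8_="
BODY = "\r\n".join([
    "This is a multipart message in MIME format.",
    "--" + BOUNDARY,
    "Content-Type: text/plain; charset=us-ascii",
    "",
    "blah blah blah",
    "--" + BOUNDARY,
    "Content-Type: text/html; charset=us-ascii",
    "",
    "blah blah blah in HTML",
    "--" + BOUNDARY + "--",
])

# The header value as it arrives on the wire (folded) and after joining the lines.
folded = 'multipart/alternative; boundary="=_alternative\r\n 006F3A73872574E8_="'
joined = 'multipart/alternative; boundary="=_alternative 006F3A73872574E8_="'

parts_with_fold = count_delimiters(BODY, boundary_param(folded))
parts_unfolded = count_delimiters(BODY, boundary_param(joined))
```

With the fold still embedded in the boundary, the sketch finds no part delimiters at all; with the joined value it finds both. How Dovecot 1.1.4 actually mis-parsed the value may differ from this simplified picture.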
On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
> Content-Type: multipart/alternative; boundary="=_alternative
> 006F3A73872574E8_="

Is there one space, two spaces or a TAB at the beginning of the second line?

> I did a little bit of tracing through the parsing code
> (message-header-parser.c:message_parse_header_next()) and it appeared
> that the boundary in the Content-Type header was not parsed correctly,
> evidently because the header line was folded in the middle of the
> boundary string. RFC 822 appears to allow folding in a quoted string
> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
> working correctly.

Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3

But I'm not sure if I should convert the following TAB to a space. UW-IMAP seems to do that, but RFC just says that the CRLF should be dropped.
On Thu, 23 Oct 2008 19:06:19 +0300, Timo Sirainen <tss at iki.fi> wrote:
> On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
>> Content-Type: multipart/alternative; boundary="=_alternative
>> 006F3A73872574E8_="
>
> Is there one space, two spaces or a TAB at the beginning of the second
> line?
>

There is one space at the beginning of the continuation line. The parsed full_value basically looks like:

[multipart/alternative; boundary="=_alternative\n 006F3A73872574E8_="]

>> I did a little bit of tracing through the parsing code
>> (message-header-parser.c:message_parse_header_next()) and it appeared
>> that the boundary in the Content-Type header was not parsed correctly,
>> evidently because the header line was folded in the middle of the
>> boundary string. RFC 822 appears to allow folding in a quoted string
>> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
>> working correctly.
>
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
>
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

I always prefer strict adherence to the RFC, which says:

    The process of moving from this folded multiple-line
    representation of a header field to its single line
    representation is called "unfolding". Unfolding is
    accomplished by regarding CRLF immediately followed by a
    LWSP-char as equivalent to the LWSP-char.

So, what you did looks good!

--
Eric Stadtherr
estadtherr at gmail.com
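
The unfolding rule quoted from the RFC can be sketched in a few lines of Python. This is only an illustration of the rule, not the code from the Dovecot changeset; `unfold` is a hypothetical name, and accepting a bare LF as well as CRLF is an assumption made to match the "\n" shown in the parsed full_value.

```python
import re


def unfold(header_value: str) -> str:
    """Drop each line break that is followed by a space or TAB, keeping that whitespace."""
    # The lookahead leaves the LWSP-char in place, so "CRLF SP" becomes just "SP".
    return re.sub(r"\r?\n(?=[ \t])", "", header_value)


folded = 'multipart/alternative; boundary="=_alternative\n 006F3A73872574E8_="'
unfolded = unfold(folded)
print(unfolded)
```

Applied to the value above, this yields a single-line header whose boundary is "=_alternative 006F3A73872574E8_=", with the one space from the continuation line preserved. A line break that is not followed by whitespace is left alone, since it ends the header rather than folding it.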
Timo Sirainen wrote:
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

As pointed out in https://bugzilla.mozilla.org/show_bug.cgi?id=240924#c7, this could lead to strange behaviour. So I'd vote for replacing the folding tab with a space.
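
To make the two options concrete, here is a rough Python comparison. Both function names are hypothetical, and the second one is only a guess at the "convert the TAB to a space" behaviour described above; it is not taken from UW-IMAP or Dovecot source.

```python
import re


def unfold_strict(value: str) -> str:
    """RFC 822 style: remove the line break, keep the following space or TAB as is."""
    return re.sub(r"\r?\n(?=[ \t])", "", value)


def unfold_tab_to_space(value: str) -> str:
    """Alternative: replace the line break plus the one following space/TAB with a space."""
    return re.sub(r"\r?\n[ \t]", " ", value)


# Assumption: the same boundary as in the report, but folded with a TAB.
folded_with_tab = 'boundary="=_alternative\r\n\t006F3A73872574E8_="'

strict = unfold_strict(folded_with_tab)
normalized = unfold_tab_to_space(folded_with_tab)
```

For a fold that uses a single space, both functions return the same string. They differ only when the continuation line starts with a TAB: the strict version leaves a TAB inside the boundary value, while the other produces a space.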
On Thu, 23 Oct 2008 19:06:19 +0300, Timo Sirainen <tss at iki.fi> wrote:
> On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
>> Content-Type: multipart/alternative; boundary="=_alternative
>> 006F3A73872574E8_="
>
> Is there one space, two spaces or a TAB at the beginning of the second
> line?
>
>> I did a little bit of tracing through the parsing code
>> (message-header-parser.c:message_parse_header_next()) and it appeared
>> that the boundary in the Content-Type header was not parsed correctly,
>> evidently because the header line was folded in the middle of the
>> boundary string. RFC 822 appears to allow folding in a quoted string
>> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
>> working correctly.
>
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
>
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

I grabbed a snapshot of the CM baseline with that fix, but that message still doesn't display correctly. I ran it through the message_parser test case, and your fix looks like it resulted in correct header values and correct body parsing, but the BODYSTRUCTURE response from the server still contains only the first part (plus the boundary name).

Any suggestions where to look? I looked through the code that handles the BODYSTRUCTURE fetch command, and it looked like it eventually filtered down to the same parser functions used by the test case, so I'm not sure where else the problem could be introduced...

--
Eric Stadtherr
estadtherr at gmail.com
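
When comparing server responses before and after a fix like this, it can help to count the child parts in the BODYSTRUCTURE mechanically instead of by eye. The sketch below is a rough Python helper for that. `count_top_level_parts` is a hypothetical name, both sample strings are invented for illustration (they are not actual Dovecot output from this thread), and IMAP literals ({n} syntax) are not handled.

```python
def count_top_level_parts(bodystructure: str) -> int:
    """Count the child part lists at the start of a multipart BODYSTRUCTURE value."""
    s = bodystructure.strip()
    if not s.startswith("(("):
        return 1  # a single-part body starts with its quoted type, not a nested list
    depth = 0
    parts = 0
    in_quotes = False
    i = 1  # skip the outer "("
    while i < len(s):
        ch = s[i]
        if in_quotes:
            if ch == "\\":
                i += 1  # skip the escaped character inside a quoted string
            elif ch == '"':
                in_quotes = False
        elif ch == '"':
            if depth == 0:
                break  # reached the multipart subtype, so the child list is over
            in_quotes = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break  # end of the outer list
            depth -= 1
            if depth == 0:
                parts += 1  # one complete child part closed
        i += 1
    return parts


# Invented examples: a two-part alternative and one with only the first part.
expected = ('(("text" "plain" ("charset" "us-ascii") NIL NIL "7bit" 16 1)'
            '("text" "html" ("charset" "us-ascii") NIL NIL "7bit" 64 1)'
            ' "alternative" ("boundary" "=_alternative 006F3A73872574E8_="))')
truncated = ('(("text" "plain" ("charset" "us-ascii") NIL NIL "7bit" 16 1)'
             ' "alternative" ("boundary" "=_alternative 006F3A73872574E8_="))')

expected_parts = count_top_level_parts(expected)
truncated_parts = count_top_level_parts(truncated)
```

For the message in this thread, a correct response should report two child parts (text/plain and text/html); a response like the one described above would report one.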