Hi,
I ran into a problem wherein my mail client (RoundCube) would not
display a message from a Dovecot IMAP server (claiming that the message
had no content). The raw source of the message looked fine, but the body
structure returned by Dovecot only had the first text/plain part and not
the alternative text/html part. The message looks like:
... headers removed ...
X-Mailer: Lotus Notes Release 6.5.1 January 21, 2004
Message-ID: <...>
From: user at host.domain
Date: Mon, 20 Oct 2008 14:15:55 -0600
Content-Type: multipart/alternative; boundary="=_alternative
 006F3A73872574E8_="

This is a multipart message in MIME format.
--=_alternative 006F3A73872574E8_=
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
 charset=us-ascii

blah blah blah
--=_alternative 006F3A73872574E8_=
Content-Transfer-Encoding: 7bit
Content-Type: text/html;
 charset=us-ascii

<br><font size=2 face="sans-serif">blah blah blah in
HTML</font>
--=_alternative 006F3A73872574E8_=--

I did a little bit of tracing through the parsing code
(message-header-parser.c:message_parse_header_next()) and it appeared
that the boundary in the Content-Type header was not parsed correctly,
evidently because the header line was folded in the middle of the
boundary string. RFC 822 appears to allow folding in a quoted string
like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
working correctly.
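
To make the expectation concrete, here is a minimal Python sketch of how I
would expect a folded boundary to be read: unfold the header value first,
then pick out the boundary parameter. This is not Dovecot's code; the
function name extract_boundary and the regular expressions are my own
assumptions, and the sketch ignores quoted-pair escapes and RFC 2231
parameter encoding.

```python
import re
from typing import Optional


def extract_boundary(header_value: str) -> Optional[str]:
    """Return the boundary parameter of a Content-Type value, or None."""
    # Unfold: drop each line break that is followed by a space or TAB,
    # keeping the whitespace character itself.
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", header_value)
    # Accept either a quoted-string or a bare token as the parameter value.
    match = re.search(
        r'boundary\s*=\s*(?:"([^"]*)"|([^\s;]+))', unfolded, re.IGNORECASE
    )
    if match is None:
        return None
    return match.group(1) if match.group(1) is not None else match.group(2)


folded = 'multipart/alternative; boundary="=_alternative\r\n 006F3A73872574E8_="'
print(extract_boundary(folded))  # prints: =_alternative 006F3A73872574E8_=
```

With the fold removed, the value matches the delimiter lines in the body of
the message above.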
Thanks for your help!
Here is my Dovecot information:
version: 1.1.4
"dovecot -n" output:
# 1.1.4: /usr/local/etc/dovecot.conf
Warning: fd limit 256 is lower than what Dovecot can use under full load
(more than 384). Either grow the limit or change
login_max_processes_count and max_mail_processes settings
base_dir: /var/dovecot/
info_log_path: /var/log/dovecot.log
listen: *, [::]
ssl_cert_file: /System/Library/OpenSSL/certs/imapd.pem
ssl_key_file: /System/Library/OpenSSL/certs/privkey.out
login_dir: /var/dovecot/login
login_executable: /usr/local/libexec/dovecot/imap-login
max_mail_processes: 256
mail_location: maildir:%h/Maildir
namespace:
  type: private
  separator: /
  inbox: yes
  list: yes
  subscriptions: yes
namespace:
  type: shared
  separator: /
  prefix: Shared/
  location: maildir:/Users/Shared/Maildir
  list: yes
  subscriptions: yes
auth default:
  passdb:
    driver: pam
    args: imap
  userdb:
    driver: passwd
--
*Eric Stadtherr*
estadtherr at gmail.com

On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
> Content-Type: multipart/alternative; boundary="=_alternative
> 006F3A73872574E8_="

Is there one space, two spaces or a TAB at the beginning of the second
line?

> I did a little bit of tracing through the parsing code
> (message-header-parser.c:message_parse_header_next()) and it appeared
> that the boundary in the Content-Type header was not parsed correctly,
> evidently because the header line was folded in the middle of the
> boundary string. RFC 822 appears to allow folding in a quoted string
> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
> working correctly.

Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3

But I'm not sure if I should convert the following TAB to a space.
UW-IMAP seems to do that, but RFC just says that the CRLF should be
dropped.

On Thu, 23 Oct 2008 19:06:19 +0300, Timo Sirainen <tss at iki.fi> wrote:
> On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
>> Content-Type: multipart/alternative; boundary="=_alternative
>> 006F3A73872574E8_="
>
> Is there one space, two spaces or a TAB at the beginning of the second
> line?

There is one space at the beginning of the continuation line. The parsed
full_value basically looks like:

[multipart/alternative; boundary="=_alternative\n 006F3A73872574E8_="]

>> I did a little bit of tracing through the parsing code
>> (message-header-parser.c:message_parse_header_next()) and it appeared
>> that the boundary in the Content-Type header was not parsed correctly,
>> evidently because the header line was folded in the middle of the
>> boundary string. RFC 822 appears to allow folding in a quoted string
>> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
>> working correctly.
>
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
>
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

I always prefer strict adherence to the RFC, which says:

    The process of moving from this folded multiple-line
    representation of a header field to its single line
    representation is called "unfolding". Unfolding is
    accomplished by regarding CRLF immediately followed by a
    LWSP-char as equivalent to the LWSP-char.

So, what you did looks good!

--
Eric Stadtherr
estadtherr at gmail.com
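
As a rough illustration of the strict reading of the RFC 822 rule quoted
above, the following Python sketch removes only the line break of each fold
and leaves the whitespace character that follows it untouched. It is not
the code from the Dovecot changeset; the name unfold_strict is hypothetical,
and accepting a bare LF as well as CRLF is an assumption made so that the
parsed full_value shown above can be fed in directly.

```python
import re


def unfold_strict(raw: str) -> str:
    """Unfold a header value by removing only the line break of each fold."""
    # A fold is CRLF (or a bare LF) followed by a space or TAB. The
    # lookahead leaves that whitespace character in place.
    return re.sub(r"\r?\n(?=[ \t])", "", raw)


full_value = 'multipart/alternative; boundary="=_alternative\n 006F3A73872574E8_="'
print(unfold_strict(full_value))
```

Under this reading a continuation line that starts with a TAB keeps the TAB
in the unfolded value.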

Timo Sirainen wrote:
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

As pointed out in https://bugzilla.mozilla.org/show_bug.cgi?id=240924#c7,
this could lead to strange behaviour. So I'd vote for replacing the
folding tab with a space.
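
For comparison, a minimal Python sketch of the alternative proposed here:
the line break and the single whitespace character after it are replaced
by one space. This is only my illustration of the idea, not UW-IMAP's or
Dovecot's actual behaviour, and the name unfold_tab_to_space is
hypothetical.

```python
import re


def unfold_tab_to_space(raw: str) -> str:
    """Unfold a header value, turning a folding TAB into a single space."""
    # Replace the line break plus the one whitespace character after it
    # with a space; any further whitespace on the continuation line is kept.
    return re.sub(r"\r?\n[ \t]", " ", raw)


tab_folded = 'multipart/alternative; boundary="=_alternative\r\n\t006F3A73872574E8_="'
print(repr(unfold_tab_to_space(tab_folded)))
```

A header folded with a single space gives the same result either way, so
the two approaches differ only for TAB-folded lines.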

On Thu, 23 Oct 2008 19:06:19 +0300, Timo Sirainen <tss at iki.fi> wrote:
> On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
>> Content-Type: multipart/alternative; boundary="=_alternative
>> 006F3A73872574E8_="
>
> Is there one space, two spaces or a TAB at the beginning of the second
> line?
>
>> I did a little bit of tracing through the parsing code
>> (message-header-parser.c:message_parse_header_next()) and it appeared
>> that the boundary in the Content-Type header was not parsed correctly,
>> evidently because the header line was folded in the middle of the
>> boundary string. RFC 822 appears to allow folding in a quoted string
>> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
>> working correctly.
>
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
>
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

I grabbed a snapshot of the CM baseline with that fix, but that message
still doesn't display correctly. I ran it through the message_parser test
case, and your fix looks like it resulted in correct header values and
correct body parsing, but the BODYSTRUCTURE response from the server still
only contains the first part (plus the boundary name).

Any suggestions where to look? I looked through the code that handles the
BODYSTRUCTURE fetch command, and it looked like it eventually filtered down
to the same parser functions used by the test case, so I'm not sure where
else the problem could be introduced...

--
Eric Stadtherr
estadtherr at gmail.com