Hi,

I'm using Dovecot for my mail with pigeonhole's sieve extension,
from debian stable (2.3.4.1 (f79e8e7e4)).

One of the sieve stanzas that I use is for duplicate elimination:

if duplicate
{
  discard;
  stop;
}

This mostly works fine, but I seem to have trouble with some messages
that come from certain domains where duplicates are not eliminated.

From /var/log/mail.log, I see:

Feb 18 12:48:50 mail dovecot: lmtp(24320:linux at xxx.armlinux.org.uk): sieve: msgid=? <VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>: stored mail into mailbox 'INBOX'
Feb 18 12:49:42 mail dovecot: lmtp(24320:linux at xxx.armlinux.org.uk): sieve: msgid=<VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>: stored mail into mailbox 'INBOX'

The first was received direct from the recipient with a message-id
line formatted thusly:

Message-ID:
 <VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>

The second was received from the mailing list with a message-id line
formatted thusly:

Message-ID: <VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>

It would appear that the parsed message-id value that dovecot uses
includes white-space (including newline characters).

RFC5322 gives the message-id header format as:

   message-id      =   "Message-ID:" msg-id CRLF
   msg-id          =   [CFWS] "<" id-left "@" id-right ">" [CFWS]

It goes on to say:

   The message identifier (msg-id) itself MUST be a globally unique
   identifier for a message.

and:

   Semantically, the angle bracket characters are not part of the
   msg-id; the msg-id is what is contained between the two angle bracket
   characters.

However, it seems dovecot sieve is using the entire content of the
msg-id, including CFWS, as the message id used for detecting duplicate
messages. This seems wrong, and appears to lead to duplicates not
being detected, and thus seems like a bug.

Is there a workaround for this, and/or can it be changed?

-- 
Russell King
On 18.2.2020 18.12, Russell King wrote:
> Hi,
>
> I'm using Dovecot for my mail with pigeonhole's sieve extension,
> from debian stable (2.3.4.1 (f79e8e7e4)).
>
> One of the sieve stanzas that I use is for duplicate elimination:
>
> if duplicate
> {
>   discard;
>   stop;
> }
>
> This mostly works fine, but I seem to have trouble with some messages
> that come from certain domains where duplicates are not eliminated.
>
> From /var/log/mail.log, I see:
>
> Feb 18 12:48:50 mail dovecot: lmtp(24320:linux at xxx.armlinux.org.uk): sieve: msgid=? <VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>: stored mail into mailbox 'INBOX'
> Feb 18 12:49:42 mail dovecot: lmtp(24320:linux at xxx.armlinux.org.uk): sieve: msgid=<VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>: stored mail into mailbox 'INBOX'
>
> The first was received direct from the recipient with a message-id
> line formatted thusly:
>
> Message-ID:
>  <VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>
>
> The second was received from the mailing list with a message-id line
> formatted thusly:
>
> Message-ID: <VI1PR04MB513558BF77192255CBE12102B0110 at VI1PR04MB5135.eurprd04.prod.outlook.com>
>
> It would appear that the parsed message-id value that dovecot uses
> includes white-space (including newline characters).
>
> RFC5322 gives the message-id header format as:
>
>    message-id      =   "Message-ID:" msg-id CRLF
>    msg-id          =   [CFWS] "<" id-left "@" id-right ">" [CFWS]
>
> It goes on to say:
>
>    The message identifier (msg-id) itself MUST be a globally unique
>    identifier for a message.
>
> and:
>
>    Semantically, the angle bracket characters are not part of the
>    msg-id; the msg-id is what is contained between the two angle bracket
>    characters.
>
> However, it seems dovecot sieve is using the entire content of the
> msg-id, including CFWS, as the message id used for detecting duplicate
> messages. This seems wrong, and appears to lead to duplicates not
> being detected, and thus seems like a bug.
>
> Is there a workaround for this, and/or can it be changed?
>

Hi!

Thanks for taking the time to report this. It appears this is a bug.
We are tracking it as DOP-1697.

Aki
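
For illustration only, the normalization the report asks for can be
sketched in Python. This is not Dovecot's or Pigeonhole's code and says
nothing about how DOP-1697 is fixed; the function name
normalize_message_id is hypothetical. It assumes the " at " in the
archived headers stands for "@", and it does not handle RFC 5322
comments in CFWS. The idea is to compare only the text between the
angle brackets, so a folded header and an unfolded one give the same key.

```python
import re

# Matches the first "<...>" group that contains no whitespace or nested brackets.
_MSG_ID = re.compile(r"<([^<>\s]+)>")


def normalize_message_id(raw_value):
    """Return the text between the angle brackets, or None if there is none.

    Surrounding whitespace (including a folded line break) is ignored.
    Parenthesised comments are not handled in this simplified sketch.
    """
    match = _MSG_ID.search(raw_value)
    return match.group(1) if match else None


# Header values as they would appear after "Message-ID:", with the archive's
# " at " written back as "@" (an assumption about the original headers).
folded = "\r\n <VI1PR04MB513558BF77192255CBE12102B0110@VI1PR04MB5135.eurprd04.prod.outlook.com>"
plain = " <VI1PR04MB513558BF77192255CBE12102B0110@VI1PR04MB5135.eurprd04.prod.outlook.com>"

# Both forms reduce to the same identifier, so a duplicate check keyed on the
# normalized value would treat the two deliveries as the same message.
same_message = normalize_message_id(folded) == normalize_message_id(plain)
```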