In a quest to remove "duplicate" messages sent both to me and to lists I subscribe to, I came up with this, which I think should clean out my Archive folder, but I've been unable to get it to work for scanning all of my list-user email.

$ doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname' mailbox "Archive" |
    sort |
    awk 'cnt[$1]++{if (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' |
    grep -E "[0-9] +$" |
    awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}'

X-listname is a header that my list user applies to each message that comes from a list, and the output is just text to the screen that I can then run manually (I am not confident enough to delete the messages automatically). I include the X-listname header at the end so that I can exclude lines that don't end in a number, which means the copies sent directly to me are the ones expunged and the ones from the list are preserved.

So far so good. But there are issues.

First, even after expunging a message and running doveadm index -u kremels 'Archive', subsequent runs still show the same duplicate messages.

Second, what I really want to do is run this over ALL the mailboxes except Junk and Sent, but if that is possible I can't find the right syntax.

-- 
"He has all the virtues I dislike and none of the vices I admire." Winston Churchill
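
For anyone following the awk stage of the pipeline above, here is a minimal sketch of the same duplicate-detection idiom run on made-up rows instead of live doveadm output. The Message-IDs, GUIDs and UIDs are hypothetical, and `NF == 3` stands in for the original `grep -E "[0-9] +$"` because these sample rows carry no table padding.

```shell
#!/bin/sh
export LC_ALL=C   # keep sort order predictable

# Hypothetical rows in the column order used above:
# message-id, guid, uid, x-listname (empty for the directly delivered copy).
sample_rows() {
    cat <<'EOF'
<c@example.org> guid-c2 105 other-list
<a@example.org> guid-a1 101 some-list
<b@example.org> guid-b1 103 some-list
<a@example.org> guid-a2 102
<c@example.org> guid-c1 104
EOF
}

# Print every row whose first column (the Message-ID) occurs more than once.
# On the second sighting the remembered first row is printed as well.
dup_rows() {
    awk 'cnt[$1]++ { if (cnt[$1] == 2) print prev[$1]; print }
         { prev[$1] = $0 }'
}

# Keep only rows with an empty x-listname column, i.e. exactly three fields.
direct_copies() {
    awk 'NF == 3'
}

sample_rows | sort | dup_rows | direct_copies
```

The `sort` only groups the output; the awk arrays are keyed by Message-ID, so the duplicate test itself does not depend on input order.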
OK, perhaps I tried to cover too much, so let's just look at this.

If I run this command, I get no errors:

doveadm expunge -u kremels MAILBOX-GUID 1488800748.47633_1.mail.covisp.net UID 22908

But if I search again:

doveadm -f table fetch -u kremels 'hdr.message-id guid uid hdr.x-listname' mailbox 'Archive' |
    sort |
    awk 'cnt[$1]++{if (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' |
    grep -E "[0-9] +$" |
    awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}' |
    grep 22908
hdr.message-id guid uid hdr.x-listname
doveadm expunge -u kremels MAILBOX-GUID 1488800748.47633_1.mail.covisp.net UID 22908

The message is still listed.

Am I misunderstanding something about how expunge works or what it does? How do I remove the messages in such a way that they will not show up in subsequent searches? As far as I can tell, assuming the 1488800748.47633_1 is the first part of the file name in the maildir, the message is actually deleted:

$ find Maildir -name "1488800*"
Maildir/.Archive/cur/1488800350.46962_1.mail.covisp.net:2,S
Maildir/.Archive/cur/1488800633.47337_1.mail.covisp.net:2,S
Maildir/.Sent/cur/1488800118.M2833P43167.mail.covisp.net,S=1221,W=1251:2,Sad

-- 
Tragic heroes always moan when the gods take an interest in them, but it's the people the gods ignore who get the really tough deals. --Mort
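
The on-disk check above can be wrapped in a small helper. This is only a sketch under the same assumption the message makes, namely that the GUID is the leading part of the maildir file name. The function name `guid_on_disk` and the file names in the demo are hypothetical.

```shell
#!/bin/sh
# Succeed if some regular file under the given Maildir has a name that
# starts with the given GUID (assumption: GUID == maildir base file name).
guid_on_disk() {
    maildir=$1
    guid=$2
    [ -n "$(find "$maildir" -type f -name "${guid}*" | head -n 1)" ]
}

# Demo against a throwaway directory with made-up file names.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/.Archive/cur"
: > "$demo/.Archive/cur/1111111111.1_1.mail.example.org:2,S"

for g in 1111111111.1_1.mail.example.org 2222222222.2_1.mail.example.org; do
    if guid_on_disk "$demo" "$g"; then
        echo "$g present"
    else
        echo "$g missing"
    fi
done
```

A file that is missing on disk while doveadm still lists the message would suggest that the listing is coming from Dovecot's index rather than from the maildir itself.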
On Fri, 23 Feb 2018, @lbutlr wrote:

> $ doveadm -f table fetch -u kremels 'hdr.message-id guid uid
> hdr.x-listname' mailbox "Archive" | sort| awk 'cnt[$1]++{if
> (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' |grep -E "[0-9] +$"
> |awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}'

I was unaware of the syntax "hdr.{header}" -- all the reference materials I've seen only refer to "hdr", which returns the entire header block. This is handy to know because up to now I've been filtering "hdr" fetches through grep.

I've tried updating the Wiki, but it's immutable, so would someone update the documentation:

	https://wiki.dovecot.org/Tools/Doveadm/Fetch
	(and man page in distribution)

	hdr[.{x}]
		Header {x} of message.  If missing, the entire header is fetched.

> First, even after expunging a message and running doveadm index -u
> kremels 'Archive', subsequent runs still show the same duplicate
> messages.

I suspect client side caching. If you query IMAP directly, does it report the correct number of messages? (Using openssl s_client, or netcat or telnet, or whatever.)

	x1 LOGIN kremels yourpassword
	x2 SELECT INBOX
	... look for "* {count} EXISTS" ...
	x3 LOGOUT

If {count} is what you expected, then dovecot has the correct information and it's likely some client-side caching issue.

> Second, what I really want to do is run this over ALL the mailboxes,
> except for Junk and Sent but if that is possible I can't find the right
> syntax.

You mean to remove duplicates from any 2 mailboxes, or remove duplicates in mailboxes also found in Archive? If the latter, try

	doveadm -f table fetch -u kremels \
		hdr.message-id \
		mailbox Archive \
	| sort -b >list0

	doveadm -f table fetch -u kremels \
		'hdr.message-id guid uid' \
		NOT mailbox Archive \
		NOT mailbox Junk \
		NOT mailbox Sent \
	| sort -b >list1

The list of duplicate message-id, guid and uid will then be ...

	join -j1 list0 list1

You can process it via awk with one invocation of doveadm (2nd form without exclusion of Archive) but you'll need to know the guid of Archive beforehand.

Joseph Tam <jtam.home at gmail.com>
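
To make the list0/list1 join concrete, here is a minimal sketch on made-up data. The Message-IDs, GUIDs and UIDs are hypothetical, the files live in a temporary directory, and the join field is spelled out as `-1 1 -2 1`, which selects the first column of each file.

```shell
#!/bin/sh
export LC_ALL=C   # sort and join must agree on collation
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# list0: Message-IDs present in Archive (hypothetical values).
printf '%s\n' '<c@example.org>' '<a@example.org>' |
    sort -b > "$tmp/list0"

# list1: message-id, guid, uid from the other mailboxes (hypothetical values).
printf '%s\n' \
    '<c@example.org> guid-c9 7' \
    '<b@example.org> guid-b9 5' \
    '<a@example.org> guid-a9 3' |
    sort -b > "$tmp/list1"

# Rows of list1 whose Message-ID also appears in list0.
dups=$(join -1 1 -2 1 "$tmp/list0" "$tmp/list1")
printf '%s\n' "$dups"
```

Only the rows for `<a@example.org>` and `<c@example.org>` survive the join, since `<b@example.org>` has no counterpart in list0.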
On 2018-02-23 (16:47 MST), Joseph Tam <jtam.home at gmail.com> wrote:
>
> On Fri, 23 Feb 2018, @lbutlr wrote:
>
>> $ doveadm -f table fetch -u kremels 'hdr.message-id guid uid
>> hdr.x-listname' mailbox "Archive" | sort| awk 'cnt[$1]++{if
>> (cnt[$1]==2) print prev[$1]; print} {prev[$1]=$0}' |grep -E "[0-9] +$"
>> |awk '{print "doveadm expunge -u kremels MAILBOX-GUID "$2" UID "$3}'
>
> I was unaware of the syntax "hdr.{header}" -- all the reference materials
> I've seen only refers to "hdr" which returns the entire header block.

The error message from doveadm if you specify an invalid field is:

Available fetch fields: hdr.<name> body.<section> binary.<section> user mailbox mailbox-guid seq uid guid flags modseq hdr body body.snippet text text.utf8 size.physical size.virtual date.received date.sent date.saved date.received.unixtime date.sent.unixtime date.saved.unixtime imap.envelope imap.body imap.bodystructure pop3.uidl pop3.order refcount storageid

>> First, even after expunging a message and running doveadm index -u
>> kremels 'Archive', subsequent runs still show the same duplicate
>> messages.
>
> I suspect client side caching.

No, there is no client side involved. I am executing all of these commands on the mail server. I expunge the messages, I index (or even force-resync), and the next search shows the same messages even though they are not in the Maildir anymore.

> If {count} is what you expected, then dovecot has the correct information
> and it's likely some client-side caching issue.

I would have needed to check the count before doing this, and I did not.

>> Second, what I really want to do is run this over ALL the mailboxes,
>> except for Junk and Sent but if that is possible I can't find the right
>> syntax.
>
> You mean to remove duplicates from any 2 mailboxes, or remove duplicates
> in mailboxes also found in Archive?

I want to find any duplicates (based on msg ID) across all mailboxes, except Sent.

> doveadm -f table fetch -u kremels \
> 	'hdr.message-id guid uid' \
> 	NOT mailbox Archive \
> 	NOT mailbox Junk \
> 	NOT mailbox Sent \
> | sort -b >list1

Aha! Didn't know you could use NOT mailbox. That probably solves my issue on that score.

-- 
"It's unacceptable to think" - George W Bush 15/Sep/2006
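
For the "any duplicates across all mailboxes" case, one possible shape is a single awk pass over rows that also carry the mailbox name. This is a sketch on made-up rows, not a tested doveadm recipe: the `mailbox` column is taken from the fetch field list quoted above, while the values and the helper name `later_copies` are hypothetical.

```shell
#!/bin/sh
# Hypothetical rows: message-id, mailbox, uid. With a real account they might
# come from something like
#   doveadm -f table fetch -u USER 'hdr.message-id mailbox uid' NOT mailbox Sent
# (untested assumption; mailbox names containing spaces would break the
# whitespace-based field splitting used below).
sample_rows() {
    cat <<'EOF'
<a@example.org> INBOX 11
<a@example.org> Archive 201
<b@example.org> Archive 202
<c@example.org> Lists 31
<c@example.org> Archive 203
<c@example.org> INBOX 12
EOF
}

# Print every row whose Message-ID has already been seen, i.e. all copies
# after the first one encountered. Which copy counts as "first" depends
# purely on input order.
later_copies() {
    awk 'seen[$1]++'
}

sample_rows | later_copies
```

Because the first row seen for each Message-ID is the one left out of the output, ordering the input (for example, list copies first) decides which copy would be kept.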