On Fri, May 07, 2021 at 06:15:55PM +0200, Florian Lohoff wrote:> > Dovecot: 2.3.4 > > Hi, > i am debugging a problem where after a force-resync removed all cache > files (All files are 44 byte). Trying to open a folder takes literally minutes > (Huge mailbox - 220K folder, 45GByte compressed storage)As i am running out of options i upgraded to 2.3.14-5+debian9 though no success. Between Debian/Stretch 2.2.27 and 2.3.4 something broke horribly for huge mdboxes. Opening the mailbox takes (on an otherwise idle machine, 32GByte RAM, RAID1 spinning rust) root at pax:~# time doveadm mailbox list -u inbox at wasteland.rfc822.org >/var/tmp/list doveadm(inbox at wasteland.rfc822.org): Warning: Transaction log file /var/vmail/wasteland.rfc822.org/inbox/mdbox/dovecot.list.index.log was locked for 107 seconds (Mailbox was synchronized) real 1m47.508s user 1m46.856s sys 0m0.480s We see the ~220K folders in this mailbox: root at pax:~# wc -l /var/tmp/list 213922 /var/tmp/list Looking at an strace on this command we see: root at pax:~# grep stat /var/tmp/strace.doveadm.list | wc -l 428696 So every folder will at least get 2 stat calls - Okay - may be because of the mailbox list command - lets try deleting a folder. root at pax:~# strace -f -o /var/tmp/strace.doveadm.mailbox.delete doveadm mailbox delete -e -u inbox at wasteland.rfc822.org bernd.bottich doveadm(inbox at wasteland.rfc822.org): Warning: Transaction log file /var/vmail/wasteland.rfc822.org/inbox/mdbox/dovecot.list.index.log was locked for 115 seconds (Mailbox was synchronized) doveadm(inbox at wasteland.rfc822.org): Error: Can't delete mailbox bernd.bottich: Mailbox doesn't exist: bernd.bottich root at pax:~# grep stat /var/tmp/strace.doveadm.mailbox.delete | wc -l 428702 Okay - so we opened the mailbox, stattet all folders to tell me my mailbox is not existent. Took again around 90 seconds. And this happens every, single, time. root at pax:~# time doveadm mailbox delete -e -u inbox at wasteland.rfc822.org bernd.bottich doveadm(inbox at wasteland.rfc822.org): Warning: Transaction log file /var/vmail/wasteland.rfc822.org/inbox/mdbox/dovecot.list.index.log was locked for 108 seconds (Mailbox was synchronized) doveadm(inbox at wasteland.rfc822.org): Error: Can't delete mailbox bernd.bottich: Mailbox doesn't exist: bernd.bottich real 1m48.232s user 1m47.472s sys 0m0.580s So logging is broken (Yes i had a look at the code. It complains about its own lock been taken to long on unlocking) So i am in a deadend. 2.2.27 eats mdboxes for breakfast and cant repair them 2.3.4+ is unusuable with this mailbox. All these operations were in the sub-second range with 2.2.27 ... Guess i need to find something else to store mails ... Flo -- Florian Lohoff f at zz.de Any sufficiently advanced technology is indistinguishable from magic. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: <https://dovecot.org/pipermail/dovecot/attachments/20210508/a084f684/attachment.sig>
> On May 8, 2021, at 7:08 AM, Florian Lohoff <f at zz.de> wrote: > > On Fri, May 07, 2021 at 06:15:55PM +0200, Florian Lohoff wrote: >> >> Dovecot: 2.3.4 >> >> Hi, >> i am debugging a problem where after a force-resync removed all cache >> files (All files are 44 byte). Trying to open a folder takes literally minutes >> (Huge mailbox - 220K folder, 45GByte compressed storage)I don?t have an answer for you, unfortunately, as you all exercise your mail system far more thanI do. Taking a wild guess, I?d think that having a large number of folders has potential to be more of an issue than having a large amount of mail in fewer folders. Operating under that assumption, is it possible to bound the problem? (Hypothetically say 128k folders were good, more than that and something went out to lunch or something similar.) I am, however, watching this with interest. -- Coy Hile coy.hile at coyhile.com -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: Message signed with OpenPGP URL: <https://dovecot.org/pipermail/dovecot/attachments/20210508/1b474f7b/attachment.sig>
On 8. May 2021, at 13.08, Florian Lohoff <f at zz.de> wrote:> > On Fri, May 07, 2021 at 06:15:55PM +0200, Florian Lohoff wrote: >> >> Dovecot: 2.3.4 >> >> Hi, >> i am debugging a problem where after a force-resync removed all cache >> files (All files are 44 byte). Trying to open a folder takes literally minutes >> (Huge mailbox - 220K folder, 45GByte compressed storage) > > As i am running out of options i upgraded to 2.3.14-5+debian9 though no > success. Between Debian/Stretch 2.2.27 and 2.3.4 something broke > horribly for huge mdboxes. Opening the mailbox takes (on an otherwise > idle machine, 32GByte RAM, RAID1 spinning rust) > > <root at pax:~# time doveadm mailbox list -u inbox at wasteland.rfc822.org >/var/tmp/list > doveadm(inbox at wasteland.rfc822.org): Warning: Transaction log file /var/vmail/wasteland.rfc822.org/inbox/mdbox/dovecot.list.index.log was locked for 107 seconds (Mailbox was synchronized) > > real 1m47.508s > user 1m46.856s > sys 0m0.480s > > We see the ~220K folders in this mailbox: > > root at pax:~# wc -l /var/tmp/list > 213922 /var/tmp/list > > Looking at an strace on this command we see: > > root at pax:~# grep stat /var/tmp/strace.doveadm.list | wc -l > 428696 > > So every folder will at least get 2 stat calls - Okay - may be because of the > mailbox list command - lets try deleting a folder.Sounds like dovecot.list.index.> root at pax:~# strace -f -o /var/tmp/strace.doveadm.mailbox.delete doveadm mailbox delete -e -u inbox at wasteland.rfc822.org bernd.bottich > doveadm(inbox at wasteland.rfc822.org): Warning: Transaction log file /var/vmail/wasteland.rfc822.org/inbox/mdbox/dovecot.list.index.log was locked for 115 seconds (Mailbox was synchronized) > doveadm(inbox at wasteland.rfc822.org): Error: Can't delete mailbox bernd.bottich: Mailbox doesn't exist: bernd.bottich > > root at pax:~# grep stat /var/tmp/strace.doveadm.mailbox.delete | wc -l > 428702 > > Okay - so we opened the mailbox, stattet all folders to tell me my mailbox is not existent. Took again > around 90 seconds. And this happens every, single, time. > > root at pax:~# time doveadm mailbox delete -e -u inbox at wasteland.rfc822.org bernd.bottich > doveadm(inbox at wasteland.rfc822.org): Warning: Transaction log file /var/vmail/wasteland.rfc822.org/inbox/mdbox/dovecot.list.index.log was locked for 108 seconds (Mailbox was synchronized) > doveadm(inbox at wasteland.rfc822.org): Error: Can't delete mailbox bernd.bottich: Mailbox doesn't exist: bernd.bottich > > real 1m48.232s > user 1m47.472s > sys 0m0.580s > > So logging is broken (Yes i had a look at the code. It complains about its own lock been taken to long on unlocking)It's intentional. It warns that it kept the lock much longer than it was expected to. This helps figuring out why other processes could be seeing lock timeouts.> So i am in a deadend. > > 2.2.27 eats mdboxes for breakfast and cant repair them > 2.3.4+ is unusuable with this mailbox. > > All these operations were in the sub-second range with 2.2.27 ... Guess > i need to find something else to store mails ...v2.2 has mailbox_list_index=no by default, while v2.3 has changed it to "yes". So you could try changing it back to "no". Or maybe mailbox_list_index_very_dirty_syncs=yes would help also.> Reopening the same folder again is quick and seamless. So it seems this is some > cache recreation code looking into other folders.Do you mean that whenever any of the 220k folders is opened, it's first slow but then fast afterwards (even if other folders are opened)? So it's only when a new folder is opened it's being slow? That does sound like some kind of a bug. BTW. doveconf -n output could also be helpful. This behavior might also be due to some setting.