Gregor Zattler
2021-Mar-17 19:47 UTC
bug: chokes on long directory names (was: Re: out of memory on idle machine)
Hi David, Olly, notmuch and xapian developers,

* David Bremner <david at tethera.net> [11. Feb. 2021]:
> David Bremner <david at tethera.net> writes:
> As a kind of desperation move, you could try bisecting your mailstore,
> to see how small of a set of messages you can duplicate the problem
> with.

This I did, somehow. I found the culprit: it's a maildir with one single mail in it. The name of the maildir is exceptionally long [because it is generated from a List-Id: header] and the mail arrived on the very day my notmuch database became corrupted. This maildir alone causes every subsequent notmuch new to rescan all (?) files.

Then I tried to index only this maildir. It showed the same strange re-indexing, but even when running notmuch new in a loop for a while (>1000 times), the database showed no corruption.

When I instead shorten the name of the maildir to three characters, with the very same email file in it, nothing unusual happens: it indexes the file once and not again.

Then I lengthened the name of the file instead of the directory, and even with the longest possible filename (or path?)

/home/grfz/Mail/nuk/new/1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no1607641473.31514_2.no16076414734160.14_2.no

notmuch has no problem indexing it and does not reindex it in the next run.

So notmuch or xapian (I don't know which) chokes on extremely long directory names. I consider this to be a bug. My scripts create these long names from List-Id and some such.
The one which triggered the problems is from an online shop:

u+mq6tamjqhe3cm2j5giydembrgiytamrtga2deojogexdsmzygm4egnbuifatcnrsgazdejjugbzgkylmfvxw43djnzsxg2dpoaxgizjgna6ton3bg4zdsobsgmytczlcme3dentehaydmnjxmy4doyrwha4tgobgoi6xizlmmvtxeylqnastimdhnv4c43tfoqthipldovzxi33nmvzhgllxmvwgg33nmu at real-onlineshop.de/

Since, as I tested, this can be reproduced with the simplest of emails in a maildir with an extremely long name, I do not attach the maildir in question. But if anyone wants it, I can send it.

I then had a look at other long directory names and there is another one which also triggers the problem. It also has only one email in it, which arrived on the 12th of January:

u+mq6wcodfgmygcjtjhuzdamrrgaytemjrhe2dqmbqfyys4mbxgazugnbsie3doobsgfcdmobfgqygg5ltorxw2zlsomxgo2lunrqweltdn5wsm2b5mu3tkmddhbrdoyrwgvsgeobymi2dszbtg4zdamztmm4dsmzvgjssm4r5orswyzlhojqxa2bfgqygo3lyfzxgk5bgoq6xa4tjozqwg6i at customers.gitlab.com

Since I removed both on my laptop, notmuch new works again, yeah! Now I will have a look at my .procmailrc.

Thanks for your attention, thanks for notmuch and for xapian,
Gregor
--
-... --- .-. . -.. ..--.. ...-.-
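For readers who want to check their own mail store for similar directories, the search Gregor describes could be sketched roughly as follows. This is only an illustration, not something from the thread: the function name `find_long_directory_names` and the default cut-off of 200 bytes are hypothetical, and it measures each directory's own name rather than its full path.

```python
import os


def find_long_directory_names(mail_root, min_length=200):
    """Return (length, path) pairs for directories with long names.

    min_length is a hypothetical cut-off in bytes; adjust it as needed.
    Only the directory's own name is measured, not the whole path.
    """
    hits = []
    for dirpath, dirnames, _filenames in os.walk(mail_root):
        for name in dirnames:
            # Measure the encoded name, since non-ASCII characters
            # occupy more than one byte on disk.
            length = len(os.fsencode(name))
            if length >= min_length:
                hits.append((length, os.path.join(dirpath, name)))
    # Longest names first.
    return sorted(hits, reverse=True)
```

Calling `find_long_directory_names(os.path.expanduser("~/Mail"))` would list candidate maildirs, longest name first; a mail root that does not exist simply yields an empty list.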
David Bremner
2021-Mar-18 01:39 UTC
bug: chokes on long directory names (was: Re: out of memory on idle machine)
Gregor Zattler <telegraph at gmx.net> writes:

> Hi David, Olly, notmuch and xapian developers,
> * David Bremner <david at tethera.net> [11. Feb. 2021]:
>> David Bremner <david at tethera.net> writes:
>> As a kind of desperation move, you could try bisecting your mailstore,
>> to see how small of a set of messages you can duplicate the problem
>> with.
>
> this I did, somehow. I found the culprit: It's a maildir
> with one single mail in it. The name of the maildir is
> exceptionally long [because generated from a List-Id:
> -Header] and the mail arrived at the very day, my notmuch
> database corrupted. This maildir alone provokes that every
> next notmuch new will rescan all (?) files.

Hi Gregor;

I am very impressed with your persistence. I suspect it is a bug in notmuch. I don't know all the details yet, but in the normal case the directory name is added to the database prefixed with XDIRECTORY. I noticed this isn't happening for directory names of 234 characters or longer. With the prefix included, that is roughly the Xapian term limit of 245 characters in total. I'm not sure why there is a discrepancy of one character, but the main point is that notmuch is probably improperly ignoring an error from Xapian when adding these overlong terms.

Thanks again for the debugging, I suspect I would never have found this bug on my own.

David
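The arithmetic behind David's observation can be sketched as below. This is a simplified illustration and not notmuch's actual code: it assumes the term is just the XDIRECTORY prefix followed by the directory name, and it takes the 245 limit as quoted in the message. The helper names are hypothetical.

```python
# Values taken from the message above.
XAPIAN_TERM_LIMIT = 245          # total term length quoted by David
DIRECTORY_PREFIX = "XDIRECTORY"  # prefix notmuch adds, per David


def directory_term_length(directory, prefix=DIRECTORY_PREFIX):
    """Length in bytes of the term under the simplifying assumption
    that it is the prefix immediately followed by the directory name."""
    return len((prefix + directory).encode("utf-8"))


def exceeds_term_limit(directory, limit=XAPIAN_TERM_LIMIT):
    """True if the assumed term would be longer than the quoted limit."""
    return directory_term_length(directory) > limit


# Compare a short name with names around the length David mentions.
for n in (3, 233, 234, 235, 236):
    name = "d" * n
    print(n, directory_term_length(name), exceeds_term_limit(name))
```

The prefix is 10 characters long, so a 234-character name gives a 244-character term under this assumption. The cut-off David observed is therefore slightly lower than this naive calculation predicts, which matches the unexplained discrepancy he mentions; how notmuch actually builds the term may account for the difference.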