Joan Moreau
2019-Feb-17 08:56 UTC
[grosjo/fts-xapian] `doveadm fts rescan` removes all indices (#15)
In that case, as long as the API is not upgraded, should

doveadm index -A -q \*

be considered a replacement for

doveadm fts rescan

?

On 2019-02-14 16:24, Timo Sirainen via dovecot wrote:
> Hi,
>
> The rescan() function is a bit badly designed. Currently, what you could do is what fts-lucene does:
> - Get the list of UIDs for all mails in each folder
> - If Xapian has a UID that doesn't exist -> delete it from Xapian
> - If a UID is missing from Xapian -> expunge the rest of the UIDs in that folder, so the next indexing will cause them to be indexed
>
> The expunging of the rest of the mails is rather ugly, yes. A better API would be if the backend simply had a way to iterate all mails in the index, preferably sorted by folder. Then more generic code could go through them, expunge the necessary mails and index the missing mails. Although not all FTS backends support indexing in the middle. Anyway, we don't really have time to implement this new API soon.
>
> I'm not sure if this is a big problem though. I don't think most people running FTS have ever run rescan.
>
> On 8 Feb 2019, at 9.54, Joan Moreau via dovecot <dovecot at dovecot.org> wrote:
>
> Hi,
>
> This is a core problem in Dovecot, in my understanding.
>
> In my opinion, the rescan in Dovecot should send the FTS plugin the list of "supposedly" indexed emails (UIDs). The plugin should then purge the redundant UIDs (i.e. UIDs present in the index but not in the list sent by Dovecot) and send back to Dovecot the list of UIDs that are not in its indexes, so that Dovecot can send the missing emails one by one.
>
> What do you think?
>
> -------- Original Message --------
>
> SUBJECT: [grosjo/fts-xapian] `doveadm fts rescan` removes all indices (#15)
> DATE: 2019-02-08 08:28
> FROM: Leonard Lausen <notifications at github.com>
> TO: grosjo/fts-xapian <fts-xapian at noreply.github.com>
> CC: Subscribed <subscribed at noreply.github.com>
>
> doveadm fts rescan -A deletes all indices, i.e. all folders and files in the xapian-indexes are deleted. However, according to man doveadm fts, the rescan command should only
>
> Scan what mails exist in the full text search index and compare those to what
> actually exist in mailboxes. This removes mails from the index that have already
> been expunged and makes sure that the next doveadm index will index all the
> missing mails (if any).
>
> Deleting all indices does not seem to be the intended action, especially as constructing the index anew may take very long on large mailboxes.
>
> --
> Reply to this email directly or view it on GitHub [1].

Links:
------
[1] https://github.com/grosjo/fts-xapian/issues/15
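
The three steps Timo lists for an fts-lucene-style rescan can be sketched for a single folder as below. This is only an illustration in Python using plain lists of UIDs; the function name `rescan_folder` and its arguments are hypothetical and are not part of Dovecot, fts-lucene or fts-xapian.

```python
def rescan_folder(mailbox_uids, indexed_uids):
    """Hypothetical per-folder rescan following the three quoted steps.

    mailbox_uids: UIDs of the mails that really exist in the folder.
    indexed_uids: UIDs the FTS index currently holds for that folder.
    Returns (kept, dropped): UIDs left in the index and UIDs removed from it.
    """
    mailbox = sorted(set(mailbox_uids))
    indexed = set(indexed_uids)

    # Step 2: entries whose mail no longer exists are stale and get deleted.
    stale = indexed - set(mailbox)

    # Step 3: if some existing mail is not indexed, expunge every indexed UID
    # from that point on, so the next indexing run re-adds all of them.
    first_missing = next((uid for uid in mailbox if uid not in indexed), None)
    if first_missing is None:
        tail = set()
    else:
        tail = {uid for uid in indexed if uid >= first_missing} - stale

    kept = indexed - stale - tail
    return sorted(kept), sorted(stale | tail)


if __name__ == "__main__":
    # UID 3 was never indexed and UID 9 has been expunged from the mailbox.
    print(rescan_folder([1, 2, 3, 4, 5], [1, 2, 4, 9]))  # ([1, 2], [4, 9])
```

In this model UID 4 is dropped although it was indexed correctly, which is the "rather ugly" part of step 3: everything from the first gap onwards is thrown away and indexed again.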
Aki Tuomi
2019-Feb-17 13:52 UTC
[grosjo/fts-xapian] `doveadm fts rescan` removes all indices (#15)
Not really, as the steps outlined by Timo would not get done.

Aki

> On 17 February 2019 at 10:56 Joan Moreau via dovecot <dovecot at dovecot.org> wrote:
>
> In such case, as long as the API is not upgraded, should
>
> doveadm index -A -q \*
>
> be considered a replacement of
>
> doveadm fts rescan
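
A simplified model may help show why indexing alone does not perform those steps. The sketch below assumes that an indexing run only adds mails whose UID is higher than the highest UID already in the index, which is what Timo's third step implies; that assumption, and the function name `index_new_mails`, are illustrative and not taken from Dovecot's code.

```python
def index_new_mails(mailbox_uids, indexed_uids):
    """Hypothetical model of an incremental indexing run.

    Assumption: only mails above the highest indexed UID are added;
    nothing is removed and earlier gaps are not revisited.
    """
    last_uid = max(indexed_uids, default=0)
    added = [uid for uid in sorted(mailbox_uids) if uid > last_uid]
    return sorted(set(indexed_uids) | set(added))


if __name__ == "__main__":
    mailbox = [1, 2, 3, 4, 5]
    index = [1, 2, 4, 9]  # UID 3 is missing, UID 9 no longer exists
    # The stale UID 9 stays and the missing UID 3 is never picked up.
    print(index_new_mails(mailbox, index))  # [1, 2, 4, 9]
```

Under this assumption the index is unchanged after the run, so neither the stale entry nor the gap is repaired without a rescan first.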
Joan Moreau
2019-Feb-17 14:58 UTC
[grosjo/fts-xapian] `doveadm fts rescan` removes all indices (#15)
Can you point to the relevant piece of code, or give an example of how to "Get list of UIDs for all mails in each folder" and how to get the list of all folders/mailboxes from a *backend input?

On 2019-02-17 14:52, Aki Tuomi wrote:
> Not really, as the steps outlined by Timo would not get done.
>
> Aki