It's because you're misunderstanding how the lookup() function works. It gets ALL the search parameters, including the "mailbox inbox". This is intentional, and not a bug. Two reasons being: 1) The FTS plugin in theory could support indexing/searching any kinds of searches, not just regular word searches. So I didn't want to limit it unnecessarily. 2) Especially with "mailbox inbox" this is important when searching from virtual mailboxes. If you configure "All mails in all folders" virtual mailbox, you can do a search in there that restricts which physical mailboxes are matched. In this case the FTS backend can optimize this lookup so it can filter only the physical mailboxes that have matches, leaving the others out. And it can do this in a single query if all the mailboxes are in the same FTS index. So again: Your lookup() function needs to be changed to only use those search args that it really wants to search, and ignore the others. Use solr_add_definite_query_args() as the template. Also I see now the reason for the timeout problem. It's because you're not setting search_arg->match_always=TRUE. These need to be set for the search args that you're actually using to generate the Xapian query. If it's not set, then Dovecot core doesn't think that the arg was part of the FTS search and it processes it itself. Meaning that it opens all the emails and does the search the slow way, practically making the FTS lookup ignored.> On 21 Apr 2019, at 19.50, Joan Moreau <jom at grosjo.net> wrote: > > No, the parsing is made by dovecot core, that is nothing the backend can do about it. The backend shall *never* reveive this. (would it be buggy or no) > > > > PLease, have a look deeper > > And the loop is a very big problem as it times out all the time (and once again, this is not in any of the backend functions) > > > > > On 2019-04-21 10:42, Timo Sirainen via dovecot wrote: > >> Inbox appears in the list of arguments, because fts_backend_xapian_lookup() is parsing the search args wrong. Not sure about the other issue. >> >>> On 21 Apr 2019, at 19.31, Joan Moreau <jom at grosjo.net <mailto:jom at grosjo.net>> wrote: >>> >>> For this first point, the problem is that dovecot core sends TWICE the request and "Inbox" appears in the list of arguments ! (inbox shall serve to select teh right mailbox, never sent to the backend) >>> >>> And even if this would be solved, the dovecot core loops *after* the backend hs returneds the results >>> >>> >>> >>> # doveadm search -u jom at grosjo.net <mailto:jom at grosjo.net> mailbox inbox text milan >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Get last UID of INBOX = 315526 >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Get last UID of INBOX = 315526 >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query: FLAG=AND >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query(1): add term(wilcard) : inbox >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query(2): add term(wilcard) : milan >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Testing if wildcard >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query: set GLOBAL (no specified header) >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan ) >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query: 2 results in 1 ms // THIS IS WHEN BACKEND HAS FOUND RESULTS AND STOPPED >>> d82b4b0f550d38593644000095331209 847 >>> d82b4b0f550d38593644000095331209 1569 >>> d82b4b0f550d38593644000095331209 2260 >>> d82b4b0f550d38593644000095331209 2575 >>> d82b4b0f550d38593644000095331209 2811 >>> d82b4b0f550d38593644000095331209 2885 >>> d82b4b0f550d38593644000095331209 3038 >>> d82b4b0f550d38593644000095331209 3121 -> LOOPING FOREVER >>> >>> >>> >>> >>> >>> >>> On 2019-04-21 09:57, Timo Sirainen via dovecot wrote: >>> >>> On 3 Apr 2019, at 20.30, Joan Moreau via dovecot <dovecot at dovecot.org <mailto:dovecot at dovecot.org>> wrote: >>> doveadm search -u jom at grosjo.net <mailto:jom at grosjo.net> mailbox inbox text milan >>> output >>> >>> doveadm(jom at grosjo.net <mailto:jom at grosjo.net>): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan OR uid:milan ) >>> >>> 1 - The query is wrong >>> >>> That's because fts_backend_xapian_lookup() isn't anywhere close to being correct. Try to copy the logic based on solr_add_definite_query_args(). >>> >>>-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://dovecot.org/pipermail/dovecot/attachments/20190421/05cec9cb/attachment.html>
Timo, A little of logic here : 1 - the mailbox is passed by dovecot to the backend as a mailbox * pointer , NOT as a search parameter. -> It works properly when entering a search from roundcube or evolution for instance. -> therefore this is a clear bug of the command line 2 - the loop : Actually, the timeout occurs because the dovecot core is DISCARDING the results of the backend and do its own search (ie. in my example , it search fo "milan" in my inbox , which is huge , without even considering the backend results -> This is a enormous error. On 2019-04-21 11:29, Timo Sirainen wrote:> It's because you're misunderstanding how the lookup() function works. It gets ALL the search parameters, including the "mailbox inbox". This is intentional, and not a bug. Two reasons being: > > 1) The FTS plugin in theory could support indexing/searching any kinds of searches, not just regular word searches. So I didn't want to limit it unnecessarily. > > 2) Especially with "mailbox inbox" this is important when searching from virtual mailboxes. If you configure "All mails in all folders" virtual mailbox, you can do a search in there that restricts which physical mailboxes are matched. In this case the FTS backend can optimize this lookup so it can filter only the physical mailboxes that have matches, leaving the others out. And it can do this in a single query if all the mailboxes are in the same FTS index. > > So again: Your lookup() function needs to be changed to only use those search args that it really wants to search, and ignore the others. Use solr_add_definite_query_args() as the template. > > Also I see now the reason for the timeout problem. It's because you're not setting search_arg->match_always=TRUE. These need to be set for the search args that you're actually using to generate the Xapian query. If it's not set, then Dovecot core doesn't think that the arg was part of the FTS search and it processes it itself. Meaning that it opens all the emails and does the search the slow way, practically making the FTS lookup ignored. > > On 21 Apr 2019, at 19.50, Joan Moreau <jom at grosjo.net> wrote: > > No, the parsing is made by dovecot core, that is nothing the backend can do about it. The backend shall *never* reveive this. (would it be buggy or no) > > PLease, have a look deeper > > And the loop is a very big problem as it times out all the time (and once again, this is not in any of the backend functions) > > On 2019-04-21 10:42, Timo Sirainen via dovecot wrote: > Inbox appears in the list of arguments, because fts_backend_xapian_lookup() is parsing the search args wrong. Not sure about the other issue. > > On 21 Apr 2019, at 19.31, Joan Moreau <jom at grosjo.net> wrote: > > For this first point, the problem is that dovecot core sends TWICE the request and "Inbox" appears in the list of arguments ! (inbox shall serve to select teh right mailbox, never sent to the backend) > > And even if this would be solved, the dovecot core loops *after* the backend hs returneds the results > > # doveadm search -u jom at grosjo.net mailbox inbox text milan > doveadm(jom at grosjo.net): Info: Get last UID of INBOX = 315526 > doveadm(jom at grosjo.net): Info: Get last UID of INBOX = 315526 > doveadm(jom at grosjo.net): Info: Query: FLAG=AND > doveadm(jom at grosjo.net): Info: Query(1): add term(wilcard) : inbox > doveadm(jom at grosjo.net): Info: Query(2): add term(wilcard) : milan > doveadm(jom at grosjo.net): Info: Testing if wildcard > doveadm(jom at grosjo.net): Info: Query: set GLOBAL (no specified header) > doveadm(jom at grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan ) > DOVEADM(JOM at GROSJO.NET): INFO: QUERY: 2 RESULTS IN 1 MS // THIS IS WHEN BACKEND HAS FOUND RESULTS AND STOPPED > d82b4b0f550d38593644000095331209 847 > d82b4b0f550d38593644000095331209 1569 > d82b4b0f550d38593644000095331209 2260 > d82b4b0f550d38593644000095331209 2575 > d82b4b0f550d38593644000095331209 2811 > d82b4b0f550d38593644000095331209 2885 > d82b4b0f550d38593644000095331209 3038 > D82B4B0F550D38593644000095331209 3121 -> LOOPING FOREVER > > On 2019-04-21 09:57, Timo Sirainen via dovecot wrote: > On 3 Apr 2019, at 20.30, Joan Moreau via dovecot <dovecot at dovecot.org> wrote: doveadm search -u jom at grosjo.net mailbox inbox text milan > output > > doveadm(jom at grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan OR uid:milan ) > > 1 - The query is wrong > That's because fts_backend_xapian_lookup() isn't anywhere close to being correct. Try to copy the logic based on solr_add_definite_query_args().-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://dovecot.org/pipermail/dovecot/attachments/20190421/50b45b44/attachment-0001.html>
for instance, if I do a search from roundcube, the inbo name is NOT passed to the backend (which is normal) the same search from the command line add the mailbox name ADDITIONALLY to the mailbox * pointer However, passing a search from roudcube ask TWICE the backend (first with AND flag, second with OR flag) THis is obviously a clear bug form the part calling the backend (even if the backend may need improvements ! this is really not the point here) Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Get last UID of Sent 61714 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Get last UID of Sent 61714 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query: FLAG=AND Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(1/1): add term(wilcard) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(2/1): add term(wilcard) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(3/1): add term(wilcard) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(4/1): add term(wilcard) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(5/1): add term(wilcard) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: SEARCH_OR Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: MATCH NOT : 0 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Testing if wildcard Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query: set GLOBAL (no specified header) Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query : ( bcc:milao OR body:milao OR cc:milao OR from:milao OR message-id:milao OR subject:milao OR to:milao ) Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query: 0 results in 0 ms Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query: FLAG=OR Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(1): add term(SUBJECT) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: SEARCH_HEADER Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: MATCH NOT : 0 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(2): add term(TO) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: SEARCH_HEADER Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: MATCH NOT : 0 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(3): add term(FROM) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: SEARCH_HEADER Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: MATCH NOT : 0 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(4): add term(CC) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: SEARCH_HEADER Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: MATCH NOT : 0 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query(5): add term(BCC) : milao Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: SEARCH_HEADER Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: MATCH NOT : 0 Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Testing if wildcard Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query : ( bcc:milao ) OR ( cc:milao ) OR ( from:milao ) OR ( subject:milao ) OR ( to:milao ) Apr 21 11:08:39 gjserver dovecot[14251]: imap(jom at grosjo.net)<15709><Adi/XwiH+ON/AAAB>: Query: 0 results in 0 ms On 2019-04-21 11:56, Joan Moreau via dovecot wrote:> Timo, > > A little of logic here : > > 1 - the mailbox is passed by dovecot to the backend as a mailbox * pointer , NOT as a search parameter. > > -> It works properly when entering a search from roundcube or evolution for instance. > > -> therefore this is a clear bug of the command line > > 2 - the loop : Actually, the timeout occurs because the dovecot core is DISCARDING the results of the backend and do its own search (ie. in my example , it search fo "milan" in my inbox , which is huge , without even considering the backend results > > -> This is a enormous error. > > On 2019-04-21 11:29, Timo Sirainen wrote: It's because you're misunderstanding how the lookup() function works. It gets ALL the search parameters, including the "mailbox inbox". This is intentional, and not a bug. Two reasons being: > > 1) The FTS plugin in theory could support indexing/searching any kinds of searches, not just regular word searches. So I didn't want to limit it unnecessarily. > > 2) Especially with "mailbox inbox" this is important when searching from virtual mailboxes. If you configure "All mails in all folders" virtual mailbox, you can do a search in there that restricts which physical mailboxes are matched. In this case the FTS backend can optimize this lookup so it can filter only the physical mailboxes that have matches, leaving the others out. And it can do this in a single query if all the mailboxes are in the same FTS index. > > So again: Your lookup() function needs to be changed to only use those search args that it really wants to search, and ignore the others. Use solr_add_definite_query_args() as the template. > > Also I see now the reason for the timeout problem. It's because you're not setting search_arg->match_always=TRUE. These need to be set for the search args that you're actually using to generate the Xapian query. If it's not set, then Dovecot core doesn't think that the arg was part of the FTS search and it processes it itself. Meaning that it opens all the emails and does the search the slow way, practically making the FTS lookup ignored. > > On 21 Apr 2019, at 19.50, Joan Moreau <jom at grosjo.net> wrote: > > No, the parsing is made by dovecot core, that is nothing the backend can do about it. The backend shall *never* reveive this. (would it be buggy or no) > > PLease, have a look deeper > > And the loop is a very big problem as it times out all the time (and once again, this is not in any of the backend functions) > > On 2019-04-21 10:42, Timo Sirainen via dovecot wrote: > Inbox appears in the list of arguments, because fts_backend_xapian_lookup() is parsing the search args wrong. Not sure about the other issue. > > On 21 Apr 2019, at 19.31, Joan Moreau <jom at grosjo.net> wrote: > > For this first point, the problem is that dovecot core sends TWICE the request and "Inbox" appears in the list of arguments ! (inbox shall serve to select teh right mailbox, never sent to the backend) > > And even if this would be solved, the dovecot core loops *after* the backend hs returneds the results > > # doveadm search -u jom at grosjo.net mailbox inbox text milan > doveadm(jom at grosjo.net): Info: Get last UID of INBOX = 315526 > doveadm(jom at grosjo.net): Info: Get last UID of INBOX = 315526 > doveadm(jom at grosjo.net): Info: Query: FLAG=AND > doveadm(jom at grosjo.net): Info: Query(1): add term(wilcard) : inbox > doveadm(jom at grosjo.net): Info: Query(2): add term(wilcard) : milan > doveadm(jom at grosjo.net): Info: Testing if wildcard > doveadm(jom at grosjo.net): Info: Query: set GLOBAL (no specified header) > doveadm(jom at grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan ) > DOVEADM(JOM at GROSJO.NET): INFO: QUERY: 2 RESULTS IN 1 MS // THIS IS WHEN BACKEND HAS FOUND RESULTS AND STOPPED > d82b4b0f550d38593644000095331209 847 > d82b4b0f550d38593644000095331209 1569 > d82b4b0f550d38593644000095331209 2260 > d82b4b0f550d38593644000095331209 2575 > d82b4b0f550d38593644000095331209 2811 > d82b4b0f550d38593644000095331209 2885 > d82b4b0f550d38593644000095331209 3038 > D82B4B0F550D38593644000095331209 3121 -> LOOPING FOREVER > > On 2019-04-21 09:57, Timo Sirainen via dovecot wrote: > On 3 Apr 2019, at 20.30, Joan Moreau via dovecot <dovecot at dovecot.org> wrote: doveadm search -u jom at grosjo.net mailbox inbox text milan > output > > doveadm(jom at grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan OR uid:milan ) > > 1 - The query is wrong > That's because fts_backend_xapian_lookup() isn't anywhere close to being correct. Try to copy the logic based on solr_add_definite_query_args().-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://dovecot.org/pipermail/dovecot/attachments/20190421/62a44264/attachment-0001.html>