Patrik Peng
2022-Feb-09 11:31 UTC
Different handling of upper and lower case while indexing/searching with Solr
Woops, this time with better formatting.

On 09.02.22 12:21, Patrik Peng wrote:
>
> Hello there
>
> We stumbled upon a user account with Solr FTS that returned no
> search results for any search query.
> Further investigation revealed a mismatch between indexing mails and
> querying the index.
> The user name contains upper- and lower-case characters (e.g.
> Some.User at domain.net).
>
> When new mail is indexed for this user, the user name used for Solr's
> `user` and `id` fields is transformed into lower case, as shown in the
> Solr log:
>
> webapp=/solr path=/update params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user at domain.net (1724281617442144256), ... (162 adds)]} 0 44298
>
> This can be confirmed by manually querying Solr. The Solr schema in use
> performs no transformation for the affected fields.
> When a search request is performed via IMAP, Dovecot queries Solr with
> the original user name:
>
> GET /solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User at domain.net HTTP/1.1"
>
> Which (correctly) returns zero results.
>
> To summarize, I suspect Dovecot transforms any user name to lower case
> while indexing mails, but not when querying for results.
>
> Is this a bug, or caused by misconfiguration?
>
> Regards
> Patrik
Christian Kivalo
2022-Feb-09 16:47 UTC
Re: Different handling of upper and lower case while indexing/searching with Solr
On February 9, 2022 12:31:23 PM GMT+01:00, Patrik Peng <patrik.peng at hostpoint.ch> wrote:
>Woops, this time with better formatting.
>
>On 09.02.22 12:21, Patrik Peng wrote:
>>
>> Hello there
>>
>> We stumbled upon a user account with Solr FTS that returned no
>> search results for any search query.
>> Further investigation revealed a mismatch between indexing mails and
>> querying the index.
>> The user name contains upper- and lower-case characters (e.g.
>> Some.User at domain.net).
>>
>> When new mail is indexed for this user, the user name used for Solr's
>> `user` and `id` fields is transformed into lower case, as shown in the
>> Solr log:
>>
>> webapp=/solr path=/update params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user at domain.net (1724281617442144256), ... (162 adds)]} 0 44298
>>
>> This can be confirmed by manually querying Solr. The Solr schema in use
>> performs no transformation for the affected fields.
>> When a search request is performed via IMAP, Dovecot queries Solr with
>> the original user name:
>>
>> GET /solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User at domain.net HTTP/1.1"
>>
>> Which (correctly) returns zero results.
>>
>> To summarize, I suspect Dovecot transforms any user name to lower case
>> while indexing mails, but not when querying for results.
>>
>> Is this a bug, or caused by my configuration?

How are your users added to your auth backend? Please post your doveconf -n output.

>> Regards
>> Patrik

--
Christian Kivalo
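
The mismatch described in this thread can be illustrated with a small, self-contained sketch. It is not Dovecot or Solr code: `TinyIndex` and `normalize_user` are hypothetical names, the index is a plain in-memory list that does exact string matching (standing in for a Solr field with no analysis, as the original post describes), and the address is the example from the thread with the archive's " at " written as "@". It only shows why a lower-cased value at index time and an original-case value at query time cannot match, and that applying the same normalization on both paths would make them agree.

```python
def normalize_user(name: str) -> str:
    """Hypothetical helper: lower-case a user name (assumed normalization)."""
    return name.lower()


class TinyIndex:
    """Toy stand-in for an index whose `user` field is matched exactly."""

    def __init__(self) -> None:
        self._docs = []

    def add(self, user: str, box: str, uid: int) -> None:
        # Store the values exactly as given; no transformation is applied,
        # mirroring a schema that does not analyze these fields.
        self._docs.append({"user": user, "box": box, "uid": uid})

    def search(self, user: str, box: str) -> list:
        # Exact, case-sensitive comparison on both fields.
        return [d["uid"] for d in self._docs
                if d["user"] == user and d["box"] == box]


login_name = "Some.User@domain.net"          # example name from the thread
box = "1a30ec359dce3451b8e600008ba2b673"     # mailbox id taken from the query URL

index = TinyIndex()
# Indexing path: the user name arrives lower-cased (as seen in the Solr log).
index.add(normalize_user(login_name), box, uid=8543)

# Query path as observed: original-case user name, so nothing matches.
hits_raw = index.search(login_name, box)

# Query path with the same normalization applied: the document is found.
hits_norm = index.search(normalize_user(login_name), box)

print(hits_raw, hits_norm)
```

Whether the right fix is to normalize on the query side, to stop lower-casing on the index side, or to change how the user name reaches Dovecot depends on the configuration, which is why the `doveconf -n` output requested above matters.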