Patrik Peng
2022-Feb-09  11:21 UTC
Different handling of upper and lower case while indexing/searching with Solr
Hello there
We stumbled upon a user account with Solr FTS that returned no search 
results for any search query.
Further investigation revealed an issue between indexing mails and 
querying the index.
The user name contains upper and lower case characters (e.g. 
Some.User at domain.net).
When new mail is indexed for this user, the user name used for Solr's 
`user` and `id` fields is transformed to lowercase, as shown in the 
Solr log:
webapp=/solr path=/update params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user at domain.net (1724281617442144256), ... (162 adds)]} 0 44298
This can be confirmed by manually querying Solr. The Solr schema in use 
performs no transformation on the affected fields.
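A manual check of this kind could look roughly like the sketch below. It only builds the two select URLs to compare (original spelling vs. lowercase) and does not contact Solr. The host and port are assumptions, as is reading the archive's " at " as "@"; only the core name `dovecot_fts_popimap` and the `user` field are taken from this report. The helper name `user_query_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL; only the core name "dovecot_fts_popimap" appears in this report.
SOLR_SELECT = "http://localhost:8983/solr/dovecot_fts_popimap/select"


def user_query_url(user: str, base: str = SOLR_SELECT) -> str:
    """Build a select URL that counts documents whose `user` field equals `user`."""
    # Quote the value so characters such as "@" and "." are matched literally.
    escaped = user.replace("\\", "\\\\").replace('"', '\\"')
    # rows=0: only the hit count (numFound) is of interest, not the documents.
    params = {"q": f'user:"{escaped}"', "rows": 0, "wt": "json"}
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical spellings of the same account: as configured vs. as indexed.
    for name in ("Some.User@domain.net", "some.user@domain.net"):
        print(user_query_url(name))
```

Fetching both URLs (for example with curl) and comparing `numFound` in the responses would show which spelling the index actually contains.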
When a search request is performed via IMAP, Dovecot queries Solr with 
the original user name:
GET /solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User at domain.net HTTP/1.1"
This (correctly) returns zero results, because the index only contains 
the lowercase user name.
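The mismatch can be reproduced with a toy model. This is only a sketch of the observed behaviour, not of Dovecot's or Solr's actual code: a plain dictionary stands in for an exact-match field without case folding, and the function names, the document id and the "@" spelling of the address are all hypothetical.

```python
# Toy stand-in for an exact-match field: keys are compared byte for byte.
index: dict[str, list[str]] = {}


def index_mail(user: str, doc_id: str) -> None:
    # Mirrors the observed indexing side: the user name is lowercased first.
    index.setdefault(user.lower(), []).append(doc_id)


def search(user: str) -> list[str]:
    # Mirrors the observed query side: the user name is passed through unchanged.
    return index.get(user, [])


# Hypothetical account and document id, for illustration only.
index_mail("Some.User@domain.net", "doc-1")

if __name__ == "__main__":
    print(search("Some.User@domain.net"))  # original spelling, as used in the query
    print(search("some.user@domain.net"))  # lowercase spelling, as stored
```

In this model the lookup only succeeds when both sides apply the same normalization, which matches what the logs above suggest.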
To summarize, I suspect Dovecot transforms the user name to lower case 
while indexing mails, but not when querying for results.
Is this a bug, or caused by misconfiguration?
Regards
Patrik
Patrik Peng
2022-Feb-09  11:31 UTC
Different handling of upper and lower case while indexing/searching with Solr
Whoops, this time with better formatting.

On 09.02.22 12:21, Patrik Peng wrote:
> [...]