Christian Kivalo
2022-Feb-09 16:47 UTC
Re: Different handling of upper and lower case while indexing/searching with Solr
On February 9, 2022 12:31:23 PM GMT+01:00, Patrik Peng <patrik.peng at hostpoint.ch> wrote:
>Whoops, this time with better formatting.
>
>On 09.02.22 12:21, Patrik Peng wrote:
>>
>> Hello there
>>
>> We stumbled upon a user account with Solr FTS, which returned no
>> search results for any given search query.
>> Further investigation revealed an issue between indexing mails and
>> querying the index.
>> The user name contains upper and lower case characters (eg.
>> Some.User at domain.net).
>>
>> When new mail is indexed for this user, the user name used for Solr's
>> `user` and `id` fields is transformed into lowercase, as shown in the
>> Solr log:
>>
>> webapp=/solr path=/update
>> params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user at domain.net
>> (1724281617442144256), ... (162 adds)]} 0 44298
>>
>> This can be confirmed by manually querying Solr. The Solr schema in use
>> performs no transformation for the affected fields.
>> When a search request is performed via IMAP, Dovecot queries Solr with
>> the original user name:
>>
>> GET
>> /solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User at domain.net
>> HTTP/1.1"
>>
>> Which (correctly) returns zero results.
>>
>> To summarize, I suspect Dovecot transforms any user name to lower case
>> while indexing mails, but not when querying for results.
>>
>> Is this a bug, or caused by my configuration?

How are your users added to your auth backend?

Please post your doveconf -n output

>> Regards
>> Patrik

-- 
Christian Kivalo
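The mismatch Patrik describes can be reproduced with a toy model. The following is only a sketch, not Dovecot or Solr code: the dict stands in for a Solr string field with no analyzer, the helper names `index_mail` and `search` are hypothetical, and the address is written with `@` instead of the archive's " at " obfuscation.

```python
# Toy model of the behaviour reported in the thread (assumption: the Solr
# `user` field is an exact-match string field, as the poster states).

def index_mail(index, user, uid):
    """Store a mail under the lowercased user name, as seen in the Solr log."""
    index.setdefault(user.lower(), []).append(uid)

def search(index, user):
    """Exact-match lookup: no case folding is applied to the query value."""
    return index.get(user, [])

index = {}
index_mail(index, "Some.User@domain.net", 8543)  # uid is illustrative only

# Query with the original, mixed-case user name: nothing matches.
raw_hits = search(index, "Some.User@domain.net")

# Query with the same normalization used at indexing time: the mail is found.
normalized_hits = search(index, "Some.User@domain.net".lower())

print(raw_hits)         # []
print(normalized_hits)  # [8543]
```

The point of the sketch is that an exact-match field only returns results when the indexing side and the query side apply the same normalization to the user name.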
Patrik Peng
2022-Feb-10 09:36 UTC
Different handling of upper and lower case while indexing/searching with Solr
On 09.02.22 17:47, Christian Kivalo wrote:
> How are your users added to your auth backend?

We use a SQL DB as auth backend. Users are added by an external application. New accounts are all added as lowercase, but it could be possible that there was a time in the past where accounts were added without conversion. At least the DB contains a few accounts with uppercase letters in the localpart.

> Please post your doveconf -n output

Here you go (I stripped a few irrelevant sections):

# 2.3.15 (0503334ab1): /usr/local/etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.15 (e6a84e31)
# OS: FreeBSD 12.2-RELEASE-p11 amd64
# Hostname: XXX
auth_cache_negative_ttl = 5 mins
auth_cache_size = 20 M
auth_cache_ttl = 5 mins
auth_username_chars = abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890.+-_@
auth_verbose = yes
auth_worker_max_count = 90
config_cache_size = 50 M
disable_plaintext_auth = no
passdb {
  args = /usr/local/etc/dovecot/sql.conf
  driver = sql
  name = sql
}
userdb {
  args = /usr/local/etc/dovecot/sql.conf
  driver = sql
  name = sql
}
plugin {
  fts_autoindex = no
  fts_autoindex_exclude = \Junk
  fts_autoindex_exclude2 = \spam
  fts_autoindex_exclude3 = INBOX.spam
  fts_enforced = no
  fts_index_timeout = 120s
  fts_solr = url=https://XXX soft_commit=no batch_size=1000
  mail_log_events = copy save delete undelete expunge mailbox_create mailbox_delete mailbox_rename
  mail_log_fields = uid box msgid from size flags
  quota = maildir:User quota
  quota_grace = 10%%
  quota_rule = *:storage=1G
  quota_warning = storage=95%% quota-warning 95 %u
  quota_warning2 = storage=80%% quota-warning 80 %u
  sieve = /var/empty/sieve.current
  sieve_before = /usr/local/etc/dovecot/sieve.before/
  sieve_dir = /var/empty/sieve
  sieve_global_dir = /usr/local/etc/dovecot/sieve.global/
  sieve_global_extensions = +editheader
}
service auth-worker {
  process_limit = 150
  user = dovenull
}
service auth {
  client_limit = 65000
}
ssl_cert = </etc/ssl/certs/xxx
ssl_cipher_list = ECDHE+AESGCM:DHE+AESGCM:ECDHE+AES256:DHE+AES256:ECDHE+AES:DHE+AES:!LOW:!MEDIUM:!aNULL:!eNULL:!3DES:!DES:!DSS:!EXP:!MD5:!PSK:!RC4:!SRP
ssl_client_ca_file = /etc/ssl/certs/xxx
ssl_dh = # hidden, use -P to show it
ssl_key = # hidden, use -P to show it
ssl_min_protocol = TLSv1
ssl_prefer_server_ciphers = yes
verbose_proctitle = yes
protocol imap {
  imap_client_workarounds = delay-newmail tb-extra-mailbox-sep
  mail_max_userip_connections = 45
  mail_plugins = mail_log notify quota fts fts_solr imap_quota
}
protocol pop3 {
  mail_max_userip_connections = 30
  pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
  pop3_uidl_format = UID%u-%v
}
protocol lmtp {
  mail_plugins = mail_log notify quota fts fts_solr sieve
}
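Since the reply mentions that the DB still contains a few accounts with uppercase letters in the localpart, it may help to list them to see which users could be affected. A minimal sketch, assuming the user names have already been exported from the SQL backend into a plain list; the function name and the sample addresses are hypothetical and not taken from the poster's setup.

```python
def mixed_case_accounts(usernames):
    """Return the user names that differ from their lowercase form."""
    return [name for name in usernames if name != name.lower()]

# Hypothetical export of user names from the auth database.
accounts = [
    "some.user@domain.net",
    "Some.User@domain.net",
    "other@domain.net",
]

affected = mixed_case_accounts(accounts)
print(affected)  # ['Some.User@domain.net']
```

Any account returned here is one whose stored name would not equal the lowercased name seen in the Solr log.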