On 4/13/2019 4:29 AM, John Fawcett via dovecot wrote:
> If this value was made configurable people could set it to what they
> want. However the underlying problem is likely on solr configuration.

The Jetty that is included in Solr has its idle timeout set to 50
seconds. But in practice, I have not seen this timeout trigger ... and
if the OP is seeing a 60 second timeout, then the 50 second idle timeout
in Jetty must not be occurring.

There may be a socket timeout configured on inter-server requests --
distributed queries or the load balancing that SolrCloud does. I can
never remember whether this is the case by default. I think it is.

> If there is an issue on initial indexing, where you are not really
> concerned about quick visibility but just getting things into the index
> efficiently, a better approach would be for dovecot plugin not to send
> any commit or softCommit (or waitSearcher either) and that should speed
> things up. You'd need to configure solr with a long autoSoftCommit
> maxTime and a reasonable autoCommit maxTime, which you could then
> reconfigure when the load was done.

Solr ships with autoCommit set to 15 seconds and openSearcher set to
false on the autoCommit. The autoSoftCommit setting is not enabled by
default, but depending on how the index was created, Solr might try to
set autoSoftCommit to 3 seconds ... which is WAY too short.

I will usually increase the autoCommit time to 60 seconds, just to
reduce the amount of work that Solr is doing. The autoSoftCommit time,
if it is used, should be set to a reasonably long value ... values
between two and five minutes would be good. Attempting to use a very
short autoSoftCommit time will usually lead to problems.

This thread says that dovecot is sending explicit commits. One thing
that might be happening to exceed 60 seconds is an extremely long
commit, which is usually caused by excessive cache autowarming, but
might be related to insufficient memory. The max heap setting on an
out-of-the-box Solr install (5.0 and later) is 512MB. That's VERY
small, and it doesn't take much index data before a much larger heap
is required.

Thanks,
Shawn

On 13/04/2019 17:16, Shawn Heisey via dovecot wrote:
> On 4/13/2019 4:29 AM, John Fawcett via dovecot wrote:
>> If this value was made configurable people could set it to what they
>> want. However the underlying problem is likely on solr configuration.
>
> The Jetty that is included in Solr has its idle timeout set to 50
> seconds. But in practice, I have not seen this timeout trigger ...
> and if the OP is seeing a 60 second timeout, then the 50 second idle
> timeout in Jetty must not be occurring.
>
> There may be a socket timeout configured on inter-server requests --
> distributed queries or the load balancing that SolrCloud does. I can
> never remember whether this is the case by default. I think it is.
>
>> If there is an issue on initial indexing, where you are not really
>> concerned about quick visibility but just getting things into the index
>> efficiently, a better approach would be for dovecot plugin not to send
>> any commit or softCommit (or waitSearcher either) and that should speed
>> things up. You'd need to configure solr with a long autoSoftCommit
>> maxTime and a reasonable autoCommit maxTime, which you could then
>> reconfigure when the load was done.
>
> Solr ships with autoCommit set to 15 seconds and openSearcher set to
> false on the autoCommit. The autoSoftCommit setting is not enabled by
> default, but depending on how the index was created, Solr might try to
> set autoSoftCommit to 3 seconds ... which is WAY too short.
>
> I will usually increase the autoCommit time to 60 seconds, just to
> reduce the amount of work that Solr is doing. The autoSoftCommit
> time, if it is used, should be set to a reasonably long value ...
> values between two and five minutes would be good. Attempting to use
> a very short autoSoftCommit time will usually lead to problems.
>
> This thread says that dovecot is sending explicit commits. One thing
> that might be happening to exceed 60 seconds is an extremely long
> commit, which is usually caused by excessive cache autowarming, but
> might be related to insufficient memory. The max heap setting on an
> out-of-the-box Solr install (5.0 and later) is 512MB. That's VERY
> small, and it doesn't take much index data before a much larger heap
> is required.
>
> Thanks,
> Shawn

I looked into the code (version 2.3.5.1): the fts-solr plugin is not
sending softCommit every 1000 emails. Emails from a single folder are
batched in up to a maximum of 1000 emails per request, but the
softCommit gets sent once per mailbox folder at the end of all the
requests for that folder.

I imagine that one of the reasons dovecot sends softCommits is that,
without autoindex active, and even if mailboxes are periodically indexed
from cron, the most recently received emails will be indexed at the
moment of the search. So while sending softCommit has the advantage of
including recent mails in searches, it means that softCommits are being
done upon user request. Frequency depends on user activity.

Going back to the original problem: it seems the first advice to Peter
is to look into the solr configuration, as others have said.

From dovecot's point of view I can see the following as potentially
useful features:

1) a configurable batch size would make it possible to tune the number
of emails per request and help stay under the 60 second hard-coded http
request timeout. A configurable http timeout would be less useful, since
this will potentially run into other timeouts on the solr side.

2) ability to turn off softCommits so as to have a more predictable
softCommit workload. In that case autoSoftCommit should be configured in
solr. In order to minimize the risk of recent emails not appearing in
search results, periodic indexing could be set up by cron.

I've attached a patch, any comments are welcome (especially about
getting settings from the backend context).

Example config

plugin {
  fts = solr
  fts_solr = url=https://user:password@solr.example.com:443/solr/dovecot/ batch_size=500 no_soft_commit
}

John
-------------- next part --------------
--- src/plugins/fts-solr/fts-solr-plugin.h.orig	2019-04-14 15:12:07.694289402 +0200
+++ src/plugins/fts-solr/fts-solr-plugin.h	2019-04-14 14:04:17.213939414 +0200
@@ -12,8 +12,10 @@
 
 struct fts_solr_settings {
 	const char *url, *default_ns_prefix;
+	unsigned int batch_size;
 	bool use_libfts;
 	bool debug;
+	bool no_soft_commit;
 };
 
 struct fts_solr_user {
--- src/plugins/fts-solr/fts-solr-plugin.c.orig	2019-04-14 11:41:03.591782439 +0200
+++ src/plugins/fts-solr/fts-solr-plugin.c	2019-04-14 14:37:46.059433864 +0200
@@ -10,6 +10,8 @@
 #include "fts-solr-plugin.h"
 
 
+#define DEFAULT_SOLR_BATCH_SIZE 1000
+
 const char *fts_solr_plugin_version = DOVECOT_ABI_VERSION;
 struct http_client *solr_http_client = NULL;
 
@@ -37,6 +39,10 @@
 		} else if (str_begins(*tmp, "default_ns=")) {
 			set->default_ns_prefix =
 				p_strdup(user->pool, *tmp + 11);
+		} else if (str_begins(*tmp, "batch_size=")) {
+			set->batch_size = atoi(*tmp + 11);
+		} else if (str_begins(*tmp, "no_soft_commit")) {
+			set->no_soft_commit = TRUE;
 		} else {
 			i_error("fts_solr: Invalid setting: %s", *tmp);
 			return -1;
@@ -46,6 +52,7 @@
 		i_error("fts_solr: url setting missing");
 		return -1;
 	}
+	if (set->batch_size <= 0) set->batch_size = DEFAULT_SOLR_BATCH_SIZE;
 	return 0;
 }
 
--- src/plugins/fts-solr/fts-backend-solr.c.orig	2019-04-14 13:27:56.694117159 +0200
+++ src/plugins/fts-solr/fts-backend-solr.c	2019-04-14 14:40:12.513938845 +0200
@@ -28,8 +28,6 @@
 #define SOLR_HEADER_LINE_MAX_TRUNC_SIZE 1024
 
 #define SOLR_QUERY_MAX_MAILBOX_COUNT 10
-/* How often to flush indexing request to Solr before beginning a new one. */
-#define SOLR_MAIL_FLUSH_INTERVAL 1000
 
 struct solr_fts_backend {
 	struct fts_backend backend;
@@ -392,6 +390,9 @@
 		(struct solr_fts_backend_update_context *)_ctx;
 	struct solr_fts_backend *backend =
 		(struct solr_fts_backend *)_ctx->backend;
+	struct fts_backend *_backend =
+		(struct solr_fts_backend *)_ctx->backend;
+	struct fts_solr_user *fuser = FTS_SOLR_USER_CONTEXT(_backend->ns->user);
 	struct solr_fts_field *field;
 	const char *str;
 	int ret = _ctx->failed ? -1 : 0;
@@ -404,10 +405,12 @@
 		   visible to the following search */
 		if (ctx->expunges)
 			fts_backend_solr_expunge_flush(ctx);
-		str = t_strdup_printf("<commit softCommit=\"true\" waitSearcher=\"%s\"/>",
+		if(!fuser->set.no_soft_commit) {
+			str = t_strdup_printf("<commit softCommit=\"true\" waitSearcher=\"%s\"/>",
 				      ctx->documents_added ? "true" : "false");
-		if (solr_connection_post(backend->solr_conn, str) < 0)
-			ret = -1;
+			if (solr_connection_post(backend->solr_conn, str) < 0)
+				ret = -1;
+		}
 	}
 
 	str_free(&ctx->cmd);
@@ -494,10 +496,13 @@
 {
 	struct solr_fts_backend *backend =
 		(struct solr_fts_backend *)ctx->ctx.backend;
-
-	if (ctx->mails_since_flush++ >= SOLR_MAIL_FLUSH_INTERVAL) {
+	struct fts_backend *_backend =
+		(struct solr_fts_backend *)ctx->ctx.backend;
+	struct fts_solr_user *fuser = FTS_SOLR_USER_CONTEXT(_backend->ns->user);
+	if (ctx->mails_since_flush++ >= fuser->set.batch_size) {
 		if (fts_backed_solr_build_flush(ctx) < 0)
 			ctx->ctx.failed = TRUE;
+		ctx->mails_since_flush++;
 	}
 	if (ctx->post == NULL) {
 		if (ctx->cmd == NULL)

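The behaviour discussed in the message above (up to batch_size mails per update request, then at most one soft commit per folder) can be illustrated with a small standalone C sketch. This is only a model under stated assumptions, not code from dovecot or from the attached patch: the names batch_settings, parse_setting, index_stats and index_folder are hypothetical, the setting is parsed with strtoul() instead of the patch's atoi() so that negative or malformed values are rejected, no_soft_commit is matched exactly rather than by prefix, and requests are only counted, not sent.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_BATCH_SIZE 1000u

/* Hypothetical holder for the two options proposed in the thread. */
struct batch_settings {
	unsigned int batch_size;
	bool no_soft_commit;
};

/* Counters standing in for the HTTP requests that would go to Solr. */
struct index_stats {
	unsigned int update_requests;
	unsigned int soft_commits;
};

/* Parse one "batch_size=N" or "no_soft_commit" token.
   Returns 0 on success, -1 if the token is unknown or the value invalid. */
static int parse_setting(struct batch_settings *set, const char *token)
{
	if (strncmp(token, "batch_size=", 11) == 0) {
		const char *value = token + 11;
		char *end;
		unsigned long n;

		/* strtoul() would accept a sign or leading blanks; insist on a digit. */
		if (value[0] < '0' || value[0] > '9')
			return -1;
		errno = 0;
		n = strtoul(value, &end, 10);
		if (*end != '\0' || errno != 0 || n == 0 || n > UINT_MAX)
			return -1;
		set->batch_size = (unsigned int)n;
		return 0;
	}
	if (strcmp(token, "no_soft_commit") == 0) {
		set->no_soft_commit = true;
		return 0;
	}
	return -1;
}

/* Model indexing one folder: start a new update request whenever the
   current one already holds batch_size mails, then optionally soft commit. */
static struct index_stats index_folder(const struct batch_settings *set,
				       unsigned int mail_count)
{
	struct index_stats stats = { 0, 0 };
	unsigned int mails_in_request = 0;
	unsigned int i;

	assert(set->batch_size > 0);
	for (i = 0; i < mail_count; i++) {
		if (mails_in_request == set->batch_size) {
			stats.update_requests++; /* flush the full request */
			mails_in_request = 0;
		}
		mails_in_request++;
	}
	if (mails_in_request > 0)
		stats.update_requests++; /* flush the final, partial request */
	if (!set->no_soft_commit)
		stats.soft_commits++; /* one soft commit per folder */
	return stats;
}
```

With batch_size=500, calling index_folder() for a folder of 1200 mails counts three update requests followed by one soft commit. With no_soft_commit set, the soft commit count stays at zero and visibility of new mails is left to Solr's autoSoftCommit.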
On 14 April 2019 16:59 John Fawcett via dovecot <dovecot@dovecot.org> wrote:
> I've attached a patch, any comments are welcome (especially about
> getting settings from the backend context).

Can you please open a pull request to https://github.com/dovecot/core ?

---
Aki Tuomi

On 4/14/2019 7:59 AM, John Fawcett via dovecot wrote:
> From dovecot's point of view I can see the following as potentially
> useful features:
>
> 1) a configurable batch size would make it possible to tune the number
> of emails per request and help stay under the 60 second hard-coded http
> request timeout. A configurable http timeout would be less useful, since
> this will potentially run into other timeouts on the solr side.

Even if several thousand emails are sent per batch, unless they're
incredibly large, I can't imagine indexing them taking more than a few
seconds. Does dovecot send attachments to Solr as well as the email
itself? Hopefully it doesn't. If it does, then you would want a smaller
batch size.

But if the heap size for Solr is not big enough, that can cause major
delays no matter what requests are being sent, because Java will be
spending most of its time doing garbage collection.

I'm also assuming that the Solr server is on the same LAN as dovecot and
that transferring the update data does not take a long time.

Thanks,
Shawn