Hi,

What's the recommended way to handle timeouts on large mailboxes, given
the hardwired request timeout of 60s in solr-connection.c?

    http_set.request_timeout_msecs = 60*1000;

/Peter
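PS: The only workaround I can see so far is carrying a local patch. A
rough sketch of what I mean, assuming the value is set where the HTTP
client settings are built; the environment variable and the helper are
my own invention, only the request_timeout_msecs line is from
solr-connection.c:

    #include <stdlib.h>

    /* Hypothetical: let an environment variable override the
     * hardwired 60s request timeout, for illustration only. */
    static unsigned int solr_request_timeout_msecs(void)
    {
        const char *value = getenv("FTS_SOLR_REQUEST_TIMEOUT_MSECS");
        char *end;
        unsigned long msecs;

        if (value != NULL) {
            msecs = strtoul(value, &end, 10);
            if (*end == '\0' && msecs > 0)
                return (unsigned int)msecs;
        }
        return 60*1000; /* the shipped default */
    }

    /* then, in solr_connection_init():
     * http_set.request_timeout_msecs = solr_request_timeout_msecs(); */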
On 4/4/2019 2:21 AM, Peter Mogensen via dovecot wrote:
> What's the recommended way to handle timeouts on large mailboxes, given
> the hardwired request timeout of 60s in solr-connection.c?
>
>     http_set.request_timeout_msecs = 60*1000;

I'm a denizen of the solr-user at lucene.apache.org mailing list.

For a typical Solr index, 60 seconds is an eternity. Most people aim
for query times of 100 milliseconds or less, and they often achieve
that goal. If you have an index where queries really are taking longer
than 60 seconds, you're most likely going to need to get better
hardware for Solr.

Memory is the resource that usually has the greatest impact on Solr
performance. Putting the index on SSD can help, but memory will help
more. Here's a wiki page that I wrote about that topic. This wiki is
going away next month, but for now you can still access it:

https://wiki.apache.org/solr/SolrPerformanceProblems

There's a section in that wiki page about asking for help on
performance issues. It describes how to create a particular process
listing for a screenshot. If you can get that screenshot and share it
using a file sharing site (Dropbox is usually a good choice), I may be
able to offer some insight.

Thanks,
Shawn
Hi Shawn,

On 04.04.19 at 16:12, Shawn Heisey via dovecot wrote:
> On 4/4/2019 2:21 AM, Peter Mogensen via dovecot wrote:
> Here's a wiki page that I wrote about that topic. This wiki is going
> away next month, but for now you can still access it:
>
> https://wiki.apache.org/solr/SolrPerformanceProblems

https://web.archive.org/web/20190404143817/https://wiki.apache.org/solr/SolrPerformanceProblems

That one will last longer :).

Best
Daniel
> I'm a denizen of the solr-user at lucene.apache.org mailing list.
> [...]
> Here's a wiki page that I wrote about that topic. This wiki is going
> away next month, but for now you can still access it:
>
> https://wiki.apache.org/solr/SolrPerformanceProblems

That's a great resource, Shawn.

I am about to put together a test case to provide a comprehensive FTS
setup around Dovecot, with the goal of exposing proximity keyword
searching, with email silos containing tens of terabytes (most of the
"bulk" is represented by attachments, each of which gets processed down
to plaintext, if possible). Figure thousands of users with decades of
email (80,000 to 750,000 emails per user).

My main background is in software engineering (C/C++/Python/Assembler),
but I have been forced into sysadmin tasks during many stretches of my
work. I vividly remember the tedium of dealing with Java and GC, tuning
it to avoid stalls, and its ravenous appetite for RAM. It looks like
those problems are still with us, many versions later.

For corporations with infinite budgets, throwing crazy money at the
problem is "fine" (>1TB RAM, all PCIe SSDs, etc.), but I am worried
that I will be shoved forcefully into a wall of having to spend a
fortune just to keep FTS performing reasonably well before I even get
to the 10,000 user mark.

I realise the only way to keep performance reasonable is to heavily
shard the index database, but I am concerned about how well that
process works in practice without a great deal of sysadmin
hand-holding. I would ideally prefer the decisions of how/where to
shard to be based on volume/heuristics rather than made manually. I
realise that a human will be necessary to add more hardware to the
pools, but what are my options for scaling the system by orders of
magnitude? There is a toy sketch of the manual baseline I want to
improve on in the PS below.

What is a general rule of thumb for RAM and SSD requirements, as a
fraction of indexed document hive size, to keep query performance at
200ms or less? How do people deal with the Java GC world-stoppages,
other than simply doubling or tripling every instance?

I am wondering how well alternatives to Solr work in these situations
(ElasticSearch, Xapian, and any others I may have missed).

Regards,
=M=
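PS: To make concrete what I mean by sharding across users, here is a
toy sketch of deterministic user-to-shard routing; the hostnames,
function names, and the choice of FNV-1a are all invented for
illustration, and this hand-configured approach is exactly the baseline
I would like a volume/heuristic-driven system to improve on:

    #include <stdint.h>
    #include <stdio.h>

    /* 32-bit FNV-1a hash of a NUL-terminated string. */
    static uint32_t fnv1a(const char *s)
    {
        uint32_t h = 2166136261u;
        for (; *s != '\0'; s++) {
            h ^= (uint8_t)*s;
            h *= 16777619u;
        }
        return h;
    }

    /* Pick a Solr shard URL for a user; same user always maps to
     * the same shard, so its mail stays in one index. */
    static const char *shard_url_for_user(const char *user,
                                          const char *const *shards,
                                          unsigned int n_shards)
    {
        return shards[fnv1a(user) % n_shards];
    }

    int main(void)
    {
        static const char *const shards[] = {
            "http://solr1:8983/solr/dovecot",
            "http://solr2:8983/solr/dovecot",
        };
        printf("%s\n", shard_url_for_user("peter", shards, 2));
        return 0;
    }

The obvious weakness, and the reason I am asking, is that adding a
shard changes the modulus and would remap most users, so I am hoping
the real answer involves something smarter than this.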