I have some puzzling behaviour here - looks very much like I am running out of network resources of some kind, but I can't find out what, so am wondering if anyone has any ideas.

All machines are running FreeBSD 9.2-STABLE r265427 - which is from the start of May (probably around heartbleed time!)

We have 5 machines running webservers - apache24 serving cgi scripts, plus nginx being used to drive uwsgi with some django/python based code. These are load balanced by pound on a machine which faces the internet. This all works as expected, except that if I modify the cgi-scripts running inside Apache so they make some https calls to the nginx server on 127.0.0.1, then what we see is that pound stops being able to connect to Apache for a proportion of its calls - it returns 503's.

The effect on the calls which fail is as if the webserver is not listening any more. But this only applies to a fraction of the calls - most get through. If I disable the functionality which makes the internal call to 127.0.0.1 then the problem goes away.

It looks to me like I am running out of some network resource somehow, but the load is very, very low, and I can't see any obvious parameters hitting their limits. Nothing is logged out of the ordinary on the webservers; the only symptom is the load balancer not being able to connect.

Does anyone have any ideas where to look for a solution? It is puzzling the hell out of me!

-pete.
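One concrete thing to rule out first is ephemeral port exhaustion: every internal https call from the cgi scripts opens a fresh loopback connection, and each of those lingers in TIME_WAIT after it closes. The sketch below is mine, not from the original post - it only assumes python is on the webservers, which it is given the django code. It tallies TCP socket states from netstat output; thousands of TIME_WAIT entries against 127.0.0.1 while the 503's are happening would point at the internal calls eating the ephemeral port range.

    # Tallies TCP socket states, as a quick test of the
    # "running out of network resources" theory.
    import subprocess
    from collections import Counter

    def tcp_state_counts():
        # "netstat -an -p tcp" lists every TCP socket on FreeBSD; the
        # state (LISTEN, ESTABLISHED, TIME_WAIT, ...) is the last column.
        out = subprocess.check_output(["netstat", "-an", "-p", "tcp"])
        states = Counter()
        for line in out.decode("ascii", "replace").splitlines():
            fields = line.split()
            if fields and fields[0].startswith("tcp"):
                states[fields[-1]] += 1
        return states

    if __name__ == "__main__":
        for state, n in tcp_state_counts().most_common():
            print("%-12s %d" % (state, n))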
On Wed, 05 Nov 2014 19:23:31 +0100, Pete French <petefrench at ingresso.co.uk> wrote:

> This all works as expected, except that if I modify the cgi-scripts
> running inside Apache so they make some https calls to the nginx
> server on 127.0.0.1, then what we see is that pound stops being
> able to connect to Apache for a proportion of its calls - it
> returns 503's.

Do the https calls to nginx succeed? I can imagine that the https certificate is not valid on 127.0.0.1 and fetch/curl/wget asks for confirmation or something like that.

Ronald.
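That theory is easy to check directly. Below is a minimal sketch of mine - it assumes python 3.4+ on the box and nginx listening on 127.0.0.1:443, the port being a guess - which attempts a verified TLS handshake the way a strict client would, and prints the verification failure if there is one:

    # Checks whether the nginx certificate actually verifies for
    # 127.0.0.1, which is the suggested failure mode above.
    import socket
    import ssl

    ctx = ssl.create_default_context()   # verification + hostname check on

    try:
        with socket.create_connection(("127.0.0.1", 443), timeout=5) as sock:
            with ctx.wrap_socket(sock, server_hostname="127.0.0.1") as tls:
                print("handshake ok:", tls.version())
    except ssl.CertificateError as e:    # certificate not valid for this name
        print("hostname mismatch:", e)
    except ssl.SSLError as e:
        print("certificate problem:", e)
    except OSError as e:
        print("connect failed:", e)

Note that curl and fetch fail outright on a bad certificate rather than prompting, so a verification error here would show up as failed internal calls rather than hangs.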
On Wed, 05 Nov 2014 19:23:31 +0100, Pete French <petefrench at ingresso.co.uk> wrote:

> make some https calls to the nginx server on 127.0.0.1, then what
> we see is that pound stops being able to connect to Apache for a
> proportion of its calls - it returns 503's.

My guess from "https" + "proportion of calls" + "503": maybe you are running out of random data for the ssl connections, so nginx waits for more random data and times out, then Apache times out -> 503?

Michael
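One way to test the entropy guess - a sketch of mine, with the same python 3.4+ and port assumptions as above, and with verification turned off so only handshake timing is measured - is to time a run of handshakes to 127.0.0.1 and look for the occasional multi-second stall:

    # Times repeated TLS handshakes against the local nginx. If the
    # entropy theory is right, a fraction of handshakes should stall
    # for seconds rather than completing in milliseconds.
    import socket
    import ssl
    import time

    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # timing only - ignore the cert
    ctx.verify_mode = ssl.CERT_NONE

    for i in range(100):
        t0 = time.time()
        try:
            with socket.create_connection(("127.0.0.1", 443), timeout=10) as sock:
                with ctx.wrap_socket(sock) as tls:
                    pass                # handshake is done once wrapped
            print("%3d: %.3fs" % (i, time.time() - t0))
        except OSError as e:            # ssl.SSLError is a subclass of OSError
            print("%3d: failed after %.3fs (%s)" % (i, time.time() - t0, e))

For what it's worth, /dev/random on FreeBSD 9.x is Yarrow-based and does not block once seeded, so if the timings all come back in milliseconds that would argue against entropy starvation.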