thr3ads.net - freebsd stable - kernel killing processes when out of swap [Apr 2005]

Steven Hartland

2005-Apr-11 09:41 UTC

kernel killing processes when out of swap

Just had a problem with a box where it looks like it ran out of swap due
to a problem process. That in itself is not a problem. The problem was
that the kernel, on detecting this, seems to start killing off seemingly
random processes, the first one being sshd, hence making the machine
inaccessible.

So the question is: does the kernel kill random processes when out of
swap, or does it kill any process that requires more memory when out
of swap? Which leads to the question: would it not be more sensible to
kill off the largest process first, as it's more than likely that it is
responsible for the problem?

[quote]
Apr 10 20:09:25 appledore kernel: pid 414 (sshd), uid 0, was killed: out of swap space
[/quote]

    Steve


Vivek Khera

2005-Apr-12 06:53 UTC

kernel killing processes when out of swap

On Apr 11, 2005, at 12:01 PM, Steven Hartland wrote:
> of swap? Which leads to the question would it not be more sensible to
> kill off the largest process first as its more than likely that it is 
> responsible
> for the problem?
>
so when this largest process is your production database server for 
your e-commerce site, what will you change your recommendation to be?

basically, there is no "right" choice of process to kill.  a machine 
that is out of resources is just a bad situation, and the right thing 
is to try to avoid getting there with careful monitoring and planning.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
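
As a rough illustration of the monitoring suggested above, the sketch below computes swap usage from text in the usual `swapinfo -k` column layout (Device, 1K-blocks, Used, Avail, Capacity). The sample text, the device name, and the 70% warning threshold are made up for illustration; they are not taken from the machine discussed in this thread.

```python
def swap_usage_percent(swapinfo_text):
    """Return the percentage of swap in use, summed over all device rows."""
    total_kib = 0
    used_kib = 0
    for line in swapinfo_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the "Total" summary row
        # that appears when more than one swap device is configured.
        if len(fields) < 3 or fields[0] in ("Device", "Total"):
            continue
        total_kib += int(fields[1])
        used_kib += int(fields[2])
    if total_kib == 0:
        return 0.0
    return 100.0 * used_kib / total_kib


# Hypothetical sample input, assumed to match `swapinfo -k` output.
SAMPLE = """\
Device          1K-blocks     Used    Avail Capacity
/dev/ad0s1b       1048576   786432   262144    75%
"""

WARN_AT = 70.0  # hypothetical threshold, tune to the workload

if swap_usage_percent(SAMPLE) >= WARN_AT:
    print("warning: swap usage is above the threshold")
```

Run periodically (from cron, say), a check along these lines gives some warning before the kernel has to start killing processes.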

Nick Barnes

2005-Apr-12 07:06 UTC

kernel killing processes when out of swap

At 2005-04-12 13:52:59+0000, Vivek Khera writes:
> > of swap? Which leads to the question would it not be more sensible to
> > kill off the largest process first as its more than likely that it is 
> > responsible
> > for the problem?
> >
> 
> so when this largest process is your production database server for 
> your e-commerce site, what will you change your recommendation to be?
> 
> basically, there is no "right" choice of process to kill.  a machine
> that is out of resources is just a bad situation, and the right thing 
> is to try to avoid getting there with careful monitoring and planning.

The right choice is for mmap() to return ENOMEM, and then for malloc()
to return NULL, but almost no operating systems make this choice any
more.

Nick B
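
To make the "fail at allocation time" model concrete, here is a minimal sketch of a caller that treats ENOMEM from an anonymous mmap as an ordinary, recoverable error. The helper name `try_reserve` is hypothetical. On a system that overcommits, the refusal branch will rarely be taken, which is exactly the complaint above.

```python
import errno
import mmap


def try_reserve(nbytes):
    """Return an anonymous mapping of nbytes, or None if the kernel refuses."""
    try:
        return mmap.mmap(-1, nbytes)  # -1 requests anonymous memory
    except OSError as exc:
        if exc.errno == errno.ENOMEM:
            return None  # the request was refused up front
        raise  # any other error is unexpected here


region = try_reserve(1 << 20)  # ask for 1 MiB
if region is None:
    print("allocation refused; shed load or report the error")
else:
    region[0:5] = b"hello"  # the memory is usable
    region.close()
```

With overcommit, a successful return says nothing about whether the pages can be backed later, so a check like this only protects the caller on systems that account for memory when it is requested.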

Nick Barnes

2005-Apr-12 08:09 UTC

kernel killing processes when out of swap

At 2005-04-12 14:26:40+0000, Marc Olzheim writes:
> On Tue, Apr 12, 2005 at 03:06:41PM +0100, Nick Barnes wrote:
> > The right choice is for mmap() to return ENOMEM, and then for malloc()
> > to return NULL, but almost no operating systems make this choice any
> > more.
> 
> No, the problem occurs only when previously allocated / mmap()d blocks
> are actually used (written) and when the total of virtual memory has
> been overcommitted: Physical pages are not allocated to processes at
> malloc() time, but at time of first usage (Copy On Write).
Yes, implicit in my statement is that the OS shouldn't overcommit.  I
remember when overcommit was new (maybe 1990), and some Unix (Irix,
perhaps, or AIX?) made it switchable.  There was a bit of a flurry in
the OS community, as some people (myself included) felt that the OS
shouldn't make promises it couldn't fulfill, and that this "kill a
random process" behaviour was more of a bug than a solution.  Consider
a parallel design which allows (say) file descriptors to be
overcommitted.  You can open a billion files, but if you touch one of
them, that consumes a finite kernel resource, and if the kernel has
run out then a randomly chosen process gets killed.  Great.
> many programs have been programmed in a way that assumes this
> behaviour, for instance by sparsely using large allocations instead
> of adding the possible extra bookkeeping to allow for smaller
> allocations.
This is the well-known problem with my fantasy world in which the OS
doesn't overcommit any resources.  All those programs are broken, but
it's too costly to fix them.  If overcommit had been resisted more
effectively in the first place, those programs would have been written
properly.

My recollection, quite possibly faulty, is that FreeBSD came quite
late to the overcommit binge party.

Nick B
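
The lazy allocation described in the quoted text can be observed from user space. The sketch below maps a large anonymous region, writes to only a few pages, and compares the peak resident set size before and after. It assumes `ru_maxrss` is reported in kilobytes, as on Linux and FreeBSD; the 64 MiB size and the four touched pages are arbitrary choices for illustration.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE
MAPPED_BYTES = 64 * 1024 * 1024  # arbitrary: 64 MiB of address space


def peak_rss_kib():
    """Peak resident set size of this process (assumed to be in KiB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def touch_pages(region, count):
    """Write one byte into each of the first `count` pages."""
    for i in range(count):
        region[i * PAGE] = 1


before = peak_rss_kib()
region = mmap.mmap(-1, MAPPED_BYTES)  # reserves address space only
after_map = peak_rss_kib()
touch_pages(region, 4)                # physical pages are assigned here
after_touch = peak_rss_kib()
region.close()

growth_kib = after_touch - before
print("mapped KiB:", MAPPED_BYTES // 1024)
print("peak RSS growth KiB:", growth_kib)
```

The growth in resident memory should be far smaller than the mapped size, which is the sparse-usage pattern the quoted text says many programs depend on.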

Nick Barnes

2005-Apr-12 14:34 UTC

kernel killing processes when out of swap

At 2005-04-12 18:17:32+0000, Matthias Buelow writes:
> This stuff has been discussed in the past.
Indeed. For a couple of examples from the days before BSD systems got
overcommit, see these threads from 1990 and 1991:

<http://groups-beta.google.com/group/comp.unix.aix/browse_frm/thread/91541dbf6b658465/4c590978f1001507?q=overcommit&rnum=14#4c590978f1001507>

<http://groups-beta.google.com/group/comp.unix.aix/browse_frm/thread/38c9bb9996d30eb1/e8c30f78c44a3f62?q=overcommit&rnum=12#e8c30f78c44a3f62>

Nick B
