Don Wilde <dwilde1 at gmail.com> wrote:
>
> On 7/11/20 11:28 PM, Scott Bennett via freebsd-stable wrote:
> > I have read this entire thread to date with growing dismay, and I
> > thank Donald Wilde for reporting his ongoing troubles, although
> > they spoil my hopes that the kernel's memory management bugs that
> > first became apparent in 11.2-RELEASE (and -STABLE around the same
> > time) were not propagated into 12.x. A recent update to the
> > stable/12 source tree made it finally possible for me to build
> > 12.1-STABLE under 11.4-PRERELEASE, and I was just about to install
> > the upgrade when this thread appeared.
> Spoiler alert. Since I gave up on Synth, I haven't had a single swap
> issue. It does appear to be one particular port that drove it nuts
> (apparently, one of the 'Google performance' bits, with a
> mismatched-brackets problem). I have rebuilt the machine several times,
> but that's more for my sense of tidiness than anything.
>
> I've got a little Crystal script that walks the installed packages and
> ports and updates them with system() calls.
> The machine is very slow, but it's not swapping at all.
That's good. I use portmaster, but not often at present because a
"portmaster -a" run can only be done two or three times per boot
before real memory is locked down to the extent that the system is no
longer functional (i.e., even a scrub of ZFS pools comes to a halt in
mid-scrub due to lack of a sufficient supply of free page frames).
The build procedures of certain ports consistently get killed by the
OOM killer, along with much collateral damage. I've noticed that
lang/golang and lang/rust are prime examples now, although both used
to build without problems.
>
> It is quite usable now with 12-STABLE.
I don't see any good reason to go through the hassle and lost time of
an upgrade across a major release boundary if I still won't have a
production OS afterward. I'm already dealing with a graphics stack
rendered unsafe to use by the ongoing churn in X11 code. (See PR
#247441, kindly filed for me by Pau Amma.)
> >
> > On Fri, 26 Jun 2020 03:55:04 -0700 : Donald Wilde
> > <dwilde1 at gmail.com> wrote:
> >
> >> On 6/26/20, Peter Jeremy <peter at rulingia.com> wrote:
> >>>
> [snip]
> >>> I strongly suggest you don't have more than one swap device on
> >>> spinning rust - the VM system will stripe I/O across the
> >>> available devices and that will give particularly poor results
> >>> when it has to seek between the partitions.
> > True. The only reason I can think of to use more than one
> > swapping/paging area on the same device for the same OS instance
> > is for emergencies or highly unusual, temporary situations in
> > which more space is needed until those situations conclude. And
> > even in such situations, if the space can be found on another
> > device, it should be placed there. Interleaving of swap space
> > across multiple devices is intended as a performance enhancement
> > akin to striping (a.k.a. RAID0), although the virtual memory isn't
> > necessarily always actually striped across those devices. Adding
> > a paging area on the same device as an existing one is an
> > abhorrent situation, as Peter Jeremy noted, and it should be
> > eliminated via swapoff(8) as soon as the extraordinary situation
> > has passed. N.B. the GENERIC kernel sets a limit of four swap
> > devices, although it can be rebuilt with a different limit.
> That's good data, Scott, thanks! The only reason I got into that
> situation of trying to add another swap device was that it was
> crashing with out-of-swap messages.
I don't recall you posting those messages, but it sounds like exactly
the *temporary* situation in which adding an inappropriately placed
paging area can be used long enough to get you out of a bind without a
reboot, even though performance will probably suffer until you have
removed it again. Poor performance is usually preferable to no
performance if it is only temporary.
One cautionary note in such situations, though, applies to remote
paging areas. Sparse files allocated on the remote system should not
be used as paging areas. For example, I discovered the hard way
(i.e., the problem was not documented) that SunOS would crash if a
sparse file via NFS were added as a paging area and the SunOS system
tried to write a page out to an unallocated region of the file, which
was essentially all of the file at first.
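For the archives, the temporary-swap escape hatch described above can
be done on FreeBSD roughly as follows; the path, size, and md unit
number here are illustrative only, not a recommendation:

```shell
# Emergency-only sketch: add a fully allocated (not sparse!) file as
# a temporary paging area, then remove it once the crunch has passed.
dd if=/dev/zero of=/usr/swap0 bs=1m count=4096   # 4 GB, fully written
chmod 0600 /usr/swap0
mdconfig -a -t vnode -f /usr/swap0 -u 1          # attach as /dev/md1
swapon /dev/md1                                  # start paging to it
# ...after the extraordinary situation has passed...
swapoff /dev/md1                                 # drain and disable it
mdconfig -d -u 1
rm /usr/swap0
```

Note that swapoff(8) has to page everything on that device back into
RAM before it returns, so run it while memory pressure is low.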
> >> My intent is to make this machine function -- getting the bear
> >> dancing. How deftly she dances is less important than that she
> >> dances at all. My for-real boxen will have real HP and real cores
> >> and RAM.
> >>
> >>> Also, you can't actually use 64GB swap with 4GB RAM. If you
> >>> look back through your boot messages, I expect you'll find
> >>> messages like:
> >>> warning: total configured swap (524288 pages) exceeds maximum
> >>> recommended amount (498848 pages).
> >>> warning: increase kern.maxswzone or reduce amount of swap.
> > Also true. Unfortunately, no guidance whatsoever is provided to
> > advise system administrators who need more space as to how to
> > increase the relevant table sizes and limits. However, that is a
> > documentation bug, not a code bug.
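To partially fill that documentation gap: kern.maxswzone is a
boot-time tunable, so (if I have this right) it belongs in
/boot/loader.conf rather than /etc/sysctl.conf; the value below is
purely illustrative:

```shell
# kern.maxswzone caps the kernel memory reserved for swap metadata
# (the swzone); raising it raises the usable swap ceiling.  It is set
# at boot via loader.conf and cannot be changed on a running system.
echo 'kern.maxswzone="36864000"' >> /boot/loader.conf   # example size
# After the next boot, inspect the resulting limits:
sysctl kern.maxswzone vm.swap_maxpages vm.swzone
```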
> I've got both my kern.max* and CCACHE set up mostly correctly.
> Everything builds and runs well, although I've found that it's
> helpful to only use -j3 while building, not -j4 which would be
> appropriate for my HAMMER i3. I'd much rather have the bear
> *dancing* than running into walls. :D
I have encountered many ports where MAKE_JOBS_UNSAFE should have been
set, but hadn't been. If you have installed ports-mgmt/portconf, you
can set this on a per-port basis as you encounter such ports. There
are others that fail to build with MAKE_JOBS_NUMBER >= 4, but will
build just fine with MAKE_JOBS_NUMBER=3 or 2. However, such failures
to build are usually timing problems where one process tries to put a
file into a directory that doesn't exist yet or to read a file that
hasn't yet been created. These are not situations involving the OOM
killer.
If you'd like the lines from my /usr/local/etc/ports.conf file for
those I've encountered to date, just email me privately for them.
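For anyone unfamiliar with ports-mgmt/portconf, entries in
/usr/local/etc/ports.conf look something like the following; the port
origins and values here are made-up examples, not the lines from my
file:

```shell
# Hypothetical portconf entries, one port origin per line.
# Format: <category/port>: <VAR>=<value>[|<VAR>=<value>...]
devel/exampleport: MAKE_JOBS_UNSAFE=yes
lang/anotherport: MAKE_JOBS_NUMBER=3
```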
> >> Yes, as I posted, those were part of the failure stream from the
> >> synth program. When I had kern.maxswzone increased, it got
> >> through boot without complaining.
> >>
> >>> or maybe:
> >>> WARNING: reducing swap size to maximum of xxxxMB per unit
> >> The warnings were there, in the as-it-failed complaints.
> >>
> >>> The absolute limit on swap space is vm.swap_maxpages pages but
> >>> the realistic limit is about half that. By default the
> >>> realistic limit is about 4×RAM (on 64-bit architectures), but
> >>> this can be adjusted via kern.maxswzone (which defines the
> >>> #bytes of RAM to allocate to swzone structures - the actual
> >>> space allocated is vm.swzone).
> >>>
> >>> As a further piece of arcana, vm.pageout_oom_seq is a count that
> >>> controls the number of passes before the pageout daemon gives up
> >>> and starts killing processes when it can't free up enough RAM.
> >>> "out of swap space" messages
Yeah, those messages are half truth and half lie. The true part is
that the processes mentioned have indeed been killed. The lie is that
the system is out of swap space. (I have seen these messages issued
with as little as 217 MB in use out of 24 GB available on my system.)
The kernel might not always provide all relevant information in error
messages, but it should *never* LIE to us.
> >>> generally mean that this number is too low, rather than there
> >>> being a shortage of swap - particularly if your swap device is
> >>> rather slow.
> >>>
> >> Thanks, Peter!
> > A second round of thanks to Peter Jeremy for pointing out this
> > sysctl variable (vm.pageout_oom_seq), although thus far I have yet
> > to see that it is actually effective in working around the memory
> > management bugs. I have added the following lines to
> > /etc/sysctl.conf.
> >
> > # Because FreeBSD 11.{2,3,4} tie up page frames unnecessarily,
> > # set value high
> > #vm.pageout_wakeup_thresh=14124 # Default value
> > vm.pageout_wakeup_thresh=112640 # 410 MB
>
> [snip]
>
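For completeness, vm.pageout_oom_seq itself can be raised the same
way; its default is 12, and the value below only illustrates the knob,
it is not a recommendation:

```shell
# Give the pageout daemon more passes before the OOM killer starts
# shooting processes (default vm.pageout_oom_seq is 12).
sysctl vm.pageout_oom_seq=1024                       # immediate
echo 'vm.pageout_oom_seq=1024' >> /etc/sysctl.conf   # persist at boot
```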
> I do totally agree that these are crucial issues for both operation and
> documentation, although my issues stemmed from bad _userland_ stack
> control.
Yes, this is a frequent problem I've observed in the attitudes of
programmers who never experienced working with a real-memory-only OS.
They often lack any awareness of wasteful memory usage, ordering of
array accesses, locality-of-reference issues, etc., resulting in truly
ridiculous amounts of bloat and lost performance, not to mention
failures to perform at all such as you encountered. In their minds,
virtual memory frees them from all concerns about these issues, so
their schoolteachers, now brought up the same way, don't even teach
them about such things and perhaps still don't know about them
themselves.
Another problem, especially with programmers whose memories have not
yet accumulated many painful experiences, is the attraction toward
newer, more exciting features, accompanied by a disinterest in
tracking down and fixing existing bugs, even fairly critical bugs.
This problem, if left unchecked by management, can lead to terrible
predicaments like the one FreeBSD is in now, namely, having no
production releases being supported. DragonflyBSD, NetBSD, and
OpenBSD do not, AFAIK, suffer from this predicament at present. They
are behind to varying degrees in terms of newer, more exciting
features, but at least they appear to work. For example, sdf.org has
well over 70,000 users and runs quite a few servers to do so.
It runs

NetBSD miku 8.1_STABLE NetBSD 8.1_STABLE (GENERIC) #0: Wed Sep 11
03:47:45 UTC 2019
root at ol:/sdf/sys/NetBSD-8/sys/arch/amd64/compile/GENERIC amd64

at present. (miku.sdf.org is one of the servers.) Its uptime is
currently 306 days.
They run several VMs of FreeBSD, OpenBSD, LINUX, and possibly others
on some of the servers. ZFS appeared in NetBSD 9.0. I don't know the
sysadmin's reasons for not upgrading to it so far, but I suspect they
have to do with the number of systems to upgrade, the fact that it is
a .0 release, and that root on ZFS and ZFS boot environments are not
yet supported, as used to be the case with FreeBSD. I'm not ready to
switch to NetBSD quite yet and would not enjoy doing so, but it has
been a steadily improving alternative to FreeBSD of late, and if
FreeBSD does not release a production system in the meantime, NetBSD
may become a better choice for many of us who want to run a production
OS. It also offers an alternative to Micro$lop for the so-called
"Internet of Things", which no other FOSS OS does, AFAIK, although I
don't know enough about LINUX to be sure.
>
> Those who live on -CURRENT are used to OOPS, but the rest of us get paid
> not to have them.
I've been using -STABLE for the last several major releases, but
because of the vast numbers of conflicts and failures buried
throughout the ports tree and the horrendous amount of time it takes
to rebuild most of my installed ports, I am considering surrendering
to using -RELEASE and quarterly packages, in spite of the loss of
features that doing so entails. That would still not deal with the
dependency conflicts or the installation of identically named files
by different ports, but it would reduce the time spent on building
ports that fail to install.
>
> I am happy with what the Core Team gives us, AND of course we want
> ['more','better','faster','STABLE']. :D
>
As Mark Linimon pointed out, the Core Team only does that indirectly.
However, it is the Core Team's job to give firm direction or
redirection to those who do the designing and coding: to avoid
regressions, to avoid ignoring the introduction of bugs (especially
those that render a system unfit for production use), to enhance
testing, and so on.
Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet: bennett at sdf.org *xor* bennett at freeshell.org *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good *
* objection to the introduction of that bane of all free governments *
* -- a standing army." *
* -- Gov. John Hancock, New York Journal, 28 January 1790 *
**********************************************************************