I'm running a couple of brand new servers ... 32G of RAM, very little load on them right now, and this morning one locked up with that 'kern.maxswzone' error on the console ...

The server is running a reasonably current 7.2-STABLE:

FreeBSD pluto.hub.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Sun May 31 14:48:04 ADT

And top right now, with everything running, shows no swapping, 19G of Free memory, 9G of Inact memory ... no reason to do any serious amount of swapping.

last pid: 32159; load averages: 0.12, 0.21, 0.47 up 0+10:57:56 11:53:39
573 processes: 1 running, 571 sleeping, 1 zombie
CPU: 2.0% user, 0.0% nice, 1.2% system, 0.0% interrupt, 96.8% idle
Mem: 1331M Active, 9446M Inact, 659M Wired, 35M Cache, 399M Buf, 19G Free
Swap: 32G Total, 32G Free

In fact, my other server (same config) has been up 9 days (they were put online 9 days ago), and top shows it doing a little bit of swapping, but, again, huge amounts of Inact memory:

last pid: 26307; load averages: 0.36, 0.35, 0.36 up 9+17:03:48 11:57:54
680 processes: 2 running, 657 sleeping, 21 zombie
CPU: 0.7% user, 0.0% nice, 0.4% system, 0.0% interrupt, 98.9% idle
Mem: 2915M Active, 25G Inact, 778M Wired, 13M Cache, 399M Buf, 1771M Free
Swap: 32G Total, 1044K Used, 32G Free

So these servers right now are definitely not feeling any pain ...

And, based on experiences with another server, I have my /boot/loader.conf set to:

kern.maxswzone=67108864

So, the question is ... what am I missing? Is there some magical formula for calculating maxswzone that 7.2 is missing? Some nagios plug-in I should be using to monitor ... what?

Help?

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org MSN . scrappy@hub.org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664
On Wednesday 10 June 2009 11:04:48 am Marc G. Fournier wrote:
>
> I'm running a couple of brand new servers ... 32G of RAM, very little load
> on it right now, and this morning it locked up with that 'kern.maxswzone'
> error on the console ...
>
> The server is running a reasonably current 7.2-STABLE:
>
> FreeBSD pluto.hub.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Sun May 31
> 14:48:04 ADT
>
> And top right now, with everything running, shows no swapping, 19G of
> Free memory, 9G of Inact memory ... no reason to do any serious amount of
> swapping.
>
> last pid: 32159; load averages: 0.12, 0.21, 0.47 up 0+10:57:56 11:53:39
> 573 processes: 1 running, 571 sleeping, 1 zombie
> CPU: 2.0% user, 0.0% nice, 1.2% system, 0.0% interrupt, 96.8% idle
> Mem: 1331M Active, 9446M Inact, 659M Wired, 35M Cache, 399M Buf, 19G Free
> Swap: 32G Total, 32G Free
>
> In fact, my other server (same config), has been up 9 days (they were put
> online 9 days ago), and top shows it doing a little bit of swapping, but,
> again, huge amounts of Inact memory:
>
> last pid: 26307; load averages: 0.36, 0.35, 0.36 up 9+17:03:48
> 11:57:54
> 680 processes: 2 running, 657 sleeping, 21 zombie
> CPU: 0.7% user, 0.0% nice, 0.4% system, 0.0% interrupt, 98.9% idle
> Mem: 2915M Active, 25G Inact, 778M Wired, 13M Cache, 399M Buf, 1771M Free
> Swap: 32G Total, 1044K Used, 32G Free
>
> So these servers right now are definitely not feeling any pain ...
>
> And, based on experiences with another server, I have my /boot/loader.conf
> set to:
>
> kern.maxswzone=67108864
>
> So, the question is ... what am I missing? Is there some magical formula
> for calculating maxswzone that 7.2 is missing? Some nagios plug-in I
> should be using to monitor ... what?
>
> Help?

There are changes in 8 that might help some; perhaps you can ask kib@ to MFC them. They make the kernel kill a process when maxswzone is exhausted, similar to what happens when you run out of swap space.
If you break into the debugger and get a crashdump, you can 1) verify that you were swapping, and 2) calculate a better value for maxswzone. The problem with making maxswzone really big is that it uses up wired memory that can't be reused for anything else, so you don't want to blindly use the maximum amount for the swap you have.

--
John Baldwin
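The trade-off described above (swap metadata costs wired memory, so maxswzone should be sized rather than maximized) can be sketched as simple arithmetic. This is only a rough illustration, not the kernel's actual sizing logic: the constants below (a 4096-byte page, 32 pages tracked per swap metadata block, 288 bytes per block) are assumptions for 7.x on amd64 and should be checked against sys/vm/swap_pager.c on the system in question. The function names and the `density` parameter are hypothetical.

```python
import math

# ASSUMPTIONS (verify against your kernel sources; not taken from this thread):
PAGE_SIZE = 4096          # bytes per page on amd64
PAGES_PER_SWBLOCK = 32    # pages tracked by one swap metadata block
SWBLOCK_BYTES = 288       # size of one swap metadata block, in bytes


def swap_covered(maxswzone, swblock_bytes=SWBLOCK_BYTES,
                 pages_per_block=PAGES_PER_SWBLOCK, page_size=PAGE_SIZE):
    """Upper bound on swap (bytes) that maxswzone bytes of metadata can map,
    assuming every metadata block is fully populated."""
    blocks = maxswzone // swblock_bytes
    return blocks * pages_per_block * page_size


def maxswzone_for(swap_bytes, density=1.0, swblock_bytes=SWBLOCK_BYTES,
                  pages_per_block=PAGES_PER_SWBLOCK, page_size=PAGE_SIZE):
    """Metadata bytes needed to map swap_bytes of swap.

    density is a hypothetical knob: the fraction of each metadata block
    that is actually in use (1.0 = perfectly packed, lower = fragmented).
    """
    pages = math.ceil(swap_bytes / page_size)
    blocks = math.ceil(pages / (pages_per_block * density))
    return blocks * swblock_bytes


if __name__ == "__main__":
    current = 67108864  # the loader.conf value quoted in the original post
    print("swap covered (GiB):", swap_covered(current) / 2**30)
    print("bytes needed for 32 GiB, dense:", maxswzone_for(32 * 2**30))
    print("bytes needed for 32 GiB, half-full blocks:",
          maxswzone_for(32 * 2**30, density=0.5))
```

Under these assumed constants, sparsely used metadata blocks need proportionally more wired memory for the same amount of swap, which is one plausible reason a value that looks sufficient on paper can still run out in practice.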
On Wed, 10 Jun 2009, Marc G. Fournier wrote:

MGF>
MGF> I'm running a couple of brand new servers ... 32G of RAM, very little load
MGF> on it right now, and this morning it locked up with that 'kern.maxswzone'
MGF> error on the console ...
MGF>
MGF> The server is running a reasonably current 7.2-STABLE:
MGF>
MGF> FreeBSD pluto.hub.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Sun May 31 14:48:04
MGF> ADT
MGF>
MGF> And top right now, with everything running, shows no swapping, 19G of Free
MGF> memory, 9G of Inact memory ... no reason to do any serious amount of
MGF> swapping.
MGF>
MGF> last pid: 32159; load averages: 0.12, 0.21, 0.47 up 0+10:57:56
MGF> 11:53:39
MGF> 573 processes: 1 running, 571 sleeping, 1 zombie
MGF> CPU: 2.0% user, 0.0% nice, 1.2% system, 0.0% interrupt, 96.8% idle
MGF> Mem: 1331M Active, 9446M Inact, 659M Wired, 35M Cache, 399M Buf, 19G Free
MGF> Swap: 32G Total, 32G Free

As a workaround, if your machine is usually not going to swap, you can decrease swap space significantly and use the otherwise unused partition for crashdumps.

For RELENG_7/amd64 with 8G RAM and 16G of swap, I had to tune kern.maxswzone up to 192M to avoid such lockups in stress tests with tmpfs, which seems to be kinda overkill...

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------