On Fri, Jul 27, 2018 at 10:41 AM, Bowie Bailey <Bowie_Bailey at buc.com>
wrote:
> > On a lark, what kind of file systems is the system using and how long had
> > it been up before you rebooted?
>
> The filesystems are all XFS. I don't know for sure how long it had been
> up previously, I'd guess at least 2 weeks. Current uptime is about 25
> hours and the system has already started getting into swap.
I've had multiple systems (and VMs) with XFS filesystems that had troubles
on the 693 series of kernels. Eventually the kernel xfs driver deadlocks
and blocks writes, which then pile up in memory waiting for the block to
clear. Eventually you run out of RAM and the OOM killer kicks in. The only
solutions I had at the time were to revert to booting a 514 series kernel or
to convert to EXT4, depending on the needs of the particular server.
Everything I've converted to EXT4 has been rock stable since, and the very
few I had to run a 514 kernel on have been stable, just not ideal. It may
be fixed on the newer 8xx series but I haven't dived into them on those
systems yet.
If it happens again, then look for processes in the D state and see if
logging is continuing or if it just cuts off (when the block started).
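
In case it helps, here is a rough sketch of one way to list processes in
the D (uninterruptible sleep) state by reading /proc/<pid>/stat directly.
The function names parse_stat and find_d_state are just ones I made up for
illustration, and this assumes a Linux /proc layout. Something like
"ps -eo pid,stat,comm" should show the same state letters.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx].strip())
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process exited (or was unreadable) mid-scan
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

Calling find_d_state() and printing the result every few seconds should
show whether the same processes stay stuck in D, which is what I would
expect if the filesystem is blocked rather than just busy.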