On 7/27/2018 11:50 AM, Warren Young wrote:
> On Jul 27, 2018, at 9:10 AM, Bowie Bailey <Bowie_Bailey at BUC.com> wrote:
>> I have a CentOS 7 server that is running out of memory
> How do you know that? Give a specific symptom.

This was brought to my attention because one program was killed by the
kernel to free memory and another program failed because it was unable
to allocate enough memory.

>> Running "free -h" gives me this:
>>               total        used        free      shared  buff/cache   available
>> Mem:           3.4G        2.4G        123M        5.9M
> This is such a common misunderstanding that it has its own web site:
>
> https://www.linuxatemyram.com/

Right, and that website says that you should look at the "available"
number in the results from "free", which is what I was referencing. They
say that a healthy system should have at least 20% of the memory
available. Mine was down to 17% in what I posted in my email and it was
at about 8% when I rebooted yesterday.

--
Bowie
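[A minimal, untested sketch of turning that "available" column into the
percentage being discussed, assuming the procps-ng free shipped with
CentOS 7, where "available" is the seventh field of the Mem: line:]

    # Report MemAvailable as a percentage of total RAM
    # (field positions assume the CentOS 7 free output quoted above)
    free -b | awk '/^Mem:/ {printf "available: %.1f%% of total\n", $7/$2*100}'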
Bowie Bailey wrote:
> On 7/27/2018 11:50 AM, Warren Young wrote:
>> On Jul 27, 2018, at 9:10 AM, Bowie Bailey <Bowie_Bailey at BUC.com> wrote:
>>
>>> I have a CentOS 7 server that is running out of memory
>>>
>> How do you know that? Give a specific symptom.
>>
> This was brought to my attention because one program was killed by the
> kernel to free memory and another program failed because it was unable to
> allocate enough memory.
<snip>
Um, wait a minute - are you saying the oom-killer was invoked? My reaction
to that is to define the system, at that point, to be in an undefined
state, because you don't know what some threads that were killed are.

mark
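[A quick way to confirm from the logs whether the oom-killer ran and which
processes it reaped, assuming the stock kernel logging on CentOS 7; the
exact message wording varies by kernel version:]

    # Kernel ring buffer: look for OOM events and the processes killed
    dmesg | grep -iE 'out of memory|killed process'

    # Same information via journald, restricted to kernel messages
    journalctl -k | grep -i oom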
On 7/27/2018 12:58 PM, mark wrote:
> Bowie Bailey wrote:
>> On 7/27/2018 11:50 AM, Warren Young wrote:
>>> On Jul 27, 2018, at 9:10 AM, Bowie Bailey <Bowie_Bailey at BUC.com> wrote:
>>>
>>>> I have a CentOS 7 server that is running out of memory
>>>>
>>> How do you know that? Give a specific symptom.
>>>
>> This was brought to my attention because one program was killed by the
>> kernel to free memory and another program failed because it was unable to
>> allocate enough memory.
> <snip>
> Um, wait a minute - are you saying the oom-killer was invoked? My reaction
> to that is to define the system, at that point, to be in an undefined
> state, because you don't know what some threads that were killed are.

Probably true, but the system has been rebooted since then and the
oom-killer has not been activated again. When I first noticed the problem,
I also found that my swap partition had been deactivated, which is why the
oom-killer got involved in the first place instead of just having swap
usage slow the system to a crawl.

I think I have identified the program that is causing the problem (memory
usage went back to normal when the process ended), but I'm still not sure
how it ended up using 10x the memory that top reported for it.

--
Bowie
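[A short sketch of the follow-up checks this suggests, assuming swap is
defined in /etc/fstab and the stock CentOS 7 util-linux and procps-ng
tools are in use:]

    # Confirm whether any swap is currently active
    swapon -s          # or: cat /proc/swaps

    # Re-activate everything marked as swap in /etc/fstab
    swapon -a

    # Snapshot the biggest resident-memory consumers, to help catch the
    # offending process if it grows again
    ps aux --sort=-rss | head -15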