I have a CentOS 7 server that is running out of memory and I can't
figure out why.

Running "free -h" gives me this:

              total        used        free      shared  buff/cache   available
Mem:           3.4G        2.4G        123M        5.9M        928M        626M
Swap:          1.9G        294M        1.6G

The problem is that I can't find 2.4G of usage.  If I look at resident
memory usage using "top", the top 5 processes are using a total of
390M.  The next highest process is using 8M.  For simplicity, if I
assume the other 168 processes are all using 8M (which is WAY too
high), that still only gives a total of 1.7G.  The tmpfs filesystems
are only using 18M, so that shouldn't be an issue.

Yesterday, the available memory was down around 300M when I checked
it.  After checking some things and stopping all of the major
processes, available memory was still low.  I gave up and rebooted the
machine, which brought available memory back up to 2.8G with
everything running.

How can I track what is using the memory when the usage doesn't show
up in top?

--
Bowie
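Rather than estimating from the top 5 processes, the whole RSS column can be summed directly. A rough sketch (note that RSS counts shared pages once per process, so this sum tends to *overstate* real use, not understate it):

```shell
# Sum the RSS column (column 6 of "ps axu", in KiB) across all
# processes and report the total in MB.  NR>1 skips the header line.
ps axu | awk 'NR>1 {sum += $6} END {printf "total RSS: %.0f MB\n", sum/1024}'
```

If even this inflated total falls well short of "used" in free, the missing memory is likely outside process address space (kernel slabs, page tables, tmpfs/shmem).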
On Fri, Jul 27, 2018, 10:10 AM Bowie Bailey <Bowie_Bailey at buc.com> wrote:

> I have a CentOS 7 server that is running out of memory and I can't
> figure out why.
> <snip>
> The problem is that I can't find 2.4G of usage.  If I look at resident
> memory usage using "top", the top 5 processes are using a total of
> 390M.

On a lark, what kind of file systems is the system using and how long
had it been up before you rebooted?
On 7/27/2018 11:14 AM, Jon Pruente wrote:
> On Fri, Jul 27, 2018, 10:10 AM Bowie Bailey <Bowie_Bailey at buc.com> wrote:
>
>> I have a CentOS 7 server that is running out of memory and I can't
>> figure out why.
>> <snip>
>> The problem is that I can't find 2.4G of usage.  If I look at resident
>> memory usage using "top", the top 5 processes are using a total of
>> 390M.
>
> On a lark, what kind of file systems is the system using and how long
> had it been up before you rebooted?

The filesystems are all XFS.  I don't know for sure how long it had
been up previously; I'd guess at least 2 weeks.  Current uptime is
about 25 hours and the system has already started getting into swap.

--
Bowie
On Jul 27, 2018, at 9:10 AM, Bowie Bailey <Bowie_Bailey at BUC.com> wrote:
>
> I have a CentOS 7 server that is running out of memory

How do you know that?  Give a specific symptom.

> Running "free -h" gives me this:
>               total        used        free      shared  buff/cache   available
> Mem:           3.4G        2.4G        123M        5.9M        928M        626M

This is such a common misunderstanding that it has its own web site:

https://www.linuxatemyram.com/
On 07/27/2018 08:10 AM, Bowie Bailey wrote:
> The problem is that I can't find 2.4G of usage.

Are your results from "top" similar to:

  ps axu | sort -nr -k +6

If you don't see 2.4G of use from applications, maybe the kernel is
using a lot of memory.  Check /proc/slabinfo.  You can simplify its
content to bytes per object type and a total:

  grep -v ^# /proc/slabinfo | awk 'BEGIN {t=0;} {print $1 " " ($3 * $4); t=t+($3 * $4)} END {print "total " t/(1024 * 1024) " MB";}' | column -t
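For a quicker first pass than walking /proc/slabinfo (which usually needs root), the kernel's own accounting in /proc/meminfo already breaks out the categories that never show up in per-process RSS. A small sketch (field availability varies a bit by kernel version; VmallocUsed reports 0 on some newer kernels):

```shell
# Kernel-side and shared memory that "ps"/"top" RSS won't attribute
# to any process: slab caches, page tables, tmpfs/shmem, kernel stacks.
# Values are in KiB.  /proc/meminfo is world-readable, unlike slabinfo.
grep -E '^(Slab|SReclaimable|SUnreclaim|PageTables|Shmem|KernelStack|VmallocUsed):' /proc/meminfo
```

A large SUnreclaim or PageTables figure here would point at a kernel-side leak rather than a runaway userspace process.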
On 07/27/2018 08:50 AM, Warren Young wrote:
> This is such a common misunderstanding that it has its own web site:
> https://www.linuxatemyram.com/

The misunderstanding was mostly related to an older version of "free"
that included buffers/cache in the "used" column.  "used" in this case
does not include buffers/cache, so it should be possible to account
for the used memory by examining application and kernel memory use.
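To make that concrete: in the procps-ng version shipped with CentOS 7 (3.3.10), "used" is approximately total minus free minus buffers/cache (with reclaimable slab counted as cache). A rough recomputation from /proc/meminfo, assuming that formula:

```shell
# Approximate the "used" column of modern free(1) from /proc/meminfo:
# used ~= MemTotal - MemFree - Buffers - Cached - SReclaimable
# All /proc/meminfo values are in KiB; result printed in MB.
awk '/^MemTotal:/    {t=$2}
     /^MemFree:/     {f=$2}
     /^Buffers:/     {b=$2}
     /^Cached:/      {c=$2}
     /^SReclaimable:/{s=$2}
     END {printf "used ~= %.0f MB\n", (t-f-b-c-s)/1024}' /proc/meminfo
```

Since cache is already excluded, a large "used" figure really does have to be explained by process RSS plus kernel allocations (slab, page tables, vmalloc, shmem).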
On 7/27/2018 11:50 AM, Warren Young wrote:
> On Jul 27, 2018, at 9:10 AM, Bowie Bailey <Bowie_Bailey at BUC.com> wrote:
>> I have a CentOS 7 server that is running out of memory
>
> How do you know that?  Give a specific symptom.

This was brought to my attention because one program was killed by the
kernel to free memory and another program failed because it was unable
to allocate enough memory.

>> Running "free -h" gives me this:
>>               total        used        free      shared  buff/cache   available
>> Mem:           3.4G        2.4G        123M        5.9M        928M        626M
>
> This is such a common misunderstanding that it has its own web site:
>
> https://www.linuxatemyram.com/

Right, and that website says that you should look at the "available"
number in the results from "free", which is what I was referencing.
They say that a healthy system should have at least 20% of the memory
available.  Mine was down to 17% in what I posted in my email and it
was at about 8% when I rebooted yesterday.

--
Bowie
On 7/27/2018 12:13 PM, Gordon Messmer wrote:
> On 07/27/2018 08:10 AM, Bowie Bailey wrote:
>> The problem is that I can't find 2.4G of usage.
>
> Are your results from "top" similar to:
>
>   ps axu | sort -nr -k +6

That looks the same.

> If you don't see 2.4G of use from applications, maybe the kernel is
> using a lot of memory.  Check /proc/slabinfo.  You can simplify its
> content to bytes per object type and a total:
>
>   grep -v ^# /proc/slabinfo | awk 'BEGIN {t=0;} {print $1 " " ($3 * $4); t=t+($3 * $4)} END {print "total " t/(1024 * 1024) " MB";}' | column -t

The total number from that report is about 706M.  My available memory
has now jumped up from 640M to 1.5G after one of the processes (which
was reportedly using about 100M) finished.

I'll have to wait until the problem re-occurs and see what it looks
like then, but for now I used the numbers from "ps axu" to add up a
real total and then added the 706M to it and got within 300M of the
memory currently reported used by free.

What could account for a process actually using much more memory than
is reported by ps or top?

--
Bowie
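One way to investigate that last question: a process's true footprint can differ from its RSS because of shared pages, tmpfs/shmem mappings, and memory freed back to the kernel only when the process exits. Summing PSS (proportional set size) from /proc/PID/smaps divides shared pages fairly among the processes mapping them, so the system-wide sum is a more honest total than summed RSS. A sketch (run as root to see all processes; unreadable entries are skipped):

```shell
# Sum PSS across every process's smaps.  Pss lines are in KiB; the
# kernel splits each shared page's cost among the processes sharing it.
for f in /proc/[0-9]*/smaps; do
    [ -r "$f" ] && awk '/^Pss:/ {s += $2} END {if (s) print s}' "$f" 2>/dev/null
done | awk '{t += $1} END {printf "total PSS: %.0f MB\n", t/1024}'
```

If this total also falls far short of "used", the remainder is kernel memory (slab, page tables, vmalloc) or orphaned tmpfs/shmem segments, which belong to no process and vanish only on unmount, IPC cleanup, or reboot.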