Has anyone seen this issue? We're running containers under CentOS 7.2 and some
of these containers are reporting incorrect memory allocation in
/proc/meminfo. The output below comes from a system with 32 GB of memory and
84 GB of swap. The values reported are completely wrong.

# cat /proc/meminfo
MemTotal:       9007199254740991 kB
MemFree:        9007199224543267 kB
MemAvailable:   12985680 kB
Buffers:        0 kB
Cached:         119744 kB
SwapCached:     10804 kB
Active:         110228 kB
Inactive:       111716 kB
Active(anon):   53840 kB
Inactive(anon): 57568 kB
Active(file):   56388 kB
Inactive(file): 54148 kB
Unevictable:    0 kB
Mlocked:        15347728 kB
SwapTotal:      0 kB
SwapFree:       18446744073709524600 kB
Dirty:          20304 kB
Writeback:      99596 kB
AnonPages:      18963368 kB
Mapped:         231472 kB
Shmem:          51852 kB
Slab:           1891324 kB
SReclaimable:   1805244 kB
SUnreclaim:     86080 kB
KernelStack:    60656 kB
PageTables:     81948 kB
NFS_Unstable:   0 kB
Bounce:         0 kB
WritebackTmp:   0 kB
CommitLimit:    104487760 kB
Committed_AS:   31507444 kB
VmallocTotal:   34359738367 kB
VmallocUsed:    354796 kB
VmallocChunk:   34359380456 kB
AnonHugePages:  15630336 kB
HugePages_Total:   0
HugePages_Free:    0
HugePages_Rsvd:    0
HugePages_Surp:    0
Hugepagesize:   2048 kB
DirectMap4k:    81684 kB
DirectMap2M:    3031040 kB
DirectMap1G:    32505856 kB
On 03/23/2016 12:10 PM, Peter Steele wrote:
> Has anyone seen this issue? We're running containers under CentOS 7.2 and
> some of these containers are reporting incorrect memory allocation in
> /proc/meminfo. The output below comes from a system with 32 GB of memory and
> 84 GB of swap. The values reported are completely wrong.
>

There was a meminfo bug here:

https://bugzilla.redhat.com/show_bug.cgi?id=1300781

The initial report is fixed in git, however the reporter also mentioned the
issue you are seeing. I suspect something is going wacky with the memory
values we are getting from host cgroups after some period of time. If you can
reproduce with Fedora (or RHEL), try filing a bug there.

- Cole

> [quoted /proc/meminfo output snipped]
On 03/23/2016 09:19 AM, Cole Robinson wrote:
> On 03/23/2016 12:10 PM, Peter Steele wrote:
>> Has anyone seen this issue? We're running containers under CentOS 7.2 and
>> some of these containers are reporting incorrect memory allocation in
>> /proc/meminfo. The output below comes from a system with 32 GB of memory
>> and 84 GB of swap. The values reported are completely wrong.
>>
> There was a meminfo bug here:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1300781
>
> The initial report is fixed in git, however the reporter also mentioned the
> issue you are seeing. I suspect something is going wacky with the memory
> values we are getting from host cgroups after some period of time. If you
> can reproduce with Fedora (or RHEL), try filing a bug there.
>
> - Cole

It's interesting that the value I see on my containers (9007199254740991) is
the exact same value reported in this Red Hat bug. Clearly that is not a
coincidence. We did not see this problem in 7.1, so apparently it is something
introduced in 7.2. For the immediate term it looks like we'll have to roll
back to 7.1.

I'll look into getting it reproduced in Fedora or RHEL.

Peter
Hi all,

> Has anyone seen this issue? We're running containers under CentOS 7.2
> and some of these containers are reporting incorrect memory allocation
> in /proc/meminfo. The output below comes from a system with 32 GB of
> memory and 84 GB of swap. The values reported are completely wrong.

Yes, it occurs from time to time on our installations: CentOS 7.2 + libvirt
1.2.18, and probably on 1.3.2 as well.

We have a workaround that fixes it without rebooting the LXC container.

1) Check that the memory cgroup for the container exists on the hardware node:

[root@node]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.infra.scope/memory.limit_in_bytes
17179869184
[root@node]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.infra.scope/memory.memsw.limit_in_bytes
18203869184

In our case the limit exists and is set to 16 GB for memory and 16+1 GB for
memory + swap. The container name here is puppet.infra; substitute your own
container name.

2) If it exists, simply attach the libvirt_lxc PID back to the cgroup:

node# yum install libcgroup-tools
node# cgclassify -g memory:machine.slice/machine-lxc\\x2dLXC_CONTAINER_NAME.scope PID

where the PID in my case is found as:

[root@node]# ps ax | grep libvirt_lxc | grep -v grep | grep puppet
22254 ?  Sl  296:25 /usr/libexec/libvirt_lxc --name puppet.infra --console 24 --security=none --handshake 42 --veth macvlan0 --veth macvlan1

After running cgclassify, /proc/meminfo inside the container shows normal
values again.

In some cases, with the combination of an old libvirt 1.2.xx and an old
systemd, under certain conditions (after systemctl restart libvirtd or a
daemon-reload) we get the same error. In that case we first need to
create/restore the cgroup on the node, and only then cgclassify the process
into it. For example:

Make the directory:
# mkdir "/sys/fs/cgroup/memory/machine.slice/machine-lxc\x2dpuppet.scope"

Set the limits:
[root@]# echo 8589934592 > /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.scope/memory.limit_in_bytes
[root@]# echo 9663676416 > /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.scope/memory.memsw.limit_in_bytes

Classify:
# cgclassify -g memory:machine.slice/machine-lxc\\x2dpuppet.scope PID_LIBVIRT_LXC

p.s. Use libvirt 1.3.2; it's more stable and never shows negative values for
memory and swap inside the container.

b.r.
Maxim Kozin
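For convenience, the manual steps above can be wrapped in a small script. The
sketch below only automates the same workaround: it assumes the cgroup-v1
layout /sys/fs/cgroup/memory/machine.slice/machine-lxc\x2d<name>.scope shown
in Maxim's examples (the scope naming varies between libvirt and systemd
versions), and it re-adds the PID by writing to the tasks file, the "simple
analog" of cgclassify mentioned later in the thread. The script and its
matching logic are illustrative, not an official tool; verify the paths on
your own host before relying on it.

#!/bin/bash
# Sketch: re-attach a container's libvirt_lxc monitor process to its memory
# cgroup if it has fallen out of it.  Assumes the machine.slice/
# machine-lxc\x2d<name>.scope layout shown in the workaround above.
set -eu

name=${1:?usage: $0 container-name}   # name as shown by 'virsh -c lxc:/// list'

# PID of the libvirt_lxc helper process for this container
pid=$(pgrep -f "libvirt_lxc --name ${name}" | head -n1)
[ -n "$pid" ] || { echo "no libvirt_lxc process found for ${name}" >&2; exit 1; }

# Find the container's memory cgroup.  The scope directory name contains
# systemd escapes (a literal "\x2d" for each '-'), so undo them before comparing.
tasks=""
for d in /sys/fs/cgroup/memory/machine.slice/machine-lxc*.scope; do
    plain=$(basename "$d" | sed 's/\\x2d/-/g')
    case "$plain" in
        *"$name"*) tasks="$d/tasks"; break ;;
    esac
done
[ -n "$tasks" ] || { echo "no memory cgroup found for ${name}" >&2; exit 1; }

# Re-add the PID only if it is missing from the cgroup.
if grep -qx "$pid" "$tasks"; then
    echo "PID $pid is already in ${tasks%/tasks}; nothing to do"
else
    echo "$pid" > "$tasks"
    echo "re-attached PID $pid to ${tasks%/tasks}"
fi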
On 03/24/2016 02:26 AM, mxs kolo wrote:
> use libvirt 1.3.2, it's more stable and never shows negative values
> for memory and swap inside the container
>

The latest version available for CentOS/RHEL is 1.2.17. What site are you
using to get the rpm for version 1.3.2?

Peter
Now reproduced 100% of the time.

1) Create a container with a 1 GB memory limit.

2) Run a simple memory allocation test inside it:

/* Allocate and touch 100 MB per second until the cgroup limit kills us. */
#include <malloc.h>
#include <memory.h>
#include <stdio.h>
#include <unistd.h>

#define MB (1024 * 1024)

int main() {
    int total = 0;
    while (1) {
        void *p = malloc( 100*MB );
        if (p == NULL) {          /* stop cleanly if malloc itself fails */
            printf("malloc failed after %d Mb\n", total);
            return 1;
        }
        memset(p, 0, 100*MB );
        total = total + 100;
        printf("Alloc %d Mb\n", total);
        sleep(1);
    }
}

[root@tst-mxs2 ~]# free
              total        used        free      shared  buff/cache   available
Mem:        1048576        7412     1028644       11112       12520     1028644
Swap:       1048576           0     1048576
[root@tst-mxs2 ~]# ./a.out
Alloc 100 Mb
Alloc 200 Mb
Alloc 300 Mb
Alloc 400 Mb
Alloc 500 Mb
Alloc 600 Mb
Alloc 700 Mb
Alloc 800 Mb
Alloc 900 Mb
Alloc 1000 Mb
Killed

As you can see, the limit works and "free" inside the container shows correct
values.

3) Check the situation outside the container, from the hardware node:

[root@node01]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/memory.limit_in_bytes
1073741824

4) Check the list of PIDs in the cgroup (this is the IMPORTANT part):

[root@node01]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/tasks
7445
7446
7480
7506
7510
7511
7512
7529
7532
7533
7723
7724
8251
8253
10455

The first PID, 7445, is the PID of the libvirt process for the container:

# ps ax | grep 7445
 7445 ?  Sl  0:00 /usr/libexec/libvirt_lxc --name tst-mxs2.test --console 21 --security=none --handshake 24 --veth macvlan5
[root@node01]# virsh list
 Id    Name                           State
----------------------------------------------------
 7445  tst-mxs2.test                  running

5) Now break /proc/meminfo inside the container.

Prepare a simple systemd service:

# cat /usr/lib/systemd/system/true.service
[Unit]
Description=simple test

[Service]
Type=simple
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target

Enable the service once, then disable it and start it:

[root@node01]# systemctl enable /usr/lib/systemd/system/true.service
Created symlink from /etc/systemd/system/multi-user.target.wants/true.service to /usr/lib/systemd/system/true.service.
[root@node01]# systemctl disable true.service
Removed symlink /etc/systemd/system/multi-user.target.wants/true.service.
[root@node01]# systemctl start true.service

Now check memory inside the container:

[root@tst-mxs2 ~]# free
              total        used        free      shared  buff/cache   available
Mem:  9007199254740991      190824  9007199254236179     11112      313988  9007199254236179
Swap:             0

6) Check the tasks list in the cgroup:

[root@node01]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/tasks
7446
7480
7506
7510
7511
7512
7529
7532
7533
7723
7724
8251
8253

After starting the disabled systemd service, libvirt's PID 7445 has been
removed from the task list. The actual limit still works inside the LXC
container; 7446 is the PID of /sbin/init inside the container.

Check that the limit works:

[root@tst-mxs2 ~]# free
              total        used        free      shared  buff/cache   available
Mem:  9007199254740991      190824  9007199254236179     11112      313988  9007199254236179
Swap:             0           0           0
[root@tst-mxs2 ~]# ./a.out
Alloc 100 Mb
Alloc 200 Mb
Alloc 300 Mb
Alloc 400 Mb
Alloc 500 Mb
Alloc 600 Mb
Alloc 700 Mb
Alloc 800 Mb
Alloc 900 Mb
Alloc 1000 Mb
Killed

Only the FUSE-mounted /proc/meminfo is broken. The positive news is that even
when ~8 EiB is reported, a process inside the container still cannot allocate
more memory than the cgroup limit. The negative news is that some Java-based
software (puppetdb in our case) plans its own strategy around that ~8 EiB of
memory and collapses after hitting the real limit.

To summarize:
1) Don't start a disabled service via systemd.
2) Work around it with cgclassify, or with its simple analog:
[root@node01]# echo 7445 > /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/tasks

p.s.
I am not sure whose bug this is - libvirtd's or systemd's.

b.r.
Maxim Kozin
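A quick way to confirm this diagnosis on a running host, without reading the
whole tasks file, is to ask the kernel which memory cgroup the libvirt_lxc
monitor currently belongs to. This is a sketch using the container name from
Maxim's example; exactly where the process ends up after the bug triggers is
an assumption (some parent cgroup, often the root), so treat any memory line
that does not point at the machine-lxc...scope path as suspect.

# Which memory cgroup does the libvirt_lxc monitor live in right now?
pid=$(pgrep -f 'libvirt_lxc --name tst-mxs2.test' | head -n1)
grep ':memory:' "/proc/${pid}/cgroup"

# Healthy: ...:memory:/machine.slice/machine-lxc\x2d7445\x2dtst\x2dmxs2.test.scope
# Broken:  the memory line points somewhere else (for example the root cgroup
#          "/"), and /proc/meminfo inside the container shows the bogus total.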
On 04/26/2016 07:44 AM, mxs kolo wrote:
> [reproduction steps and workaround quoted in full; snipped]

Cool, thanks for the info! Does this still affect libvirt 1.3.2 as well? You
mentioned elsewhere that you weren't hitting this issue with that version.

- Cole
On Tue, Apr 26, 2016 at 02:44:19PM +0300, mxs kolo wrote:
> Now reproduced 100% of the time.
> 1) Create a container with a 1 GB memory limit.
> 2) Run a simple memory allocation test inside it:

[snip example]

I've seen this behaviour with LXC when running systemd inside the container.

/proc/meminfo is generated by a FUSE process that libvirt runs, and it
determines the memory settings by reading the root cgroup for the container.
What I think is happening is that systemd is resetting the memory limits in
the root cgroup, so the values that libvirt set are no longer present. This in
turn causes us to report the wrong data in /proc/meminfo.

I've not yet decided whether this is a systemd bug or not though.

Regards,
Daniel

-- 
|: http://berrange.com       -o-   http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org                -o-            http://virt-manager.org :|
|: http://autobuild.org        -o-        http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-          http://live.gnome.org/gtk-vnc :|
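A side note on the exact numbers in the original report: if the value the FUSE
meminfo code ends up reading back is the kernel's "no limit" sentinel (2^63 - 1
bytes) rather than the limit libvirt originally set, then converting it
straight to kB gives exactly the MemTotal that Peter and Maxim are seeing.
The snippet below only illustrates that arithmetic; it is not a statement
about libvirt's actual code path. The SwapFree value in the original report
similarly looks like a small negative quantity printed as an unsigned 64-bit
number.

# 2^63 - 1 bytes (the cgroup "unlimited" sentinel) expressed in kB:
echo $(( 9223372036854775807 / 1024 ))
# -> 9007199254740991   (exactly the bogus MemTotal reported above, in kB)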