Has anyone seen this issue? We're running containers under CentOS 7.2 and some
of these containers are reporting incorrect memory allocation in
/proc/meminfo. The output below comes from a system with 32 GB of memory and
84 GB of swap. The values reported are completely wrong.

# cat /proc/meminfo
MemTotal:       9007199254740991 kB
MemFree:        9007199224543267 kB
MemAvailable:   12985680 kB
Buffers:        0 kB
Cached:         119744 kB
SwapCached:     10804 kB
Active:         110228 kB
Inactive:       111716 kB
Active(anon):   53840 kB
Inactive(anon): 57568 kB
Active(file):   56388 kB
Inactive(file): 54148 kB
Unevictable:    0 kB
Mlocked:        15347728 kB
SwapTotal:      0 kB
SwapFree:       18446744073709524600 kB
Dirty:          20304 kB
Writeback:      99596 kB
AnonPages:      18963368 kB
Mapped:         231472 kB
Shmem:          51852 kB
Slab:           1891324 kB
SReclaimable:   1805244 kB
SUnreclaim:     86080 kB
KernelStack:    60656 kB
PageTables:     81948 kB
NFS_Unstable:   0 kB
Bounce:         0 kB
WritebackTmp:   0 kB
CommitLimit:    104487760 kB
Committed_AS:   31507444 kB
VmallocTotal:   34359738367 kB
VmallocUsed:    354796 kB
VmallocChunk:   34359380456 kB
AnonHugePages:  15630336 kB
HugePages_Total:   0
HugePages_Free:    0
HugePages_Rsvd:    0
HugePages_Surp:    0
Hugepagesize:   2048 kB
DirectMap4k:    81684 kB
DirectMap2M:    3031040 kB
DirectMap1G:    32505856 kB
On 03/23/2016 12:10 PM, Peter Steele wrote:
> Has anyone seen this issue? We're running containers under CentOS 7.2 and
> some of these containers are reporting incorrect memory allocation in
> /proc/meminfo. The output below comes from a system with 32 GB of memory and
> 84 GB of swap. The values reported are completely wrong.
>

There was a meminfo bug here:

https://bugzilla.redhat.com/show_bug.cgi?id=1300781

The initial report is fixed in git, however the reporter also mentioned the
issue you are seeing. I suspect something is going wacky with the memory
values we are getting from host cgroups after some period of time. If you can
reproduce with Fedora (or RHEL), try filing a bug there.

- Cole

> [quoted /proc/meminfo output snipped]
On 03/23/2016 09:19 AM, Cole Robinson wrote:
> On 03/23/2016 12:10 PM, Peter Steele wrote:
>> Has anyone seen this issue? We're running containers under CentOS 7.2 and
>> some of these containers are reporting incorrect memory allocation in
>> /proc/meminfo. The output below comes from a system with 32 GB of memory
>> and 84 GB of swap. The values reported are completely wrong.
>>
> There was a meminfo bug here:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1300781
>
> The initial report is fixed in git, however the reporter also mentioned the
> issue you are seeing. I suspect something is going wacky with the memory
> values we are getting from host cgroups after some period of time. If you
> can reproduce with Fedora (or RHEL), try filing a bug there.
>
> - Cole

It's interesting that the value I see on my containers (9007199254740991) is
the exact same value reported in this Red Hat bug. Clearly that is not a
coincidence. We did not see this problem in 7.1, so apparently it is something
introduced in 7.2. For the immediate term it looks like we'll have to roll
back to 7.1.

I'll look into getting it reproduced in Fedora or RHEL.

Peter
Hi all,

> Has anyone seen this issue? We're running containers under CentOS 7.2
> and some of these containers are reporting incorrect memory allocation
> in /proc/meminfo. The output below comes from a system with 32 GB of
> memory and 84 GB of swap. The values reported are completely wrong.

Yes, it occurs from time to time on our installations: CentOS 7.2 + libvirt
1.2.18, and probably on 1.3.2 as well.

We have a workaround that fixes it without rebooting the LXC container.

1) Check that the memory cgroup for the container exists on the hardware node:

[root@node]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.infra.scope/memory.limit_in_bytes
17179869184
[root@node]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.infra.scope/memory.memsw.limit_in_bytes
18203869184

In our case the limit exists and is set to 16 GB for memory and 16+1 GB for
memory + swap. The container name here is puppet.infra; substitute your own
container name.

2) If it exists, simply attach the libvirt_lxc PID back to the cgroup:

node# yum install libcgroup-tools
node# cgclassify -g memory:machine.slice/machine-lxc\\x2dLXC_CONTAINER_NAME.scope PID

where the PID in my case is found as:

[root@node]# ps ax | grep libvirt_lxc | grep -v grep | grep puppet
22254 ?  Sl  296:25 /usr/libexec/libvirt_lxc --name puppet.infra --console 24 --security=none --handshake 42 --veth macvlan0 --veth macvlan1

After running cgclassify, /proc/meminfo inside the container shows normal
values again.

In some cases, with the combination of an old libvirt 1.2.xx and an old
systemd, under certain conditions (after systemctl restart libvirtd or a
daemon-reload) we get the same error. In that case we first need to
create/restore the cgroup on the node, and only then cgclassify the process
into it. For example:

Make the directory:
# mkdir "/sys/fs/cgroup/memory/machine.slice/machine-lxc\x2dpuppet.scope"

Set the limits:
[root@]# echo 8589934592 > /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.scope/memory.limit_in_bytes
[root@]# echo 9663676416 > /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2dpuppet.scope/memory.memsw.limit_in_bytes

Classify:
# cgclassify -g memory:machine.slice/machine-lxc\\x2dpuppet.scope PID_LIBVIRT_LXC

p.s. Use libvirt 1.3.2; it's more stable and never shows negative values for
memory and swap inside the container.

b.r.
Maxim Kozin
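For convenience, the manual steps above can be wrapped in a small script. The
sketch below only automates the same workaround: it assumes the cgroup-v1
layout /sys/fs/cgroup/memory/machine.slice/machine-lxc\x2d<name>.scope shown
in Maxim's examples (the scope naming varies between libvirt and systemd
versions), and it re-adds the PID by writing to the tasks file, the "simple
analog" of cgclassify mentioned later in the thread. The script and its
matching logic are illustrative, not an official tool; verify the paths on
your own host before relying on it.

#!/bin/bash
# Sketch: re-attach a container's libvirt_lxc monitor process to its memory
# cgroup if it has fallen out of it.  Assumes the machine.slice/
# machine-lxc\x2d<name>.scope layout shown in the workaround above.
set -eu

name=${1:?usage: $0 container-name}   # name as shown by 'virsh -c lxc:/// list'

# PID of the libvirt_lxc helper process for this container
pid=$(pgrep -f "libvirt_lxc --name ${name}" | head -n1)
[ -n "$pid" ] || { echo "no libvirt_lxc process found for ${name}" >&2; exit 1; }

# Find the container's memory cgroup.  The scope directory name contains
# systemd escapes (a literal "\x2d" for each '-'), so undo them before comparing.
tasks=""
for d in /sys/fs/cgroup/memory/machine.slice/machine-lxc*.scope; do
    plain=$(basename "$d" | sed 's/\\x2d/-/g')
    case "$plain" in
        *"$name"*) tasks="$d/tasks"; break ;;
    esac
done
[ -n "$tasks" ] || { echo "no memory cgroup found for ${name}" >&2; exit 1; }

# Re-add the PID only if it is missing from the cgroup.
if grep -qx "$pid" "$tasks"; then
    echo "PID $pid is already in ${tasks%/tasks}; nothing to do"
else
    echo "$pid" > "$tasks"
    echo "re-attached PID $pid to ${tasks%/tasks}"
fi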
On 03/24/2016 02:26 AM, mxs kolo wrote:
> use libvirt 1.3.2, it's more stable and never shows negative values
> for memory and swap inside the container
>

The latest version available for CentOS/RHEL is 1.2.17. What site are you
using to get the rpm for version 1.3.2?

Peter
Now reproduced 100% of the time.

1) Create a container with a 1 GB memory limit.

2) Run a simple memory allocation test inside it:

/* Allocate and touch 100 MB per second until the cgroup limit kills us. */
#include <malloc.h>
#include <memory.h>
#include <stdio.h>
#include <unistd.h>

#define MB (1024 * 1024)

int main() {
    int total = 0;
    while (1) {
        void *p = malloc( 100*MB );
        if (p == NULL) {          /* stop cleanly if malloc itself fails */
            printf("malloc failed after %d Mb\n", total);
            return 1;
        }
        memset(p, 0, 100*MB );
        total = total + 100;
        printf("Alloc %d Mb\n", total);
        sleep(1);
    }
}

[root@tst-mxs2 ~]# free
              total        used        free      shared  buff/cache   available
Mem:        1048576        7412     1028644       11112       12520     1028644
Swap:       1048576           0     1048576
[root@tst-mxs2 ~]# ./a.out
Alloc 100 Mb
Alloc 200 Mb
Alloc 300 Mb
Alloc 400 Mb
Alloc 500 Mb
Alloc 600 Mb
Alloc 700 Mb
Alloc 800 Mb
Alloc 900 Mb
Alloc 1000 Mb
Killed

As you can see, the limit works and "free" inside the container shows correct
values.

3) Check the situation outside the container, from the hardware node:

[root@node01]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/memory.limit_in_bytes
1073741824

4) Check the list of PIDs in the cgroup (this is the IMPORTANT part):

[root@node01]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/tasks
7445
7446
7480
7506
7510
7511
7512
7529
7532
7533
7723
7724
8251
8253
10455

The first PID, 7445, is the PID of the libvirt process for the container:

# ps ax | grep 7445
 7445 ?  Sl  0:00 /usr/libexec/libvirt_lxc --name tst-mxs2.test --console 21 --security=none --handshake 24 --veth macvlan5
[root@node01]# virsh list
 Id    Name                           State
----------------------------------------------------
 7445  tst-mxs2.test                  running

5) Now break /proc/meminfo inside the container.

Prepare a simple systemd service:

# cat /usr/lib/systemd/system/true.service
[Unit]
Description=simple test

[Service]
Type=simple
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target

Enable the service once, then disable it and start it:

[root@node01]# systemctl enable /usr/lib/systemd/system/true.service
Created symlink from /etc/systemd/system/multi-user.target.wants/true.service to /usr/lib/systemd/system/true.service.
[root@node01]# systemctl disable true.service
Removed symlink /etc/systemd/system/multi-user.target.wants/true.service.
[root@node01]# systemctl start true.service

Now check memory inside the container:

[root@tst-mxs2 ~]# free
              total        used        free      shared  buff/cache   available
Mem:  9007199254740991      190824  9007199254236179     11112      313988  9007199254236179
Swap:             0

6) Check the tasks list in the cgroup:

[root@node01]# cat /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/tasks
7446
7480
7506
7510
7511
7512
7529
7532
7533
7723
7724
8251
8253

After starting the disabled systemd service, libvirt's PID 7445 has been
removed from the task list. The actual limit still works inside the LXC
container; 7446 is the PID of /sbin/init inside the container.

Check that the limit works:

[root@tst-mxs2 ~]# free
              total        used        free      shared  buff/cache   available
Mem:  9007199254740991      190824  9007199254236179     11112      313988  9007199254236179
Swap:             0           0           0
[root@tst-mxs2 ~]# ./a.out
Alloc 100 Mb
Alloc 200 Mb
Alloc 300 Mb
Alloc 400 Mb
Alloc 500 Mb
Alloc 600 Mb
Alloc 700 Mb
Alloc 800 Mb
Alloc 900 Mb
Alloc 1000 Mb
Killed

Only the FUSE-mounted /proc/meminfo is broken. The positive news is that even
when ~8 EiB is reported, a process inside the container still cannot allocate
more memory than the cgroup limit. The negative news is that some Java-based
software (puppetdb in our case) plans its own strategy around that ~8 EiB of
memory and collapses after hitting the real limit.

To summarize:
1) Don't start a disabled service via systemd.
2) Work around it with cgclassify, or with its simple analog:
[root@node01]# echo 7445 > /sys/fs/cgroup/memory/machine.slice/machine-lxc\\x2d7445\\x2dtst\\x2dmxs2.test.scope/tasks

p.s.
I am not sure whose bug this is - libvirtd's or systemd's.

b.r.
Maxim Kozin
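A quick way to confirm this diagnosis on a running host, without reading the
whole tasks file, is to ask the kernel which memory cgroup the libvirt_lxc
monitor currently belongs to. This is a sketch using the container name from
Maxim's example; exactly where the process ends up after the bug triggers is
an assumption (some parent cgroup, often the root), so treat any memory line
that does not point at the machine-lxc...scope path as suspect.

# Which memory cgroup does the libvirt_lxc monitor live in right now?
pid=$(pgrep -f 'libvirt_lxc --name tst-mxs2.test' | head -n1)
grep ':memory:' "/proc/${pid}/cgroup"

# Healthy: ...:memory:/machine.slice/machine-lxc\x2d7445\x2dtst\x2dmxs2.test.scope
# Broken:  the memory line points somewhere else (for example the root cgroup
#          "/"), and /proc/meminfo inside the container shows the bogus total.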
On 04/26/2016 07:44 AM, mxs kolo wrote:
> [reproduction steps and workaround quoted in full; snipped]

Cool, thanks for the info! Does this still affect libvirt 1.3.2 as well? You
mentioned elsewhere that you weren't hitting this issue with that version.

- Cole
On Tue, Apr 26, 2016 at 02:44:19PM +0300, mxs kolo wrote:
> Now reproduced 100% of the time.
> 1) Create a container with a 1 GB memory limit.
> 2) Run a simple memory allocation test inside it:

[snip example]

I've seen this behaviour with LXC when running systemd inside the container.

/proc/meminfo is generated by a FUSE process that libvirt runs, and it
determines the memory settings by reading the root cgroup for the container.
What I think is happening is that systemd is resetting the memory limits in
the root cgroup, so the values that libvirt set are no longer present. This in
turn causes us to report the wrong data in /proc/meminfo.

I've not yet decided whether this is a systemd bug or not though.

Regards,
Daniel

-- 
|: http://berrange.com       -o-   http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org                -o-            http://virt-manager.org :|
|: http://autobuild.org        -o-        http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-          http://live.gnome.org/gtk-vnc :|
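A side note on the exact numbers in the original report: if the value the FUSE
meminfo code ends up reading back is the kernel's "no limit" sentinel (2^63 - 1
bytes) rather than the limit libvirt originally set, then converting it
straight to kB gives exactly the MemTotal that Peter and Maxim are seeing.
The snippet below only illustrates that arithmetic; it is not a statement
about libvirt's actual code path. The SwapFree value in the original report
similarly looks like a small negative quantity printed as an unsigned 64-bit
number.

# 2^63 - 1 bytes (the cgroup "unlimited" sentinel) expressed in kB:
echo $(( 9223372036854775807 / 1024 ))
# -> 9007199254740991   (exactly the bogus MemTotal reported above, in kB)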