Niklas Bivald
2013-Nov-20 14:57 UTC
Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Hi,

Another sysadmin and I have independently been researching a problem where a DomU randomly locks up (can't be reached via xl console, no ping / SSH connection, shown as stuck in the running state in xentop) on two of our separate machines (installed completely independently):

Dom0:
Debian 7.0 with Xen version 4.1.4 and xen-utils 4.1.4-3+deb7u1
Debian 7.1 with Xen version 4.1

DomU:
Debian 7.0
Debian 7.1(.3)

The common denominator appears to be qemu-dm consuming (leaking?) memory until the Dom0 swaps. When the Dom0 swap is full, the DomU appears to be locked (see above), at which point a hard reboot, a.k.a. xl destroy + xl create, is the only way to get it back. This *could* be related to "[Xen-devel] qemu-system-i386: memory leak?" http://xen.markmail.org/message/chqpifrj46lxdxx2

The DomUs by themselves don't use any abnormal amount of memory or swap. All DomUs are image-file based (disk.img, swap.img). To give an overview, the Dom0 currently uses 26GB of swap with 8 active DomUs. Swap per process:

Pid    Swap         Process                                                    Uptime
3766   98452 kB     qemu-dm -d 29 -domain-name [hostname] -nographic -M xenpv  160 days
6100   276988 kB    qemu-dm -d 42 -domain-name [hostname] -nographic -M xenpv  108 days
6790   121620 kB    qemu-dm -d 46 -domain-name [hostname] -nographic -M xenpv  95 days
10616  791616 kB    qemu-dm -d 51 -domain-name [hostname] -nographic -M xenpv  32 days
11588  3514436 kB   qemu-dm -d 49 -domain-name [hostname] -nographic -M xenpv  73 days
16290  170436 kB    qemu-dm -d 43 -domain-name [hostname] -nographic -M xenpv  107 days
26974  1647248 kB   qemu-dm -d 48 -domain-name [hostname] -nographic -M xenpv  92 days
32403  21147060 kB  qemu-dm -d 52 -domain-name [hostname] -nographic -M xenpv  29 days

Generally, the higher the usage, the higher the swap; possibly, the higher the IO, the higher the swap. DomU #32403 is a fairly low-utilized DomU with a 30GB database and log parsing as its primary application. Its swap usage currently increases by roughly 2GB per day.
The only difference between it and the others is that it does (probably several times) more IO.

Machine #1 (mine):

$ dmesg|grep qe
[7548057.392504] qemu-dm[528]: segfault at ff0 ip 00007f1e39229ca0 sp 00007fffb9e36bb8 error 4 in libc-2.13.so[7f1e3910a000+180000]
[11263387.091221] qemu-dm[7474]: segfault at ff0 ip 00007f695e32dca0 sp 00007fff5a3b27a8 error 4 in libc-2.13.so[7f695e20e000+180000]

Machine #2:

$ dmesg|grep qe
[2593763.122800] Out of memory: Kill process 2778 (qemu-dm) score 892 or sacrifice child
[2593763.122824] Killed process 2778 (qemu-dm) total-vm:3629932kB, anon-rss:1363584kB, file-rss:572kB
[3166462.372758] Out of memory: Kill process 30974 (qemu-dm) score 868 or sacrifice child
[3166462.372782] Killed process 30974 (qemu-dm) total-vm:3545568kB, anon-rss:1282888kB, file-rss:548kB

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Matthew Daley
2013-Nov-22 01:49 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
On Thu, Nov 21, 2013 at 3:57 AM, Niklas Bivald <niklas@bivald.com> wrote:
> Common denominator appears to be qemu-dm consuming (leaking?) memory until
> the Dom0 swaps. [...] This *could* be related to "[Xen-devel]
> qemu-system-i386: memory leak?"
> http://xen.markmail.org/message/chqpifrj46lxdxx2

It would seem that the issue Roger fixed in upstream QEMU with the patch linked in his reply ( http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg03677.html ) could indeed be the problem here.

Either way, that patch never made it into qemu-traditional, which still suffers from the same original problem (see http://xenbits.xen.org/gitweb/?p=qemu-xen-unstable.git;a=blob;f=hw/xen_disk.c;h=ee8d36f9dbf3c754232d528485cbeff1fd66504e;hb=HEAD#l159 ).

I'm not certain what the status of -traditional is, but surely it should be backported in?

- Matthew
Niklas Bivald
2013-Nov-25 09:48 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Hi,

Do we know if the patch will make it into qemu-traditional? xen_disk.c appears to have been updated since the patch was released - or it's simply that I can't apply the upstream patch to qemu-xen as-is, which gives me:

niklas@unstable:~/xen/qemu-xen-4.1-testing$ patch -p1 < patch
patching file hw/xen_disk.c
Hunk #1 succeeded at 116 (offset 3 lines).
Hunk #2 FAILED at 155.
Hunk #3 FAILED at 179.
2 out of 3 hunks FAILED -- saving rejects to file hw/xen_disk.c.rej

When I apply the patch manually, I get (on xen-setup or make dist-tools):

  CC    i386-dm/xen_disk.o
/home/niklas/xen/xen/tools/ioemu-dir/hw/xen_disk.c: In function 'ioreq_reset':
/home/niklas/xen/xen/tools/ioemu-dir/hw/xen_disk.c:126:10: error: 'struct ioreq' has no member named 'mapped'
/home/niklas/xen/xen/tools/ioemu-dir/hw/xen_disk.c:139:18: error: 'struct ioreq' has no member named 'acct'
/home/niklas/xen/xen/tools/ioemu-dir/hw/xen_disk.c:139:41: error: 'struct ioreq' has no member named 'acct'
make[4]: *** [xen_disk.o] Error 1
make[4]: Leaving directory `/home/niklas/xen/xen/tools/ioemu-remote/i386-dm'
make[3]: *** [subdir-i386-dm] Error 2
make[3]: Leaving directory `/home/niklas/xen/xen/tools/ioemu-remote'
make[2]: *** [subdir-install-ioemu-dir] Error 2
make[2]: Leaving directory `/home/niklas/xen/xen/tools'
make[1]: *** [subdirs-install] Error 2
make[1]: Leaving directory `/home/niklas/xen/xen/tools'
make: *** [install-tools] Error 2

This is on RELEASE-4.1.4; my manually patched xen_disk.c can be found at http://pastebin.com/WS6mSagi

I can successfully build the Xen tools if I don't use the patched xen_disk.c.

Regards,
Niklas

On 22 nov 2013, at 02:49, Matthew Daley <mattd@bugfuzz.com> wrote:

> It would seem that the issue Roger fixed in upstream QEMU with the
> patch linked in his reply ( http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg03677.html )
> could indeed be the problem here.
>
> Either way, that patch never made it into qemu-traditional, which
> still suffers from the same original problem. [...]
Ian Jackson
2013-Nov-25 11:58 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Niklas Bivald writes ("Re: [Xen-devel] Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)"):
> Do we know if the patch will make it into qemu-traditional?
> xen_disk.c appears to have been updated since the patch was released
> [...]
>
> niklas@unstable:~/xen/qemu-xen-4.1-testing$ patch -p1 < patch
> patching file hw/xen_disk.c
> Hunk #1 succeeded at 116 (offset 3 lines).
> Hunk #2 FAILED at 155.
> Hunk #3 FAILED at 179.
> 2 out of 3 hunks FAILED -- saving rejects to file hw/xen_disk.c.rej

qemu-xen-traditional is still maintained for bug fixes and ought to get a backport of this if it is relevant. But: I haven't looked at the code in any detail, and I would be surprised if a leak of this magnitude had existed in it all these years.

Ian.
Niklas Bivald
2013-Nov-25 12:32 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
>> Do we know if the patch will make it into qemu-traditional?
>> xen_disk.c appears to have been updated since the patch was released
>> [...]
>> 2 out of 3 hunks FAILED -- saving rejects to file hw/xen_disk.c.rej
>
> qemu-xen-traditional is still maintained for bug fixes and ought to
> get a backport of this if it is relevant. But: I haven't looked at
> the code in any detail, and I would be surprised if a leak of this
> magnitude had existed in it all these years.

Same here, I can't figure it out. I'll be more than happy to at least try to compile qemu-xen-traditional with the above-mentioned patch. Unfortunately I can't figure out how to patch the patch, so to speak. Can I help in the process of confirming this bug in qemu-xen-traditional?

After doing IO-intensive operations last week, two domains were killed, giving me this in syslog:

Nov 22 15:24:16 nyx kernel: [14156192.365813] qemu-dm[32403]: segfault at ff0 ip 00007f9c8c962ca0 sp 00007fff7dd919a8 error 4 in libc-2.13.so[7f9c8c843000+180000]
Nov 22 17:44:03 nyx kernel: [14164579.667240] qemu-dm[3362]: segfault at ff0 ip 00007f8596efeca0 sp 00007fff1eab1af8 error 4 in libc-2.13.so[7f8596ddf000+180000]
Ian Jackson
2013-Nov-25 12:40 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Niklas Bivald writes ("Re: [Xen-devel] Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)"):
> Can I help in the process of confirming this bug in qemu-xen-traditional?

Do you mean to say that you have observed qemu-xen-traditional's qemu-dm growing, as described in the bug report?

Thanks,
Ian.
Niklas Bivald
2013-Nov-25 12:59 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Assuming my Debian 7.0 with Xen 4.1.4 uses qemu-xen-traditional, then yes; otherwise I've observed it in the default Xen qemu for Debian 7. Currently all my (and ilon@medinet.se's) qemu-dm instances keep growing, adding several GB of swap per day. Then the dom0 runs out of swap, the qemu-dm segfaults, and I have to xl destroy it. Then I start the domain again and qemu-dm starts growing in swap.

On 25 nov 2013, at 13:40, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:

> Do you mean to say that you have observed qemu-xen-traditional's
> qemu-dm growing, as described in the bug report?
Niklas Bivald
2013-Nov-27 09:49 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Sorry, this is a bit of a newbie question. Can I download xen-utils-4.3 for jessie (http://packages.debian.org/sv/jessie/amd64/xen-utils-4.3/download) and use that qemu-dm? Or will that wreak all kinds of mayhem?

I've compiled my own qemu based on qemu stable; I just need to get my VM cfg to accept device_model_override (it appears to be ignoring it for now).

On 25 nov 2013, at 13:59, Niklas Bivald <niklas@bivald.com> wrote:

> Assuming my Debian 7.0 with Xen 4.1.4 uses qemu-xen-traditional, then
> yes; otherwise I've observed it in the default Xen qemu for Debian 7.
> Currently all my (and ilon@medinet.se's) qemu-dm instances keep
> growing, adding several GB of swap per day. [...]
Fabio Fantoni
2013-Nov-27 10:32 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
On 27/11/2013 10:49, Niklas Bivald wrote:
> Sorry, this is a bit of a newbie question. Can I download
> xen-utils-4.3 for jessie
> (http://packages.debian.org/sv/jessie/amd64/xen-utils-4.3/download)
> and use that qemu-dm? Or will that wreak all kinds of mayhem?

AFAIK the Debian sid Xen 4.3 packages no longer ship qemu-traditional; they use only the upstream qemu from the Debian qemu package (now at version 1.6.1). I have not tested the new Xen Debian package yet, but I have tested the qemu Debian package and it is working and fully featured (the Xen upstream qemu, by contrast, has some features missing).
Ian Campbell
2013-Nov-27 11:06 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
On Mon, 2013-11-25 at 13:59 +0100, Niklas Bivald wrote:
> Assuming my Debian 7.0 with Xen 4.1.4 uses qemu-xen-traditional, then
> yes; otherwise I've observed it in the default Xen qemu for Debian 7.
> Currently all my (and ilon@medinet.se's) qemu-dm instances keep
> growing, adding several GB of swap per day. Then the dom0 runs out of
> swap, the qemu-dm segfaults, and I have to xl destroy it. Then I start
> the domain again and qemu-dm starts growing in swap.

Although I've not run it on qemu, you might be able to use valgrind's support for Xen to help debug this. See:
http://blog.xen.org/index.php/2013/01/18/using-valgrind-to-debug-xen-toolstacks/

It works for debugging xl create and similar, but I didn't try qemu; if it complains about hypercalls it doesn't understand, please let me know and I'll see about implementing them.

Ian.
Niklas Bivald
2013-Dec-02 20:49 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
Hi,

Summary: ilon@medinet.se and I have independently confirmed that the patch solves the memory leak, and we are running the patched binary live. Big thanks to everyone who helped - especially Matthew, who "patched the patch" to work with Xen.

With Matthew's help I've successfully compiled the patch on Xen 4.1.4 (git checkout tags/RELEASE-4.1.4 and make dist-tools), and ilon@medinet.se and I have confirmed independently that the patch does solve the memory leak in qemu-dm. To make sure we changed nothing except the patch, we also compiled from source without the patch to confirm the memory leak was actually there before (which it was).

The final patch (again, thanks to Matthew) is available at https://gist.github.com/bivald/7691087

To use the patched version of qemu, we:

1. Compiled the binaries from source (with the patch)
2. Used the qemu-dm binary (symlinked to /usr/lib/xen-4.1/bin/qemu-dm, since 4.1.4 doesn't support device_model_override)
3. Used the compiled libraries we were missing on dom0: libblktap.so.3.0, libxenctrl.so.4.0, libxenguest.so.4.0

We have been running the patched binaries for roughly a week now and so far they are completely stable.

Things that might mean something to someone other than me:

- We needed libblktap.so.3.0, libxenctrl.so.4.0, libxenguest.so.4.0 for some reason, instead of the 4.1 versions (libxenguest-4.1.so, libxenctrl-4.1.so, xen-4.1/lib/libblktapctl.so)
- Memory usage of the qemu-dm process for a freshly started VPS is slightly higher than the initial memory usage of the leaking binary, but it doesn't leak

Again, thanks to everyone involved for the help. In case someone else finds this in the archives and needs the patched binary (or if I need to remember how to patch it in the future), I've uploaded the binaries and how they were compiled to https://github.com/bivald/xen-tools-4.1.4-patched-qemu

Regards,
Niklas

On 27 Nov 2013, at 12:06, Ian Campbell <ian.campbell@citrix.com> wrote:

> Although I've not run it on qemu, you might be able to use valgrind's
> support for Xen to help debug this. [...]
Matthew Daley
2013-Dec-02 22:24 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
On Tue, Dec 3, 2013 at 9:49 AM, Niklas Bivald <niklas@bivald.com> wrote:
> Summary: ilon@medinet.se and I have independently confirmed that the patch
> solves the memory leak, and we are running the patched binary live.
> [...] To make sure we changed nothing except the patch, we also compiled
> from source without the patch to confirm the memory leak was actually
> there before (which it was).

Awesome, thank you for reliably tracking it down. I'm surprised that the issue could have amounted to such a large memory leak in production.

> The final patch (again, thanks to Matthew) is available at
> https://gist.github.com/bivald/7691087

Roger did the real work in finding the bug originally and making the original patch!

That qemu_iovec_init call wasn't meant to be commented out, however; just the call to qemu_iovec_reset in the following "get one from freelist" block. I'm happy to do a cleaned-up backport if no-one else here does it instead; all that was involved were a couple of missing members from struct ioreq, IIRC.

- Matthew
Niklas Bivald
2013-Dec-03 08:27 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
On 02 Dec 2013, at 23:24, Matthew Daley <mattd@bugfuzz.com> wrote:

> Roger did the real work in finding the bug originally and making the
> original patch!

Very true, thank you Roger!

> That qemu_iovec_init call wasn't meant to be commented out, however;
> just the call to qemu_iovec_reset in the following "get one from
> freelist" block. I'm happy to do a cleaned-up backport if no-one else
> here does it instead; all that was involved were a couple of missing
> members from struct ioreq, IIRC.

ilon and I are happy to test the backport with the qemu_iovec_init. Is there anything I can help you with besides the testing? Do you want me to compile a version with qemu_iovec_init to see if anything changes (memory usage, etc.)?

Regards,
Niklas
Ian Campbell
2013-Dec-03 11:10 UTC
Re: Possible memory leak in qemu-dm (qemu-dm swapping 20GB+, adding 2gb+ per day)
On Tue, 2013-12-03 at 11:24 +1300, Matthew Daley wrote:
> I'm happy to do a cleaned-up backport if no-one else
> here does it instead; all that was involved were a couple of missing
> members from struct ioreq, IIRC.

I think that would be useful, thanks.

Ian.
On ioreq_release the full ioreq was memset to 0, losing all the data
and memory allocations inside the QEMUIOVector, which leads to a
memory leak. Create a new function to specifically reset ioreq.

Reported-by: Maik Wessler <maik.wessler@yahoo.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Backport to qemu-xen-traditional.

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
---
 hw/xen_disk.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index ee8d36f..250d806 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -116,6 +116,29 @@ struct XenBlkDev {

 /* ------------------------------------------------------------- */

+static void ioreq_reset(struct ioreq *ioreq)
+{
+    memset(&ioreq->req, 0, sizeof(ioreq->req));
+    ioreq->status = 0;
+    ioreq->start = 0;
+    ioreq->presync = 0;
+    ioreq->postsync = 0;
+
+    memset(ioreq->domids, 0, sizeof(ioreq->domids));
+    memset(ioreq->refs, 0, sizeof(ioreq->refs));
+    ioreq->prot = 0;
+    memset(ioreq->page, 0, sizeof(ioreq->page));
+    ioreq->pages = NULL;
+
+    ioreq->aio_inflight = 0;
+    ioreq->aio_errors = 0;
+
+    ioreq->blkdev = NULL;
+    memset(&ioreq->list, 0, sizeof(ioreq->list));
+
+    qemu_iovec_reset(&ioreq->v);
+}
+
 static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 {
     struct ioreq *ioreq = NULL;
@@ -132,7 +155,6 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
         /* get one from freelist */
         ioreq = LIST_FIRST(&blkdev->freelist);
         LIST_REMOVE(ioreq, list);
-        qemu_iovec_reset(&ioreq->v);
     }
     LIST_INSERT_HEAD(&blkdev->inflight, ioreq, list);
     blkdev->requests_inflight++;
@@ -156,7 +178,7 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
     struct XenBlkDev *blkdev = ioreq->blkdev;

     LIST_REMOVE(ioreq, list);
-    memset(ioreq, 0, sizeof(*ioreq));
+    ioreq_reset(ioreq);
     ioreq->blkdev = blkdev;
     LIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
     if (finish) {
--
1.7.10.4
On ioreq_release the full ioreq was memset to 0, losing all the data
and memory allocations inside the QEMUIOVector, which leads to a
memory leak. Create a new function to specifically reset ioreq.

Reported-by: Maik Wessler <maik.wessler@yahoo.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Backport to qemu-xen-unstable.

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
---
v2: Fix the added commit message ("qemu-xen-traditional" -> "qemu-xen-unstable")

 hw/xen_disk.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index ee8d36f..250d806 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -116,6 +116,29 @@ struct XenBlkDev {

 /* ------------------------------------------------------------- */

+static void ioreq_reset(struct ioreq *ioreq)
+{
+    memset(&ioreq->req, 0, sizeof(ioreq->req));
+    ioreq->status = 0;
+    ioreq->start = 0;
+    ioreq->presync = 0;
+    ioreq->postsync = 0;
+
+    memset(ioreq->domids, 0, sizeof(ioreq->domids));
+    memset(ioreq->refs, 0, sizeof(ioreq->refs));
+    ioreq->prot = 0;
+    memset(ioreq->page, 0, sizeof(ioreq->page));
+    ioreq->pages = NULL;
+
+    ioreq->aio_inflight = 0;
+    ioreq->aio_errors = 0;
+
+    ioreq->blkdev = NULL;
+    memset(&ioreq->list, 0, sizeof(ioreq->list));
+
+    qemu_iovec_reset(&ioreq->v);
+}
+
 static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 {
     struct ioreq *ioreq = NULL;
@@ -132,7 +155,6 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
         /* get one from freelist */
         ioreq = LIST_FIRST(&blkdev->freelist);
         LIST_REMOVE(ioreq, list);
-        qemu_iovec_reset(&ioreq->v);
     }
     LIST_INSERT_HEAD(&blkdev->inflight, ioreq, list);
     blkdev->requests_inflight++;
@@ -156,7 +178,7 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
     struct XenBlkDev *blkdev = ioreq->blkdev;

     LIST_REMOVE(ioreq, list);
-    memset(ioreq, 0, sizeof(*ioreq));
+    ioreq_reset(ioreq);
     ioreq->blkdev = blkdev;
     LIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
     if (finish) {
--
1.7.10.4
On Wed, 4 Dec 2013, Matthew Daley wrote:
> Backport to qemu-xen-unstable.
>
> Signed-off-by: Matthew Daley <mattd@bugfuzz.com>

This patch is already in qemu-xen-unstable:

commit 90c96d33c41e243d5f2c6cc197779f5ab744879e
Author: Roger Pau Monne <roger.pau@citrix.com>
Date:   Mon Jan 14 18:26:53 2013 +0000

    xen_disk: fix memory leak
On Thu, Dec 5, 2013 at 12:15 AM, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> This patch is already in qemu-xen-unstable:
>
> commit 90c96d33c41e243d5f2c6cc197779f5ab744879e
> Author: Roger Pau Monne <roger.pau@citrix.com>
> Date:   Mon Jan 14 18:26:53 2013 +0000
>
>     xen_disk: fix memory leak

Are you sure? I can only see that commit in qemu-upstream-unstable...

- Matthew
On Thu, 5 Dec 2013, Matthew Daley wrote:
> Are you sure? I can only see that commit in qemu-upstream-unstable...

Right, sorry: in xl we call qemu-upstream-unstable "qemu-xen", so when I read "qemu-xen-unstable" I thought you meant qemu-xen for unstable. Very confusing :S

To clarify: qemu-xen (repository named qemu-upstream-unstable) has the patch. qemu-xen-traditional (repository named qemu-xen-unstable) does not have the patch.
On Thu, Dec 5, 2013 at 12:35 AM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Thu, 5 Dec 2013, Matthew Daley wrote:
>> [...]
>> Are you sure? I can only see that commit in qemu-upstream-unstable...
>
> Right, sorry: in xl we call qemu-upstream-unstable "qemu-xen", so when I
> read qemu-xen-unstable I thought that you meant qemu-xen for unstable.
> Very confusing :S
>
> To clarify: qemu-xen (repository named qemu-upstream-unstable) has the
> patch. qemu-xen-traditional (repository named qemu-xen-unstable) does
> not have the patch.

Ah, OK. Indeed I was going by the repo name, as that's what I saw on
some other random backport I checked, IIRC. I'll keep this in mind for
the future, thanks!

- Matthew
On Wed, 2013-12-04 at 11:35 +0000, Stefano Stabellini wrote:
> On Thu, 5 Dec 2013, Matthew Daley wrote:
> > [...]
> > Are you sure? I can only see that commit in qemu-upstream-unstable...
>
> Right, sorry: in xl we call qemu-upstream-unstable "qemu-xen", so when I
> read qemu-xen-unstable I thought that you meant qemu-xen for unstable.
> Very confusing :S
>
> To clarify: qemu-xen (repository named qemu-upstream-unstable) has the
> patch. qemu-xen-traditional (repository named qemu-xen-unstable) does
> not have the patch.

FWIW I was going to reply and suggest that the commit message say e.g.

    Backport to qemu-xen-traditional (qemu-xen-unstable.git)

but maybe this confusion only exists until it is committed.

Ian.
On Thu, Dec 5, 2013 at 12:41 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2013-12-04 at 11:35 +0000, Stefano Stabellini wrote:
>> [...]
>> To clarify: qemu-xen (repository named qemu-upstream-unstable) has the
>> patch. qemu-xen-traditional (repository named qemu-xen-unstable) does
>> not have the patch.
>
> FWIW I was going to reply and suggest that the commit message say e.g.
>
>     Backport to qemu-xen-traditional (qemu-xen-unstable.git)
>
> but maybe this confusion only exists until it is committed.
>
> Ian.

Actually, if you just use patch v1 instead of v2, you basically get what
you want :) (See, I was confused after all)

- Matthew
Matthew Daley writes ("[PATCH] xen_disk: fix memory leak"):
> On ioreq_release the full ioreq was memset to 0, loosing all the data
> and memory allocations inside the QEMUIOVector, which leads to a
> memory leak. Create a new function to specifically reset ioreq.
>
> Reported-by: Maik Wessler <maik.wessler@yahoo.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> Backport to qemu-xen-traditional.
>
> Signed-off-by: Matthew Daley <mattd@bugfuzz.com>

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

Thanks! (I also fixed "loosing" to "losing" in the commit message.)

Also, I have put this on my backport list.

Ian.