Luca Lesinigo
2013-Aug-26 23:09 UTC
CONFIG_SMP required for save/restore and migration: bug?
Dear list,

I was about to write here seeking help because my Ubuntu domUs were migrating just fine, while my Gentoo ones hung ("frozen") after attempting migration, or after saving and restoring on the same node.

After some more tinkering, I narrowed the issue down to my kernel config, which was stripped down to the bare minimum: no hardware drivers, no loadable modules... and no SMP. After enabling SMP support my domUs were able to migrate without problems. The same happens on 3.2.x and 3.10.7 kernels.

Before playing with many different kernel configurations, I searched Xen's wiki and googled around but could not find anything related to this issue.

So: with my config, non-SMP kernels freeze after restore, while activating CONFIG_SMP and nothing else makes the problem disappear. Maybe non-SMP kernels are supposed to migrate but there's a software problem, or maybe SMP is a requirement but is not documented (or at least I wasn't able to find it). Maybe it's not CONFIG_SMP but one of its dependencies.

Either way, I'm writing to the list to report this experience; maybe it is a (software or documentation) bug?

I'll keep an eye on the list over the next few days; if you need any other detail just ask.

Thank you for developing Xen, keep up the good work!

-- 
Luca Lesinigo
Ian Campbell
2013-Aug-27 08:54 UTC
Re: CONFIG_SMP required for save/restore and migration: bug?
On Tue, 2013-08-27 at 01:09 +0200, Luca Lesinigo wrote:
> So: with my config, non-SMP kernels will freeze after restore, while
> activating CONFIG_SMP and nothing else will make this problem
> disappear. Maybe non-SMP kernels are supposed to migrate but there's a
> software problem, or maybe SMP is a requirement but is not documented
> (or at least I wasn't able to find it). Maybe it's not CONFIG_SMP but
> one of its dependencies.

I can't see any explicit CONFIG_SMP so I think it must be one of its dependencies (or something even more complex). Perhaps we implicitly rely on CPU hotplug support or something similar.

Can you diff the working and non-working configs to see if anything has become implicitly (de)activated by changing the CONFIG_SMP setting?

The easiest way to get to the bottom of this is probably to do printk instrumentation of drivers/xen/manage.c:do_suspend(). For some of the really early bits you might want to use xen_raw_printk() rather than regular printk (to avoid problems with the console not being restored yet). That will require a debug hypervisor build though (or perhaps guest_loglvl=all on your hypervisor command line).

Turning on CONFIG_PM_DEBUG might also be interesting.

Ian.
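A minimal sketch of the config comparison suggested above, written in Python. The function names `parse_kconfig` and `diff_kconfig` are hypothetical, and the two sample fragments are invented for illustration; they are not taken from any real config in this thread. The idea is only to list every symbol whose value differs between two .config files, including symbols that silently appear or disappear.

```python
import re

# Line shapes found in a kernel .config file.
NOT_SET = re.compile(r"^# (CONFIG_\w+) is not set$")
ASSIGNED = re.compile(r"^(CONFIG_\w+)=(.*)$")


def parse_kconfig(text):
    """Map each CONFIG_ symbol to its value ('n' for 'is not set' lines)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = NOT_SET.match(line)
        if match:
            options[match.group(1)] = "n"
            continue
        match = ASSIGNED.match(line)
        if match:
            options[match.group(1)] = match.group(2)
    return options


def diff_kconfig(old, new):
    """Return {symbol: (old_value, new_value)} for symbols that differ.

    None means the symbol does not appear in that file at all, which
    usually means its dependencies were not met.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Invented fragments, for illustration only (assumption, not real data).
UP_SAMPLE = """\
CONFIG_XEN=y
# CONFIG_SMP is not set
CONFIG_TINY_RCU=y
"""

SMP_SAMPLE = """\
CONFIG_XEN=y
CONFIG_SMP=y
CONFIG_TREE_RCU=y
"""

if __name__ == "__main__":
    changes = diff_kconfig(parse_kconfig(UP_SAMPLE), parse_kconfig(SMP_SAMPLE))
    for name, (before, after) in changes.items():
        print(f"{name}: {before} -> {after}")
```

To compare real files, read each .config with `open(path).read()` and pass the text to `parse_kconfig`. The kernel tree also ships `scripts/diffconfig`, which serves the same purpose.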
Konrad Rzeszutek Wilk
2013-Aug-27 13:33 UTC
Re: CONFIG_SMP required for save/restore and migration: bug?
On Tue, Aug 27, 2013 at 01:09:41AM +0200, Luca Lesinigo wrote:
> Dear list, I was about to write here seeking for help because my Ubuntu domUs were migrating just fine while my Gentoo ones were just hanging "freezed" after attempting migration, or after saving and restoring on the same node.
>
> After some more tinkering around, I nailed the issue to my kernel config which was stripped down to the bare minimum: no hardware drivers, no loadable modules... and no SMP. After enabling SMP support my domU were able to migrate without problems. The same happens on 3.2.x and on 3.10.7 kernels.
>
> Before playing with many different kernel configurations, I searched Xen's wiki and googled around but could not find anything related to this issue.
>
> So: with my config, non-SMP kernels will freeze after restore, while activating CONFIG_SMP and nothing else will make this problem disappear. Maybe non-SMP kernels are supposed to migrate but there's a software problem, or maybe SMP is a requirement but is not documented (or at least I wasn't able to find it). Maybe it's not CONFIG_SMP but one of its dependencies.
>
> Either way, I'm writing to the list to report this experience, maybe it is a (software or documentation) bug?

It could also be a dependency issue. Meaning that if you undef CONFIG_SMP, the build won't pick up the rest of the code, say the code needed for migration. That would be odd and buggy.

If you compare the two .config files, what extra does CONFIG_SMP=y add?

Are these PVHVM or PV domUs?

Thanks!

> I'll keep an eye to the list in the next days, if you need any other detail just ask.
>
> Thank you for developing Xen, keep up with the good work!

Thank you.

> --
> Luca Lesinigo
Luca Lesinigo
2013-Aug-27 19:47 UTC
Re: CONFIG_SMP required for save/restore and migration: bug?
Ian, Konrad, thanks for your replies. Here are some more details.

On 27 Aug 2013, at 15:33, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> Are these PVHVM or PV domUs?

These are recent vanilla Linux kernels, loaded directly by Xen (no HVM loader), so they should be plain PV domUs.

On 27 Aug 2013, at 10:54, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> Can you diff the working and non-working configs to see if anything has become implicitly (de)activated by changing the CONFIG_SMP setting?

Enabling SMP generates some diff lines, mainly because on UP you're forced to use TINY RCU and on SMP you're forced to use something else (TREE being the only remaining choice nowadays).

This is my good old trusty (and rusty) non-SMP config, from a vanilla kernel.org 3.10.9 source:
http://pastebin.com/B7pC3gaC

It hangs as I described if you do xm save / xm restore on a node or try to migrate between nodes.

If you go into make menuconfig, turn on SMP support, change localversion, and do nothing else, you end up with this diff:
http://pastebin.com/g0syXCVe

And the resulting SMP kernel will save/restore and migrate just fine.

These kernels neither use nor support loadable modules, and you will need an Intel CPU of the "Core" series or newer to run them: no Intel Pentium 4 or older, and no AMD processors.

I compile them with gcc (Gentoo Hardened 4.6.3 p1.13, pie-0.5.2) 4.6.3 installed from Gentoo portage. My Xen hosts are 64-bit Xen 4.3.0 installed from Gentoo portage with the same gcc as the kernels.

The hypervisor command line is:
"watchdog nmi=dom0 com2=57600,8n1 vga=gfx-640x480x32 console=com2,vga dom0_max_vcpus=1 dom0_vcpus_pin dom0_mem=2048M tmem tmem_dedup tmem_compress"

I tested them with a config like this:

# --------
name = "foo"
kernel = "/srv/xen/images/bzImage-3.10.9-domU-UP"
memory = 1280
vcpus = 1
vif = [ 'bridge=bridge0', 'bridge=bridge1' ]
disk = [ 'phy:/dev/mapper/foo,xvda,w' ]
root = "/dev/xvda1 ro"
# --------

Note #1: yes, I tested with vcpus=1 even with the SMP kernel.
Note #2: I usually run them with extra="tmem" and while I can see it actually using tmem, it does not change the ability (or lack thereof) to restore/migrate.

I can provide the resulting binaries if anyone wants them, but I suspect/guess/hope it will be easily reproducible by others compiling their own kernels.

I should be able to test the same kernels on an Ubuntu-shipped Xen sometime in the future, but not in the next few days.

thanks,
-- 
Luca Lesinigo
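For anyone trying to reproduce this, the save/restore and migrate steps described above can be sketched as a small dry-run helper in Python. This only builds and prints the `xm` command lines; it does not run them. The helper names, the checkpoint path `/var/tmp/foo.chk`, and the target host `node2` are hypothetical, and whether the original tests used live migration is not stated, so `--live` is left off by default.

```python
import shlex


def save_restore_commands(domain, checkpoint):
    """Commands to save a domU to a file and restore it on the same node."""
    return [
        ["xm", "save", domain, checkpoint],
        ["xm", "restore", checkpoint],
    ]


def migrate_command(domain, target_host, live=False):
    """Command to migrate a domU to another node, optionally live."""
    command = ["xm", "migrate"]
    if live:
        command.append("--live")
    command += [domain, target_host]
    return command


def render(command):
    """Quote a command list so it can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    # "foo" is the domU name from the config above; the checkpoint path
    # and the target host are placeholders (assumptions).
    for command in save_restore_commands("foo", "/var/tmp/foo.chk"):
        print(render(command))
    print(render(migrate_command("foo", "node2")))
```

Running the printed commands against the UP kernel and then the SMP kernel, with everything else in the domU config unchanged, keeps the comparison down to the single CONFIG_SMP difference.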