Falko Tesch
2007-Mar-09 19:26 UTC
[Xen-users] Strange issue: DomU not saving when using direct HW access, plus non-working after restart
Hi everyone,

I have a really strange problem here.

My environment: Xen 3.0.2 on SuSE 10.1

In dom0 I disable eth0 with the following lines in /etc/init.d/boot.local:

    /sbin/modprobe pciback
    /bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/e1000/unbind
    /bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/pciback/new_slot
    /bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/pciback/bind

Then I start my domU with the following parameters:

    [...]
    pci = [ '01:00.0' ]
    dhcp = 'dhcp'
    [...]

Basically everything is fine so far. The domU boots and accesses the net via DHCP over its hardware-assigned eth0.

But when I reboot dom0, it tries to save the domU (which seems to be OK). After rebooting, dom0 tries to restore the domU, which fails and results in a "cold boot" of the domU (including a file system check on boot).

Now if I try to save and restore the domU manually, it fails and I get this message:

    Error: pci: Invalid config setting bus: none

Even stranger: if I then try to start the domU manually with xm create domU -c, DHCP just does not work! The domU finds the assigned hardware (eth0) but is not able to set up the network at all. And I can't get the domU back to work until I completely reboot the whole system (dom0).

Any ideas?

Regards
Falko

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
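One way to confirm that the unbind/bind sequence in the post took effect is to look at which driver currently owns the device. The following is a minimal, read-only sketch, assuming the usual sysfs layout in which /sys/bus/pci/devices/<address>/driver is a symlink to the bound driver. The function name pci_driver_of and the SYSFS_ROOT override are hypothetical helpers for illustration, not part of Xen.

```shell
#!/bin/sh
# Sketch: report which driver a PCI device is currently bound to.
# Assumption: <sysfs>/bus/pci/devices/<address>/driver is a symlink to the driver.
# SYSFS_ROOT is a hypothetical override so the function can be tried on a fake tree.

pci_driver_of() {
    # $1 = full PCI address, e.g. 0000:01:00.0
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name (e.g. .../drivers/pciback).
        basename "$(readlink "$link")"
    else
        # No driver symlink: the device is not bound to any driver.
        echo none
    fi
}
```

Called as `pci_driver_of 0000:01:00.0` in dom0, this would be expected to print pciback once the boot.local lines have run, and e1000 if the card is still owned by the native driver.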
Mark Williamson
2007-Mar-11 02:59 UTC
Re: [Xen-users] Strange issue: DomU not saving when using direct HW access, plus non-working after restart
> I have a really strange problem here.
>
> My environment: Xen 3.0.2 on SuSE 10.1
>
> In dom0 I disable eth0 with the following lines in /etc/init.d/boot.local:
> /sbin/modprobe pciback
> /bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/e1000/unbind
> /bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/pciback/new_slot
> /bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/pciback/bind
>
> Then I start my domU with the following parameters:
> [...]
> pci = [ '01:00.0' ]
> dhcp = 'dhcp'
> [...]
>
> Basically everything is fine so far. The domU boots and accesses the net
> via DHCP over its hardware-assigned eth0.

Cool.

> But when I reboot dom0, it tries to save the domU (which seems to be OK).
> After rebooting, dom0 tries to restore the domU, which fails and results
> in a "cold boot" of the domU (including a file system check on boot).
>
> Now if I try to save and restore the domU manually, it fails and I get
> this message:
> Error: pci: Invalid config setting bus: none
>
> Even stranger: if I then try to start the domU manually with
> xm create domU -c, DHCP just does not work!
> The domU finds the assigned hardware (eth0) but is not able to set up the
> network at all. And I can't get the domU back to work until I completely
> reboot the whole system (dom0).

Suspend / resume isn't supported for domains that have direct access to PCI devices - I'm surprised the tools even allow it (they probably shouldn't!).

It's strange that subsequently starting the domain manually also fails - are you sure that the domain you attempted to restore wasn't still hanging around somewhere? If it really is failing when there are no other domains fighting for that card, it could be that the state of the Ethernet card (or, I guess, maybe that of the Xen pciback driver) has been messed up by the failed operations, and that's why you need a whole-machine reboot.

The simple fix is to disable the automatic suspend/resume of that domain on reboot; have dom0 shut it down and boot it again instead. Other domains that don't have direct hardware access may still be safely suspended and resumed.

Something I'd be interested in: once you've got into the wedged state that requires a dom0 reboot, can you bring up that Ethernet device in dom0 (by rebinding it to the e1000 driver)? This would tell us whether it is the device that is wedged or pciback. Please note that trying this (or starting new driver domains once you've got into the wedged state, or resuming a saved driver domain, either explicitly or at dom0 reboot) is quite possibly going to send weird commands to your NIC; I'd not expect this to actually harm modern hardware, but it's not impossible that you could get some instability / corruption on the host system (not just the domU).

So, if it's *not* an important / production box containing any useful data, I'd be interested if you could experiment a bit more - otherwise just disable the automatic suspend/resume on dom0 reboot for that domain and your problem will be solved.

Does that answer your question? It's great to have users / testers of the driver domains functionality, so please let us know how you get on!

Cheers,
Mark

--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
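The rebinding experiment described in this message could look roughly like the reverse of the boot.local lines from the original post. The sketch below is an assumption-laden illustration, not a tested procedure: it assumes pciback exposes unbind and remove_slot files next to the new_slot and bind files used in the post, and the names rebind_to_dom0, write_sysfs, SYSFS_PCI and APPLY are hypothetical. By default it only prints the commands; nothing is written to sysfs unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: hand a PCI device back from pciback to its native dom0 driver.
# Assumption: pciback provides "unbind" and "remove_slot" alongside the
# "new_slot" and "bind" files used in the original post.
# Dry run by default: the steps are printed, not executed, unless APPLY=1.

SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/drivers}"

write_sysfs() {
    # $1 = value to write, $2 = target sysfs file
    if [ "${APPLY:-0}" = "1" ]; then
        printf '%s' "$1" > "$2"
    else
        printf 'echo -n %s > %s\n' "$1" "$2"
    fi
}

rebind_to_dom0() {
    # $1 = full PCI address (e.g. 0000:01:00.0), $2 = native driver (e.g. e1000)
    bdf="$1"
    driver="$2"
    if [ -z "$bdf" ] || [ -z "$driver" ]; then
        echo "usage: rebind_to_dom0 <pci-address> <driver>" >&2
        return 1
    fi
    write_sysfs "$bdf" "$SYSFS_PCI/pciback/unbind"       # detach from pciback
    write_sysfs "$bdf" "$SYSFS_PCI/pciback/remove_slot"  # stop pciback claiming the slot
    write_sysfs "$bdf" "$SYSFS_PCI/$driver/bind"         # attach to the native driver
}
```

With the values from the thread this would be invoked as `rebind_to_dom0 0000:01:00.0 e1000`, with the domU shut down first. Whether the interface then comes up in dom0 is the diagnostic of interest; the warning above about weird commands reaching the NIC applies to this experiment as well.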
Falko Tesch
2007-Mar-11 08:01 UTC
Re: [Xen-users] Strange issue: DomU not saving when using direct HW access, plus non-working after restart
Hi Mark,

thanks a lot for your answer!

Since it is somewhat of a production box (all email and web services run on it), I'll need to restart it some coming night. I'll definitely keep you updated.

Thanks again.

Regards
Falko
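Before such a restart, it may be worth checking whether dom0 is configured to save domains on shutdown at all. The sketch below assumes the stock xendomains init script is in use and that it reads /etc/sysconfig/xendomains, where an empty XENDOMAINS_SAVE means domains are shut down rather than saved. Both the path and that behaviour should be verified on the actual system, and as far as this sketch assumes, the setting affects all domains rather than a single one. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: check whether the xendomains init script would save domains on shutdown.
# Assumptions: the config lives in /etc/sysconfig/xendomains and an empty
# XENDOMAINS_SAVE disables saving (domains are shut down instead).

xendomains_save_dir() {
    # $1 = config file (optional); prints the configured save directory
    conf="${1:-/etc/sysconfig/xendomains}"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # Source the file in a subshell so the caller's environment stays untouched.
    ( XENDOMAINS_SAVE=""; . "$conf"; printf '%s\n' "$XENDOMAINS_SAVE" )
}

save_on_shutdown() {
    # Succeeds (exit 0) if a save directory is configured, fails otherwise.
    dir="$(xendomains_save_dir "$1")" || return 2
    [ -n "$dir" ]
}
```

A call such as `save_on_shutdown && echo "domains will be saved on dom0 shutdown"` gives a quick yes/no answer. If it reports that saving is enabled, setting XENDOMAINS_SAVE="" in that file would be one way to get the shutdown-and-boot behaviour recommended in the thread.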