Hello,

I have a minor problem with the ib_mthca driver in Linux running under Xen in a DomU. If I keep the ib_mthca driver loaded in the kernel while shutting down the DomU, the next start of the DomU resets the machine.

A trivial fix is possible: either rmmod ib_mthca before shutting down the DomU, or set the .shutdown section to the same value as the .remove section in the pci_driver structure.

Are you willing to apply a patch that sets .shutdown in the mainline IB driver in Linux? Or is it something that should be fixed by the Xen guys?

--
Lukáš Hejtmánek
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
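For readers unfamiliar with the proposal, here is a minimal userspace sketch of what "set .shutdown to the same value as .remove" means. It is not the mthca source: `fake_pci_dev`, `fake_pci_driver`, `fake_mthca_remove_one` and `simulate_shutdown` are hypothetical stand-ins that model only the two callbacks discussed in the message above, on the assumption that the PCI core calls `.shutdown` at shutdown time only when the driver provides one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct pci_dev and
 * struct pci_driver; only the two callbacks under discussion exist here. */
struct fake_pci_dev {
    int quiesced;   /* 1 once the device has been torn down */
};

struct fake_pci_driver {
    const char *name;
    void (*remove)(struct fake_pci_dev *dev);
    void (*shutdown)(struct fake_pci_dev *dev);
};

/* Stand-in for the driver's remove routine: leaves the device quiesced. */
static void fake_mthca_remove_one(struct fake_pci_dev *dev)
{
    dev->quiesced = 1;
}

/* The proposed change: .shutdown points at the same function as .remove. */
static struct fake_pci_driver fake_mthca_driver = {
    .name     = "ib_mthca",
    .remove   = fake_mthca_remove_one,
    .shutdown = fake_mthca_remove_one,
};

/* Models guest shutdown: the callback runs only if the driver set one.
 * Returns 1 if the device ends up quiesced, 0 otherwise. */
static int simulate_shutdown(struct fake_pci_driver *drv,
                             struct fake_pci_dev *dev)
{
    assert(drv != NULL && dev != NULL);
    if (drv->shutdown != NULL)
        drv->shutdown(dev);
    return dev->quiesced;
}
```

With `.shutdown` left as NULL, `simulate_shutdown()` returns 0 and the device is left in whatever state it was in, which corresponds to the case that triggers the reset on the next DomU start.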
> If I keep the ib_mthca driver loaded in the kernel while shutting down the
> DomU, the next start of the DomU resets the machine.
>
> A trivial fix is possible: either rmmod ib_mthca before shutting down the
> DomU, or set the .shutdown section to the same value as the .remove section
> in the pci_driver structure.
>
> Are you willing to apply a patch that sets .shutdown in the mainline IB
> driver in Linux? Or is it something that should be fixed by the Xen guys?

I would like to understand the underlying problem before blindly setting the .shutdown method of the ib_mthca PCI driver section. The mthca driver should be able to handle the hardware being in an arbitrary state when it is reloaded -- that is why it resets the adapter very early during initialization. Do you have any idea what is going wrong in the case where the machine resets?

Very few other PCI drivers have a .shutdown method, and I don't know of any that just duplicate the .remove method. So rather than just having a band-aid for mthca that probably leaves the same problem for every other driver, I would prefer to understand the problem first, and if it is indeed something specific to mthca, then fix the underlying issue in mthca with a simpler shutdown method.

I guess one way to debug this would be to delete operations from mthca_remove_one() one by one (starting from the end of the function), and each time try restarting your domU after doing rmmod ib_mthca. When you reach the really necessary thing, then you'll see the reset.

 - R.
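The debugging procedure suggested above amounts to a backwards linear search over the teardown steps. A rough sketch follows, assuming the steps can be numbered 0..total-1 in the order they run. `resets_fn`, `find_required_step` and `example_probe` are hypothetical names; in practice each probe is a manual rebuild, rmmod and DomU restart, not a function call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: returns 1 if the box resets on the next DomU start
 * when only the first `kept` teardown steps were run, 0 otherwise. */
typedef int (*resets_fn)(size_t kept);

/* Drop one trailing step at a time, starting from the end of the teardown.
 * Returns the 0-based index of the first step whose omission triggers the
 * reset, or `total` if no reset is ever observed. */
static size_t find_required_step(size_t total, resets_fn resets_after)
{
    assert(resets_after != NULL);
    for (size_t kept = total; kept-- > 0; ) {
        /* steps [0, kept) run; step `kept` and all later steps are dropped */
        if (resets_after(kept))
            return kept;
    }
    return total;
}

/* Example probe for illustration only: pretends step 3 is the one that
 * must run, so the box resets whenever fewer than 4 steps are kept. */
static int example_probe(size_t kept)
{
    return kept < 4;
}
```

With six teardown steps, `find_required_step(6, example_probe)` stops at index 3 after three probes. A linear walk from the end matches the suggestion as written; a binary search would need fewer restarts but assumes that only one step matters.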
Lukas Hejtmanek
2008-Jan-18 20:48 UTC
[Xen-devel] Re: [ofa-general] MTHCA driver for Linux
On Fri, Jan 18, 2008 at 12:36:00PM -0800, Roland Dreier wrote:
> I would like to understand the underlying problem before blindly
> setting the .shutdown method of the ib_mthca PCI driver section. The
> mthca driver should be able to handle the hardware being in an
> arbitrary state when it is reloaded -- that is why it resets the
> adapter very early during initialization. Do you have any idea what
> is going wrong in the case where the machine resets?

The PCI front-end of Xen is wrong. It somehow touches the device when the DomU is starting. At that point it hard-resets the box, if a DomU has already been started with the IB driver since the box booted. If the IB device is properly shut down (rmmod ib_mthca), the PCI front-end driver does not reset the box at DomU start-up.

--
Lukáš Hejtmánek
> The PCI front-end of Xen is wrong. It somehow touches the device when the
> DomU is starting. At that point it hard-resets the box, if a DomU has
> already been started with the IB driver since the box booted.

I'm not sure I'm understanding what you're saying. Do you mean that you've found a bug in the Xen PCI front-end, or do you still think we should fix this by changing the mthca driver?
Lukas Hejtmanek
2008-Jan-18 22:50 UTC
[Xen-devel] Re: [ofa-general] MTHCA driver for Linux
On Fri, Jan 18, 2008 at 01:38:47PM -0800, Roland Dreier wrote:
> I'm not sure I'm understanding what you're saying. Do you mean that
> you've found a bug in the Xen pci front-end, or do you still think we
> should fix this by changing the mthca driver?

I'm not sure where exactly the bug is. The bug is triggered by the Xen PCI front-end driver in the DomU. The workaround is either to rmmod the mthca driver or to merge the .shutdown and .remove sections of the mthca driver (in the module that runs in the DomU kernel).

I'm not sure where the bug is, as the driver should leave the device in a correct state. The current Linux kernel does not do that for most devices. There was a similar problem with the e1000 driver: if the driver was not removed before reboot, the system froze in BIOS code. That one was fixed in the motherboard's BIOS. But I believe drivers should not leave the device as is.

Maybe people from Xen could give their opinion on what should be done here.

--
Lukáš Hejtmánek