Thomas Friebel
2007-May-23 14:49 UTC
[Xen-devel] [PV] PCI passthrough and interrupt sharing
Hi,

I have a setup with two NICs. One of the cards is assigned to a guest using pcifront/back. The other one is used for host networking.

When the guest NIC's IRQ is not shared with another device, everything works fine. When both NICs share the same IRQ, sooner or later host and guest networking both hang.

Is this a known issue or is it supposed to work correctly? I'm using 3.1-testing.

Kind regards
--
Thomas Friebel
Operating System Research Center
AMD Saxony, Dresden, Germany

--
Legal Information:
AMD Saxony Limited Liability Company & Co. KG
Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register Court Dresden: HRA 4896
General Partner authorized to represent: AMD Saxony LLC (Wilmington, Delaware, US)
General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Thomas Friebel wrote:
> When the guest NIC's IRQ is not shared with another device, everything
> works fine. When both NICs share the same IRQ, sooner or later host and
> guest networking both hang.

It's not an issue that I've previously encountered, but I'd like to try to reproduce and fix it. Can you describe the steps you took to get the system into a state where host and guest networking will eventually hang?

-Chris
Keir Fraser
2007-May-24 08:40 UTC
Re: [Xen-devel] [PV] PCI passthrough and interrupt sharing
On 23/5/07 15:49, "Thomas Friebel" <thomas.friebel@amd.com> wrote:
> When the guest NIC's IRQ is not shared with another device, everything
> works fine. When both NICs share the same IRQ, sooner or later host and
> guest networking both hang.
>
> Is this a known issue or is it supposed to work correctly?

It's supposed to work but there have been enough complaints about this scenario that I'm sure there must be a bug somewhere. I haven't tried to reproduce this problem as yet.

-- Keir
Thomas Friebel
2007-May-24 12:27 UTC
Re: [Xen-devel] [PV] PCI passthrough and interrupt sharing
Hello Chris,

thanks for your interest in this issue.

On Wed, 2007-05-23 at 14:42 -0400, Chris wrote:
> Thomas Friebel wrote:
> > When the guest NIC's IRQ is not shared with another device, everything
> > works fine. When both NICs share the same IRQ, sooner or later host and
> > guest networking both hang.
>
> It's not an issue that I've previously encountered, but I'd like to try
> to reproduce and fix it. Can you describe the steps you took to get the
> system into a state where host and guest networking will eventually hang.

All I do is stress the network a bit by running

    ssh <physical-guest-NIC> cat /dev/zero > /dev/null

on a second box. It usually takes 2-5 seconds until it stops working. It never took more than 30 seconds.

IMO I didn't do anything special in this setup. I can send you the kernel configs and a more detailed description of the setup if you like.

Kind regards
--
Thomas Friebel
Thomas Friebel wrote:
> IMO I didn't do anything special in this setup. I can send you the kernel
> configs and a more detailed description of the setup if you like.

Sure, send me what you can (b/c of size, perhaps kernel config shouldn't be CC'ed to the list, but directly to me is OK). I'm also curious what hardware you're using, especially the NICs. Seems like finding a common theme is a good place to start looking for the cause.

Any details you can provide about how the devices get on the same IRQ would be useful for recreating your environment. Do you do anything to explicitly force the NICs to share an IRQ or does PnP just cause this condition automatically?

Cheers,
Chris
Thomas Friebel
2007-May-24 15:54 UTC
Re: [Xen-devel] [PV] PCI passthrough and interrupt sharing
On Thu, 2007-05-24 at 10:26 -0400, Chris wrote:
> Thomas Friebel wrote:
> > IMO I didn't do anything special in this setup. I can send you the kernel
> > configs and a more detailed description of the setup if you like.
>
> Sure, send me what you can (b/c of size, perhaps kernel config shouldn't
> be CC'ed to the list, but directly to me is OK). I'm also curious what
> hardware you're using, especially the NICs. Seems like finding a common
> theme is a good place to start looking for the cause.

I use an ASUS M2NPV-VM board with an onboard nVidia nforce 430 NIC, and an additional RTL 8169 PCI NIC by Netgear:

    00:14.0 Ethernet controller: nVidia Corporation MCP51 Ethernet Controller (rev a3) (vendor/device = 0x10de/0x0268)
    04:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10) (vendor/device = 0x10ec/0x8169)

The same problems showed up when using a 3Com Fast EtherLink XL PCI NIC in combination with the same board.

> Any details you can provide about how the devices get on the same IRQ
> would be useful for recreating your environment. Do you do anything to
> explicitly force the NICs to share an IRQ or does PnP just cause this
> condition automatically?

I don't know if it's possible to explicitly force the NICs to share an IRQ. What I do is disable ACPI in dom0 (acpi=off kernel parameter) and activate each onboard device in the BIOS setup. The other BIOS settings like "PnP aware OS" don't matter.

Cheers,
--
Thomas Friebel
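When recreating a setup like this, it helps to confirm which IRQs are actually shared before starting a test. The following is a minimal sketch, not something posted in the thread: `shared_irqs` is a hypothetical helper name, and it assumes that /proc/interrupts lists the devices on a shared line separated by commas at the end of the line.

```shell
# Print each IRQ whose /proc/interrupts line names more than one device.
# $1: file to read (hypothetical override for testing; defaults to /proc/interrupts)
shared_irqs() {
    awk '/^ *[0-9]+:/ {
        # Walk back from the last field while the preceding field ends in a
        # comma; those fields form the comma-separated device list.
        n = NF
        while (n > 1 && $(n-1) ~ /,$/) n--
        if (n < NF) {
            irq = $1; sub(/:$/, "", irq)
            devs = $n
            for (i = n + 1; i <= NF; i++) devs = devs " " $i
            print irq ": " devs
        }
    }' "${1:-/proc/interrupts}"
}
```

Run without arguments in dom0, it reads the live /proc/interrupts and prints one line per shared IRQ, in the form IRQ number followed by the device names.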
Thomas Friebel wrote:
> both config files - dom0, domU, are attached to this mail.

Thanks for those configs. From the diversity of hardware on which this issue appears, it would seem this is very likely a software bug. It'll take me a bit to collect and prepare hardware for a test system and start poking around, but it's in the queue.

-Chris
Thomas Friebel wrote:
> When both NICs share the same IRQ, sooner or later host and
> guest networking both hang.

I'm having trouble reproducing this behavior. My test setup contains two physical NICs that share an IRQ. Immediately after a fresh boot, 'cat /proc/interrupts' shows:

           CPU0       CPU1
    ...
     10:         74        489  Phys-irq  eth2, peth1
    ...

ACPI was disabled by appending 'acpi=off' to the dom0 kernel parameters. Similarly, IRQ sharing was caused by adding the "pirq=" dom0 kernel parameter.

Then I used pciback late-binding to "hide" eth2 from dom0 and make it accessible to domU. After completing the late-binding process (described in the users manual), /proc/interrupts gives:

           CPU0       CPU1
    ...
     10:    8408301        516  Phys-irq  peth1
    ...
    264:         93          0  Dynamic-irq  pciback
    ...

Apparently IRQ sharing no longer occurs after the pciback driver binds to the device. Let me know if this is what you observed.

With the NIC bound to pciback and domU started, I began the network stress test as per your suggestion:

    ssh <physical-guest-NIC> cat /dev/zero > /dev/null

After a long wait (it's been running for at least 15 minutes now) connectivity in both dom0 and domU is (a bit sluggish but) still working normally.

-Chris
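One way to tell a stalled interrupt line from a merely sluggish one is to sample the counters for an IRQ twice and check whether they advance. This is a diagnostic sketch only and was not used in the thread; `irq_total` is a hypothetical helper name, and it assumes the /proc/interrupts layout shown above, with one counter column per CPU following the IRQ number.

```shell
# Print the sum of the per-CPU interrupt counters for one IRQ.
# $1: IRQ number, e.g. 10
# $2: file to read (hypothetical override for testing; defaults to /proc/interrupts)
irq_total() {
    awk -v irq="$1" '$1 == (irq ":") {
        total = 0
        # Counter columns are the purely numeric fields after the IRQ number.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        printf "%d\n", total
    }' "${2:-/proc/interrupts}"
}

# Example use (left as a comment so that sourcing this file has no side effects):
#   a=$(irq_total 10); sleep 5; b=$(irq_total 10)
#   echo "IRQ 10 advanced by $((b - a)) interrupts in 5 seconds"
```

If the total for the physical IRQ stops changing while traffic should be flowing, that would suggest interrupts are no longer being delivered, which narrows down where to look.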
Thomas Friebel
2007-Jun-07 13:56 UTC
Re: [Xen-devel] [PV] PCI passthrough and interrupt sharing
Hello Chris,

thank you for spending time on this issue.

On Tue, 2007-06-05 at 16:29 -0400, Chris wrote:
> Thomas Friebel wrote:
> > When both NICs share the same IRQ, sooner or later host and
> > guest networking both hang.
>
> I'm having trouble reproducing this behavior.

What I do is the following:

Fetch 3.1-testing, compile a 64bit kernel with the default config plus sATA (to access the root fs), plus RTL8169 and nForce NIC support (the two used NICs).

Start the kernel with acpi=off. Do an 'ifconfig eth2 up' (eth0 is started by the init scripts) and 'cat /proc/interrupts' shows:

    11: 62182 0 Phys-irq peth0, eth2

Then, after executing

    # SLOT=0000:04:08.0
    # echo -n $SLOT > /sys/bus/pci/drivers/r8169/unbind
    # echo -n $SLOT > /sys/bus/pci/drivers/pciback/new_slot
    # echo -n $SLOT > /sys/bus/pci/drivers/pciback/bind

'cat /proc/interrupts' shows:

    11: 74742 0 Phys-irq peth0

and no line for pciback. Then I start the driver domain and 'cat /proc/interrupts' shows:

    11: 74742 0 Phys-irq peth0
    ...
    264: 126 0 Dynamic-irq pciback

The guest then successfully loads the r8169 driver and the network is working. After some time, even without stressing the driver domain network, the host hangs.

> After a long wait (it's been running for at least 15 minutes now)
> connectivity in both dom0 and domU are (a bit sluggish but) still
> working normally.

15 minutes is enough time. For me it never worked for longer than 4 minutes at a time.

Maybe it's a problem of one of the two involved NIC drivers?

Kind regards
--
Thomas Friebel
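The three sysfs writes quoted in the message above can be wrapped in a small function so the sequence is easier to repeat on other hardware. This is a sketch under assumptions: `pciback_late_bind` is a hypothetical name, and the optional third argument (an alternative driver directory) exists only so the function can be dry-run against a scratch directory. The sysfs paths themselves are the ones from the message.

```shell
# Late-bind a PCI device to pciback, following the steps quoted in the thread.
# $1: PCI slot, e.g. 0000:04:08.0
# $2: driver currently bound in dom0, e.g. r8169
# $3: PCI driver directory (hypothetical override for dry runs;
#     defaults to /sys/bus/pci/drivers)
pciback_late_bind() {
    slot=$1
    driver=$2
    drivers=${3:-/sys/bus/pci/drivers}

    if [ ! -d "$drivers/pciback" ]; then
        echo "pciback driver not available under $drivers" >&2
        return 1
    fi

    # Release the device from its dom0 driver, if it is bound to one.
    if [ -e "$drivers/$driver/$slot" ]; then
        printf '%s' "$slot" > "$drivers/$driver/unbind" || return 1
    fi

    # Tell pciback about the slot, then bind the device to it.
    printf '%s' "$slot" > "$drivers/pciback/new_slot" || return 1
    printf '%s' "$slot" > "$drivers/pciback/bind" || return 1
}
```

On a real dom0 it would be called as root, for example `pciback_late_bind 0000:04:08.0 r8169`. `printf '%s'` is used instead of `echo -n` only because it behaves the same across shells.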
Thomas Friebel wrote:
> Fetch 3.1-testing, compile a 64bit kernel with the default config plus
> sATA (to access the root fs), plus RTL8169 and nForce NIC support (the
> two used NICs).

A difference in my setup (that I can't readily resolve) is that I compile a 32bit kernel. Also, the drivers for my NICs obviously aren't the same as yours.

> 15 minutes are enough time. For me it didn't work longer than 4 minutes
> a time.

Just to be thorough I let the stress test run 15+ hours with no sign of problems; it was still running when I came in the next day.

> Maybe it's a problem of one of the two involved NIC drivers?

Is it feasible for you to reproduce the problem on totally different hardware? At least different NICs, but the greater the difference the better.

-Chris