Andrew Bobulsky
2013-May-07 10:38 UTC
Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
Hello List!

I'm having another [rather fruitless] go at trying to get PCIe passthrough to work on my Radeon 6990 card(s). I have an i7 920 chip in a Gigabyte GA-EX58-EXTREME board that I've flashed a modded BIOS into to add VT-d support... I found the BIOS image on a BIOS modding forum maybe a year or two ago.

I stuck an extra Highpoint RocketU 1144A USB 3 card into the board, because I know it works *very well* with IOMMU and its architecture is really convenient... each port on the back is essentially its own PCIe device[1]. I was able to "xl pci-assignable-add" the USB controllers, and attach and detach them at will to a Server 2012 DomU. The dmesg output from that event looked like this:

> [49857.921550] xhci_hcd 0000:06:00.0: remove, state 4
> [49857.921555] usb usb15: USB disconnect, device number 1
> [49857.921600] xHCI xhci_drop_endpoint called for root hub
> [49857.921602] xHCI xhci_check_bandwidth called for root hub
> [49857.921681] xhci_hcd 0000:06:00.0: USB bus 15 deregistered
> [49857.921686] xhci_hcd 0000:06:00.0: remove, state 1
> [49857.921689] usb usb13: USB disconnect, device number 1
> [49857.921691] usb 13-1: USB disconnect, device number 4
> [49857.953278] xHCI xhci_drop_endpoint called for root hub
> [49857.953280] xHCI xhci_check_bandwidth called for root hub
> [49857.963158] xhci_hcd 0000:06:00.0: USB bus 13 deregistered
> [49857.963455] pciback 0000:06:00.0: seizing device
> [49857.963500] xen: registering gsi 17 triggering 0 polarity 1
> [49857.963503] Already setup the GSI :17
> [49857.963514] pciback 0000:06:00.0: MSI-X preparation failed (-38)

Nonetheless, it works quite well! I played audio through a USB headset from the DomU to confirm it as well.
However, when I try to "xl pci-assignable-add" one of my VGA controllers from the Radeon, the action completes, and "xl pci-assignable-list" shows the device as available, but "xl pci-attach" never completes, and attempting to "xl pci-assignable-add" the HDMI audio device never returns to the CLI either.

There is no visible output in dmesg from the attempt on the HDMI audio device, but when I run it against the VGA controller, I get this:

> [55817.715309] pciback 0000:0e:00.0: seizing device
> [55817.737444] ------------[ cut here ]------------
> [55817.737447] kernel BUG at drivers/pci/msi.c:346!
> [55817.737449] invalid opcode: 0000 [#1] PREEMPT SMP
> [55817.737451] Modules linked in: xt_physdev iptable_filter ip_tables x_tables tun parport_pc ppdev lp parport bnep rfcomm bluetooth rfkill crc16 cpufreq_stats binfmt_misc fuse bridge stp llc ext2 loop snd_hda_codec_hdmi snd_hda_codec_realtek joydev mperf coretemp crc32c_intel fglrx(PO) snd_hda_intel snd_hda_codec snd_usb_audio microcode hid_generic mxm_wmi snd_usbmidi_lib evdev psmouse snd_seq_midi snd_seq_midi_event i2c_i801 pcspkr tpm_tis snd_hwdep tpm snd_rawmidi serio_raw snd_pcm i2c_core tpm_bios snd_seq snd_timer snd_seq_device lpc_ich mfd_core snd ehci_pci soundcore snd_page_alloc wmi xhci_hcd button processor thermal_sys sg sr_mod cdrom ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_mod sd_mod crc_t10dif usb_storage usbhid hid ahci libahci uhci_hcd ehci_hcd usbcore usb_common e1000e
> [55817.737482] CPU 6
> [55817.737484] Pid: 18055, comm: xl Tainted: P O 3.8.11 #1 Gigabyte Technology Co., Ltd. EX58-EXTREME/EX58-EXTREME
> [55817.737485] RIP: e030:[<ffffffff811eab09>] [<ffffffff811eab09>] free_msi_irqs+0x5d/0x11b
> [55817.737490] RSP: e02b:ffff88034211dd08 EFLAGS: 00010282
> [55817.737491] RAX: ffff880420bbd600 RBX: ffff88042096fa80 RCX: 0000000000000000
> [55817.737492] RDX: 0000000000000000 RSI: 0000000000000091 RDI: 0000000000000011
> [55817.737493] RBP: ffff880421f81000 R08: ffff88042096fa80 R09: ffff88034211dce4
> [55817.737494] R10: ffff88034211dd16 R11: 0000000000000000 R12: ffff880421f81858
> [55817.737495] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
> [55817.737498] FS: 00007fa609475740(0000) GS:ffff88043a2c0000(0000) knlGS:0000000000000000
> [55817.737499] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [55817.737500] CR2: ffffffffff600400 CR3: 000000033a673000 CR4: 0000000000002660
> [55817.737501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [55817.737503] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [55817.737504] Process xl (pid: 18055, threadinfo ffff88034211c000, task ffff8804217b5ca0)
> [55817.737504] Stack:
> [55817.737505] 00000000000000a2 ffff880421f81000 0000000000000000 ffff8803ca2ab6c0
> [55817.737507] ffff880421f81098 ffff880421f810f8 ffff8803ca2abc00 ffffffff811eb21d
> [55817.737509] ffff880421f81000 ffffffff81240822 ffff880421f81098 ffff880421f81000
> [55817.737510] Call Trace:
> [55817.737513] [<ffffffff811eb21d>] ? pci_disable_msi+0x28/0x41
> [55817.737516] [<ffffffff81240822>] ? xen_pcibk_reset_device+0x3a/0xa0
> [55817.737518] [<ffffffff8123fd9f>] ? pcistub_init_device+0x167/0x19c
> [55817.737521] [<ffffffff81108661>] ? __kmalloc+0xd6/0xe2
> [55817.737523] [<ffffffff8123ff08>] ? pcistub_probe+0x134/0x1b8
> [55817.737525] [<ffffffff811df0f5>] ? local_pci_probe+0x37/0x5d
> [55817.737527] [<ffffffff811dff7c>] ? pci_device_probe+0xc2/0xe3
> [55817.737529] [<ffffffff81277d71>] ? driver_probe_device+0xa1/0x1ac
> [55817.737532] [<ffffffff81276e38>] ? driver_bind+0x7e/0xc7
> [55817.737534] [<ffffffff81165859>] ? sysfs_write_file+0xd3/0x10f
> [55817.737537] [<ffffffff8110fb7d>] ? vfs_write+0xa4/0xfe
> [55817.737539] [<ffffffff813c22e5>] ? _raw_spin_lock+0xe/0x2a
> [55817.737541] [<ffffffff8110fcc8>] ? sys_write+0x58/0x92
> [55817.737543] [<ffffffff813c7f29>] ? system_call_fastpath+0x16/0x1b
> [55817.737544] Code: 8a 3b 45 31 f6 41 d0 ef 44 89 f9 45 89 ef 83 e1 07 41 d3 e7 eb 1c 8b 7b 0c 44 01 f7 e8 59 74 eb ff 48 83 b8 90 00 00 00 00 74 04 <0f> 0b eb fe 41 ff c6 45 39 fe 7c df 48 8b 5b 10 48 83 eb 10 48
> [55817.737560] RIP [<ffffffff811eab09>] free_msi_irqs+0x5d/0x11b
> [55817.737562] RSP <ffff88034211dd08>
> [55817.737563] ---[ end trace e1c5a8a903358804 ]---

Any chance anyone could help me identify the source of this problem? Can I work around it with software, or do I need a different motherboard or video card to make it work?

I'm running Debian 6.0.7 x86_64 with Kernel 3.8.11, compiled with the following config: http://tny.cz/e28f7351

Thanks!
-Andrew Bobulsky

[1] some lspci -t for anyone who would like: http://i.imgur.com/9HpiGjf.png

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
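For anyone triaging a trace like the one in the post above, it can help to pull out just the BUG location and the chain of functions that led to it. The following is a minimal Python sketch, not anything from the thread: the names `summarize_oops`, `BUG_RE` and `FRAME_RE` are made up for illustration, and the regular expressions assume the 3.x-era dmesg layout seen here ("kernel BUG at file:line!" followed by a "Call Trace:" section that ends at the "Code:" line).

```python
import re

# Matches e.g. "kernel BUG at drivers/pci/msi.c:346!"
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")

# Matches a call-trace frame such as
# "[<ffffffff811eb21d>] ? pci_disable_msi+0x28/0x41"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+\??\s*(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(text):
    """Return ((file, line) or None, [function names in the call trace])."""
    bug = BUG_RE.search(text)
    location = (bug.group("file"), int(bug.group("line"))) if bug else None

    frames = []
    in_trace = False
    for line in text.splitlines():
        if "Call Trace:" in line:
            in_trace = True          # frames start on the following lines
            continue
        if not in_trace:
            continue
        if "Code:" in line:          # the Code: dump marks the end of the trace
            break
        match = FRAME_RE.search(line)
        if match:
            frames.append(match.group("sym"))
    return location, frames
```

Fed the dmesg text from the post, this should report `drivers/pci/msi.c` line 346 and a trace that starts at `pci_disable_msi` and runs through `xen_pcibk_reset_device` and `pcistub_probe`, which is the path xen-pciback takes while seizing the device.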
George Dunlap
2013-May-08 08:45 UTC
Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
On Tue, May 7, 2013 at 11:38 AM, Andrew Bobulsky <rulerof@gmail.com> wrote:
> [...]
> Any chance anyone could help me identify the source of this problem? Can I
> work around it with software, or do I need a different motherboard or video
> card to make it work?

cc'ing Konrad and a couple of other people who might be able to take a look at the BUG

> I'm running Debian 6.0.7 x86_64 with Kernel 3.8.11, compiled with the
> following config: http://tny.cz/e28f7351

What version of Xen are you using?

Thanks,
-George
Andrew Bobulsky
2013-May-08 16:48 UTC
Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
On May 8, 2013, at 4:45 AM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Tue, May 7, 2013 at 11:38 AM, Andrew Bobulsky <rulerof@gmail.com> wrote:
>> [...]
>> I'm running Debian 6.0.7 x86_64 with Kernel 3.8.11, compiled with the
>> following config: http://tny.cz/e28f7351
>
> What version of Xen are you using?
>
> Thanks,
> -George

Heh, of all the details to leave out! I'm on Xen 4.2.1, built from the tar ball on the website.

Thanks,
Andrew
Konrad Rzeszutek Wilk
2013-May-10 13:36 UTC
Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
On Wed, May 08, 2013 at 09:45:14AM +0100, George Dunlap wrote:
> On Tue, May 7, 2013 at 11:38 AM, Andrew Bobulsky <rulerof@gmail.com> wrote:
> > [...]
> > However, when I try to "xl pci-assignable-add" one of my VGA controllers
> > from the Radeon, the action completes, and "xl pci-assignable-list" shows
> > the device as available, but "xl pci-attach" never completes, and attempting
> > to "xl pci-assignable-add" the HDMI audio device never returns to the CLI
> > either.

OK, and is 0e:00.0 your VGA controller? How do you assign the VGA controller? Do you do:

    echo "0000:0e:00.0" > /sys/../radeon/unbind
    echo "0000:0e:00.0" > /sys/../pciback/new_slot
    echo "0000:0e:00.0" > /sys/../pciback/bind

?

> > There is no visible output in dmesg from the attempt on the HDMI audio
> > device, but when I run it against the VGA controller, I get this:
> >
> >> [55817.715309] pciback 0000:0e:00.0: seizing device
> >> [55817.737444] ------------[ cut here ]------------
> >> [55817.737447] kernel BUG at drivers/pci/msi.c:346!
> >> [...]
> >
> > Any chance anyone could help me identify the source of this problem? Can I
> > work around it with software, or do I need a different motherboard or video
> > card to make it work?
>
> cc'ing Konrad and a couple of other people who might be able to take a
> look at the BUG

That looks to be:

    344 #ifdef CONFIG_GENERIC_HARDIRQS
    345         for (i = 0; i < nvec; i++)
    346                 BUG_ON(irq_has_action(entry->irq + i));
    347 #endif

I have to say I hadn't actually compiled the kernel with GENERIC_HARDIRQS in a while.

Looking at the code, the issue seems to be that the MSI is enabled when the PCI device is assigned to xen-pciback. That looks like a bug in the radeon driver.

And based on your config (thanks!) you could also do this on your Linux command line:

    xen-pciback.hide=(0e:00.0)

If you do that and try to pass in the radeon device does it work?

Thanks!
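The diagnosis above turns on whether MSI is enabled on the device at the moment pciback seizes it. One way to look at that from dom0, sketched here under assumptions, is to read the device's PCI configuration space and walk its capability list: the MSI capability (ID 0x05) keeps its enable flag in bit 0 of the Message Control word, and MSI-X (ID 0x11) keeps it in bit 15. The function names `msi_state` and `read_pci_config` are hypothetical, the sysfs path is the usual `/sys/bus/pci/devices/<BDF>/config` location, and reading beyond the first 64 bytes normally requires root. This is an illustration in Python, not something posted in the thread.

```python
import struct
from typing import Dict, Optional

# Offsets and IDs from the standard PCI configuration header layout.
PCI_STATUS = 0x06            # 16-bit status register
PCI_STATUS_CAP_LIST = 0x10   # status bit 4: capability list present
PCI_CAPABILITY_LIST = 0x34   # pointer to the first capability
PCI_CAP_ID_MSI = 0x05
PCI_CAP_ID_MSIX = 0x11


def msi_state(config: bytes) -> Dict[str, Optional[bool]]:
    """Report whether MSI / MSI-X are enabled.

    A value of None means the capability was not found in the bytes given.
    """
    state: Dict[str, Optional[bool]] = {"msi": None, "msix": None}
    if len(config) < 0x40:
        return state
    (status,) = struct.unpack_from("<H", config, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return state

    ptr = config[PCI_CAPABILITY_LIST] & 0xFC   # low two bits are reserved
    seen = set()                               # guard against pointer loops
    while ptr and ptr not in seen and ptr + 4 <= len(config):
        seen.add(ptr)
        cap_id, next_ptr = config[ptr], config[ptr + 1]
        (control,) = struct.unpack_from("<H", config, ptr + 2)
        if cap_id == PCI_CAP_ID_MSI:
            state["msi"] = bool(control & 0x0001)    # MSI Enable
        elif cap_id == PCI_CAP_ID_MSIX:
            state["msix"] = bool(control & 0x8000)   # MSI-X Enable
        ptr = next_ptr & 0xFC
    return state


def read_pci_config(bdf: str) -> bytes:
    """Read raw config space for a device such as '0000:0e:00.0' (assumed path)."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as handle:
        return handle.read()
```

Running `msi_state(read_pci_config("0000:0e:00.0"))` as root before handing the device to pciback would show whether the graphics driver has left MSI switched on, which is the condition the BUG_ON at msi.c:346 is guarding against.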
Andrew Bobulsky
2013-May-10 15:20 UTC
Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
Hello Konrad,

Thanks for getting back to me!

On May 10, 2013, at 9:36 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Wed, May 08, 2013 at 09:45:14AM +0100, George Dunlap wrote:
>> On Tue, May 7, 2013 at 11:38 AM, Andrew Bobulsky <rulerof@gmail.com> wrote:
>>> [...]
>>> However, when I try to "xl pci-assignable-add" one of my VGA controllers
>>> from the Radeon, the action completes, and "xl pci-assignable-list" shows
>>> the device as available, but "xl pci-attach" never completes, and attempting
>>> to "xl pci-assignable-add" the HDMI audio device never returns to the CLI
>>> either.
>
> OK, and is 0e:00.0 your VGA controller? How do you assign the VGA controller? Do you do:
>
>     echo "0000:0e:00.0" > /sys/../radeon/unbind
>     echo "0000:0e:00.0" > /sys/../pciback/new_slot
>     echo "0000:0e:00.0" > /sys/../pciback/bind
>
> ?

I tried both the sysfs methods suggested by the wiki, as well as trying to hide the device via grub. Pciback is compiled into the kernel (I can't say I verified it, but it's definitely loaded and doesn't show up on lsmod :P). I didn't check my dmesg output when using sysfs, as I just figured I was doing something wrong, since certain objects either didn't exist or wouldn't accept input via the ">" operator. I can't remember precisely, but I'd be willing to go back and find out if you like.

I basically ended up doing "xl pci-assignable-add" to all of the functions I needed. In the case of the 6990, there are four, which for me are 0:d:0.0-1, and 0:e:0.0-1. Two VGA controllers and their respective HDMI audio devices. There are two of these cards in the system, a total of 8 functions: four VGA controllers and four HDMI audio devices. The four listed here are on my "unused" card. The AMD CCC does detect all four devices, so I assume that the driver is bound to all of them by the time I get around to running my xl commands.

>>> There is no visible output in dmesg from the attempt on the HDMI audio
>>> device, but when I run it against the VGA controller, I get this:
>>>
>>>> [55817.715309] pciback 0000:0e:00.0: seizing device
>>>> [55817.737444] ------------[ cut here ]------------
>>>> [55817.737447] kernel BUG at drivers/pci/msi.c:346!
>>>> [...]
>>>
>>> Any chance anyone could help me identify the source of this problem? Can I
>>> work around it with software, or do I need a different motherboard or video
>>> card to make it work?
cc''ing Konrad and a couple of other people who might be able to take a look at the BUG That looks to be: 344 #ifdef CONFIG_GENERIC_HARDIRQS 345 for (i = 0; i < nvec; i++) 346 BUG_ON(irq_has_action(entry->irq + i)); 347 #endif I have to say I hadn''t actually compiled the kernel with GENERIC_HARDIRQS in a while. Looking at the code the issue seems that the MSI is enabled when the PCI device was assigned to xen-pciback. That looks like a bug in the radeon driver. And based on our config (thanks!) you could also do this on your Linux command line: xen-pciback.hide=(0e:00.0) If you do that and try to pass in the radeon device does it work? Strangely, this was actually the second or third thing I tried... It''s like my system is ignoring it for some reason. I hid *all* of the devices I wanted to use via grub... But they weren''t hidden. I don''t recall if I tried it with my USB 3 controller, as *it* worked just fine when yanking it from Dom0 via xl. I ended up on kernel 3.8.11 because various other kernels I tried, including 3.4.9 and something from the 3.7.x line both resulted in X failing to start, but only when booting Xen. After playing kernel shuffle for a while, I''ve landed, late last night, on 3.4.44, and it actually resolved my MSI problem! Funny enough, every kernel I''ve tried has been built with the config I posted too, 3.4.44 included! If I was just "doing it wrong" with the 3.8.11 kernel, then the solution of "use 3.4.44 instead" is acceptable for me, but if I''ve truly hit on some kind of kernel bug that bears resolving, I still have the dpkg images that I built of all the kernels in question and can collect more data as needed. I''m rebooting a lot right now anyway. ;) Still, I''m troubled by my inability to hide devices via grub. I just expected it to work and am flabbergasted that it won''t... 
I''ll send a copy of my grub config if you like, but I don''t want to pollute the developers list with an issue that''s probably more appropriate for the users list. :) Thanks so much for your time and help, Konrad! I really, really appreciate it! Best Regards, Andrew Bobulsky Thanks! I''m running Debian 6.0.7 x86_64 with Kernel 3.8.11, compiled with the following config: http://tny.cz/e28f7351 What version of Xen are you using? Thanks, -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
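For readers trying the sysfs route mentioned in this message, here is a minimal sketch of the unbind / new_slot / bind sequence. The commands quoted in the thread elide the middle of each path, so the full paths used below (`/sys/bus/pci/devices/<BDF>/driver/unbind` and `/sys/bus/pci/drivers/pciback/`) are an assumption based on the usual sysfs layout with xen-pciback loaded, not something confirmed here. The function name `pciback_bind` and the `SYSFS` override are hypothetical helpers for illustration only.

```shell
#!/bin/sh
# Sketch only: hand one PCI function to xen-pciback via sysfs (run as root in dom0).
# SYSFS can be overridden to rehearse the steps against a scratch directory.
SYSFS="${SYSFS:-/sys}"

pciback_bind() {
    bdf="$1"                                   # full address, e.g. 0000:0e:00.0
    dev="$SYSFS/bus/pci/devices/$bdf"
    stub="$SYSFS/bus/pci/drivers/pciback"      # assumed driver directory name

    if [ ! -d "$stub" ]; then
        echo "pciback driver directory not found: $stub" >&2
        return 1
    fi

    # Release the function from whichever driver currently owns it, if any.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind" || return 1
    fi

    # Tell pciback about the slot, then bind the function to it.
    echo "$bdf" > "$stub/new_slot" || return 1
    echo "$bdf" > "$stub/bind"
}
```

It would be called as `pciback_bind 0000:0e:00.0`. Note that the address is written domain:bus:device.function, with a colon between bus and device.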
Konrad Rzeszutek Wilk
2013-May-22 20:21 UTC
Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
On Fri, May 10, 2013 at 11:20:28AM -0400, Andrew Bobulsky wrote:
> Hello Konrad,
>
> Thanks for getting back to me!

Sure, sorry for the long response, vacation and other things have delayed my response time.

> > That looks to be:
> >
> > 344 #ifdef CONFIG_GENERIC_HARDIRQS
> > 345         for (i = 0; i < nvec; i++)
> > 346                 BUG_ON(irq_has_action(entry->irq + i));
> > 347 #endif
> >
> > I have to say I hadn't actually compiled the kernel with GENERIC_HARDIRQS in a while.

I take it back, it looks as if my kernel has been compiling with that option.

> > Looking at the code, the issue seems to be that MSI is enabled when the PCI device was assigned to xen-pciback. That looks like a bug in the radeon driver.
> >
> > And based on your config (thanks!) you could also do this on your Linux command line:
> >
> > xen-pciback.hide=(0e:00.0)
> >
> > If you do that and try to pass in the radeon device does it work?
>
> Strangely, this was actually the second or third thing I tried... It's like my system is ignoring it for some reason. I hid *all* of the devices I wanted to use via grub... But they weren't hidden. I don't recall if I tried it with my USB 3 controller, as *it* worked just fine when yanking it from Dom0 via xl.

What do you mean by hidden? Are you thinking that by doing 'lspci' you wouldn't see them? They will be visible in your dom0. It is just that the driver (radeon) won't bind to them. You will see something like this (look for pciback):

01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
        Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
        Flags: fast devsel, IRQ 16
        Memory at fe4a0000 (32-bit, non-prefetchable) [disabled] [size=128K]
        Memory at fe480000 (32-bit, non-prefetchable) [disabled] [size=128K]
        I/O ports at e020 [disabled] [size=32]
        Expansion ROM at fe460000 [disabled] [size=128K]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-15-17-ff-ff-8f-18-a2
======> Kernel driver in use: pciback <===================
        Kernel modules: e1000e

> I ended up on kernel 3.8.11 because various other kernels I tried, including 3.4.9 and something from the 3.7.x line both resulted in X failing to start, but only when booting Xen. After playing kernel shuffle

X failing to start? What card at that point are you using? Is it the radeon or another? How does it fail?

If you boot with 'drm.debug=255 debug loglevel=8' on your command line what does dmesg (or /var/log/messages) show? Can you attach that please.

Perhaps you can also attach the lspci so I can get an idea of what is what in your box.

> for a while, I've landed, late last night, on 3.4.44, and it actually resolved my MSI problem!
>
> Funny enough, every kernel I've tried has been built with the config I posted too, 3.4.44 included! If I was just "doing it wrong" with the 3.8.11 kernel, then the solution of "use 3.4.44 instead" is acceptable for me, but if I've truly hit on some kind of kernel bug that bears resolving, I still have the dpkg images that I built of all the kernels in question and can collect more data as needed. I'm rebooting a lot right now anyway. ;)
>
> Still, I'm troubled by my inability to hide devices via grub. I just expected it to work and am flabbergasted that it won't... I'll send a copy of my grub config if you like, but I don't want to pollute the developers list with an issue that's probably more appropriate for the users list. :)

cat /proc/cmdline and 'xl info' would help.

> Thanks so much for your time and help, Konrad! I really, really appreciate it!

Sure thing.
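As a companion to the lspci check described above, the bound driver can also be read straight from sysfs. This is a minimal sketch, assuming the usual layout in which `/sys/bus/pci/devices/<BDF>/driver` is a symlink to the owning driver's directory. The function name `pci_driver_of` and the `SYSFS` override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch only: report which kernel driver currently owns a PCI function.
# SYSFS can be overridden to try the logic against a scratch directory.
SYSFS="${SYSFS:-/sys}"

pci_driver_of() {
    link="$SYSFS/bus/pci/devices/$1/driver"    # symlink to the bound driver, if any
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "(none)"
    fi
}
```

For a function that pciback has claimed, `pci_driver_of 0000:0e:00.0` would be expected to print `pciback`. Any other name means another driver still owns the function, and `(none)` means no driver is bound.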
Andrew Bobulsky
2013-May-22 21:32 UTC
Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
Hello Konrad!

On May 22, 2013, at 1:21 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Fri, May 10, 2013 at 11:20:28AM -0400, Andrew Bobulsky wrote:
> > Hello Konrad,
> >
> > Thanks for getting back to me!
>
> Sure, sorry for the long response, vacation and other things have delayed my response time.

In a sheer twist of irony, I'm now the one who is on vacation :D

I'll be returning home on the 28th, and will be able to answer with better detail at that point, but I can give you a few answers in the meantime.

Here goes!
> > Strangely, this was actually the second or third thing I tried... It's like my system is ignoring it for some reason. I hid *all* of the devices I wanted to use via grub... But they weren't hidden. I don't recall if I tried it with my USB 3 controller, as *it* worked just fine when yanking it from Dom0 via xl.
>
> What do you mean by hidden? Are you thinking that by doing 'lspci' you wouldn't see them? They will be visible in your dom0. It is just that the driver (radeon) won't bind to them. You will see something like this (look for pciback):
>
> 01:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
>         Subsystem: Intel Corporation PRO/1000 PT Dual Port Server Adapter
>         Flags: fast devsel, IRQ 16
>         Memory at fe4a0000 (32-bit, non-prefetchable) [disabled] [size=128K]
>         Memory at fe480000 (32-bit, non-prefetchable) [disabled] [size=128K]
>         I/O ports at e020 [disabled] [size=32]
>         Expansion ROM at fe460000 [disabled] [size=128K]
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>         Capabilities: [e0] Express Endpoint, MSI 00
>         Capabilities: [100] Advanced Error Reporting
>         Capabilities: [140] Device Serial Number 00-15-17-ff-ff-8f-18-a2
> ======> Kernel driver in use: pciback <===================
>         Kernel modules: e1000e

When I say that they weren't hidden, I mean that, while I didn't check with lspci -v, running xl pci-assignable-list shows that nothing is available to pass to a DomU. I have to xl pci-assignable-add every device I want to use regardless of whether or not it is hidden via my GRUB command line. Additionally, the Radeon driver was bound to the card I hid via grub, and its HDMI audio device was also bound to snd-hda-intel (I think) in spite of it also being in the grub command line.

xl itself mentions that it is seizing devices from other drivers in order to bind them to pciback.

> > I ended up on kernel 3.8.11 because various other kernels I tried, including 3.4.9 and something from the 3.7.x line both resulted in X failing to start, but only when booting Xen. After playing kernel shuffle
>
> X failing to start?

Blinking (or solid, I can't recall precisely) underscore-type cursor in the upper left corner of the screen, and it appears to be at a BIOS-like text-mode resolution.

> What card at that point are you using? Is it the radeon or another? How does it fail?

I tried with the Radeon 6990, and also with a Radeon 5850. I tried with and without the Radeon driver, too; it didn't seem to make much difference :(

> If you boot with 'drm.debug=255 debug loglevel=8' on your command line what does dmesg (or /var/log/messages) show? Can you attach that please.

I usually rebooted the system, so my dmesg was out of date by the time I got my hands on it. I could try to SSH in, which should work as the system responds to Ctrl-Alt-Del by rebooting after a few seconds, or grab a historical log.... I'll report back next week :)

> Perhaps you can also attach the lspci so I can get an idea of what is what in your box.

The best I have access to is an lspci -vtQ that I ran a while back:
http://pastebin.com/raw.php?i=4dGmneYi

> > for a while, I've landed, late last night, on 3.4.44, and it actually resolved my MSI problem!
> >
> > Funny enough, every kernel I've tried has been built with the config I posted too, 3.4.44 included! If I was just "doing it wrong" with the 3.8.11 kernel, then the solution of "use 3.4.44 instead" is acceptable for me, but if I've truly hit on some kind of kernel bug that bears resolving, I still have the dpkg images that I built of all the kernels in question and can collect more data as needed. I'm rebooting a lot right now anyway. ;)
> >
> > Still, I'm troubled by my inability to hide devices via grub. I just expected it to work and am flabbergasted that it won't... I'll send a copy of my grub config if you like, but I don't want to pollute the developers list with an issue that's probably more appropriate for the users list. :)
>
> cat /proc/cmdline and 'xl info' would help.
>
> > Thanks so much for your time and help, Konrad! I really, really appreciate it!
>
> Sure thing.

Cheers,
Andrew
Konrad Rzeszutek Wilk
2013-May-28 18:40 UTC
Is: radeon 6990 with dom0 is not showing graphics. Was: Re: Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?
> When I say that they weren't hidden, while I didn't check with lspci -v,
> running xl pci-assignable-list shows that nothing is available to pass to a
> DomU. I have to xl pci-assignable-add every device I want to use
> regardless of whether or not it is hidden via my GRUB command line.
> Additionally, the Radeon driver was bound to the card I hid via grub, and
> its HDMI audio device was also bound to snd-hda-intel (I think) in spite of
> it also being in the grub command line.
>
> xl itself mentions that it is seizing devices from other drivers in order
> to bind them to pciback.

Ah, OK. That sounds good.

> > I ended up on kernel 3.8.11 because various other kernels I tried,
> > including 3.4.9 and something from the 3.7.x line both resulted in X
> > failing to start, but only when booting Xen. After playing kernel shuffle
>
> X failing to start?
>
> Blinking (or solid, I can't recall precisely) underscore-type cursor in the
> upper left corner of the screen, and it appears to be at a BIOS-like
> text-mode resolution.
>
> What card at that point are you using? Is the radeon
> or another? How does it fail?
>
> I tried with the Radeon 6990, and also with a Radeon 5850. I tried with
> and without the Radeon driver, too, it didn't seem to make much difference
> :(
>
> If you boot with 'drm.debug=255 debug loglevel=8' on your
> command line what does dmesg (or /var/log/messages) show? Can you attach
> that please.
>
> I usually rebooted the system so my dmesg was out of date by the time I got
> my hands on it. I could try to SSH in, which should work as the system
> responds to Ctrl-Alt-Del by rebooting after a few seconds, or grab a
> historical log.... I'll report back next week :)

OK, let's track this as a separate issue. If you use 'radeon.modeset=0' on
the Linux command line, do you see anything?
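On the earlier point about the Radeon driver and snd-hda-intel still being bound to devices that were supposed to be hidden: a small sketch of how the current binding could be checked through sysfs. The helper name bound_driver and the optional sysfs-root argument are illustrative assumptions, and the addresses in the usage note are only examples taken from the log earlier in the thread.

```sh
#!/bin/sh
# bound_driver BDF [SYSFS_ROOT]
# Print the name of the driver currently bound to a PCI device, or
# "(none)" if no driver is bound. BDF is the full address, for example
# 0000:0e:00.0. SYSFS_ROOT defaults to /sys and exists only so the
# function can be pointed at a test tree.
bound_driver() {
    bdf="$1"
    sysfs_root="${2:-/sys}"
    link="$sysfs_root/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink whose last path component
        # is the driver name (e.g. pciback, radeon, snd_hda_intel).
        basename "$(readlink "$link")"
    else
        echo "(none)"
    fi
}
```

For example, 'bound_driver 0000:0e:00.0' run right after boot would be expected to print pciback if hiding worked, and something like radeon or fglrx if the graphics driver got there first.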