Tobias Geiger
2012-Jul-25 12:30 UTC
Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
Hi!

I notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock stable):

1st: only the GPU PCI passthrough works; the PCI USB controller is not recognized within the DomU (HVM Win7 64).
Dom0 cmdline is:
ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) security=apparmor noirqdebug nouveau.msi=1

Only 08:00.0 and 08:00.1 get passed through without problems; all the USB controller IDs are not passed through correctly and get an exclamation mark in the Win7 device manager ("could not be started").

2nd: After DomU shutdown, Dom0 panics (100% reproducible). Sorry that I have no full stack trace; all I have is a "screenshot", which I uploaded here:
http://imageshack.us/photo/my-images/52/img20120724235921.jpg/

With 3.4 neither issue was there; everything worked perfectly.
Tell me which debugging info you need. I may be able to re-install my netconsole to get the full stack trace (but I have not had much luck with netconsole regarding kernel panics; this info rarely gets sent before the "panic"...).

Greetings
Tobias
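To compare the devices hidden on the Dom0 command line with what the guest later receives, the xen-pciback.hide= list can be pulled apart mechanically. The following is a minimal sketch, not part of the original report; the helper name parse_pciback_hide is hypothetical, and the only input is the command line quoted above.

```python
import re


def parse_pciback_hide(cmdline):
    """Return the PCI addresses listed in xen-pciback.hide=, in order.

    Hypothetical helper for illustration; it only parses the string and
    does not query the running system.
    """
    match = re.search(r"xen-pciback\.hide=(\S+)", cmdline)
    if match is None:
        return []
    # Each device is written as "(bus:slot.func)".
    return re.findall(r"\(([0-9a-fA-F:.]+)\)", match.group(1))


# Dom0 command line as quoted in the report.
cmdline = (
    "ro root=LABEL=dom0root "
    "xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) "
    "security=apparmor noirqdebug nouveau.msi=1"
)
hidden = parse_pciback_hide(cmdline)
print(hidden)
```

The resulting list can then be checked against the device column of an "xl pci-list" listing for the guest.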
Konrad Rzeszutek Wilk
2012-Jul-25 13:43 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> not recognized within the DomU (HVM Win7 64)
> Dom0 cmdline is:
> ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> security=apparmor noirqdebug nouveau.msi=1
>
> Only 8:00.0 and 8:00.1 get passed through without problems, all the
> USB Controller IDs are not correctly passed through and get a
> exclamation mark within the win7 device manager ("could not be
> started").

Ok, but they do get passed in though? As in, QEMU sees them.
If you boot a Live Ubuntu/Fedora CD within the guest with the PCI passed in devices do you see them? Meaning lspci shows them?

Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?

> 2nd: After DomU shutdown , Dom0 panics (100% reproducable) - sorry
> that i have no full stacktrace, all i have is a "screenshot" which i
> uploaded here:
> http://imageshack.us/photo/my-images/52/img20120724235921.jpg/

Ugh, that looks like somebody removed a large chunk of a pagetable.

Hmm. Are you using dom0_mem=max parameter? If not, can you try that and also disable ballooning in the xm/xl config file pls?
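One way to answer the lspci question is to save the "lspci -vvv" output under each kernel and diff the two files. Below is a minimal sketch using Python's difflib; the function name lspci_diff and the two sample strings are invented for illustration and are not output from the reporter's machine.

```python
import difflib


def lspci_diff(old_text, new_text, old_label="3.4", new_label="3.5"):
    """Return unified-diff lines between two saved lspci -vvv outputs."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Invented sample input; real input would be the two saved outputs.
old_sample = "00:1d.0 USB controller\n\tControl: I/O+ Mem+ BusMaster+\n"
new_sample = "00:1d.0 USB controller\n\tControl: I/O+ Mem+ BusMaster-\n"

diff_lines = lspci_diff(old_sample, new_sample)
for line in diff_lines:
    print(line)
```

In practice the two strings would be read from files captured once under 3.4 and once under 3.5 on the same hardware.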
Tobias Geiger
2012-Jul-25 14:20 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On 25.07.2012 15:43, Konrad Rzeszutek Wilk wrote:
> Ok, but they do get passed in though? As in, QEMU sees them.
> If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> passed in devices do you see them? Meaning lspci shows them?

Yes, they get passed through:

pc:~# xl pci-list win
Vdev Device
05.0 0000:08:00.0
06.0 0000:08:00.1
07.0 0000:00:1d.0
08.0 0000:00:1d.1
09.0 0000:00:1d.2
0a.0 0000:00:1d.7

but *:1d.* gets an exclamation mark within Win7...

Sorry, I have no Linux HVM at hand right now to do an lspci.

> Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
>
> Ugh, that looks like somebody removed a large chunk of a pagetable.
>
> Hmm. Are you using dom0_mem=max parameter? If not, can you try
> that and also disable ballooning in the xm/xl config file pls?

I already have/had:
xen_commandline : watchdog dom0_mem=4096M,max:7680M dom0_vcpus_pin

but autoballooning was on in xl.conf. I disabled it, but I still get a panic as soon as the DomU is shut down (luckily I happened to press "enter" on the dmesg command at exactly the right time to get the full stack trace just before my SSH connection died...):

pc:~# dmesg
[ 206.553547] xen-blkback:backend/vbd/1/832: prepare for reconnect
[ 207.421690] xen-blkback:backend/vbd/1/768: prepare for reconnect
[ 208.248271] vif vif-1-0: 2 reading script
[ 208.252882] br0: port 3(vif1.0) entered disabled state
[ 208.253584] br0: port 3(vif1.0) entered disabled state
[ 213.115052] ------------[ cut here ]------------
[ 213.115071] kernel BUG at drivers/xen/balloon.c:359!
[ 213.115079] invalid opcode: 0000 [#1] PREEMPT SMP
[ 213.115091] CPU 4
[ 213.115094] Modules linked in: uvcvideo snd_seq_midi snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev joydev hid_generic gpio_ich [last unloaded: scsi_wait_scan]
[ 213.115124]
[ 213.115126] Pid: 1191, comm: kworker/4:1 Not tainted 3.5.0 #2 /DX58SO
[ 213.115135] RIP: e030:[<ffffffff81448105>] [<ffffffff81448105>] balloon_process+0x385/0x3a0
[ 213.115146] RSP: e02b:ffff88012e7f7dc0 EFLAGS: 00010213
[ 213.115150] RAX: 0000000220be8000 RBX: 0000000000000000 RCX: 0000000000000008
[ 213.115158] RDX: ffff88010bb02000 RSI: 00000000000001cb RDI: 000000000020efcb
[ 213.115164] RBP: ffff88012e7f7e20 R08: ffff88014068e140 R09: 0000000000000001
[ 213.115169] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
[ 213.115175] R13: 0000000000000001 R14: 000000000020efcb R15: ffffea00083bf2c0
[ 213.115183] FS: 00007f31ea7f7700(0000) GS:ffff880140680000(0000) knlGS:0000000000000000
[ 213.115189] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 213.115193] CR2: 00007f31ea193986 CR3: 0000000001e0c000 CR4: 0000000000002660
[ 213.115199] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 213.115204] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 213.115210] Process kworker/4:1 (pid: 1191, threadinfo ffff88012e7f6000, task ffff88012ec65b00)
[ 213.115216] Stack:
[ 213.115218] 000000000008a6ba 0000000000000001 ffffffff8200ea80 0000000000000001
[ 213.115331] 0000000000000000 0000000000007ff0 ffff88012e7f7e00 ffff8801312fb100
[ 213.115341] ffff880140697000 ffff88014068e140 0000000000000000 ffffffff81e587c0
[ 213.115350] Call Trace:
[ 213.115356] [<ffffffff8106752b>] process_one_work+0x12b/0x450
[ 213.115362] [<ffffffff81447d80>] ? decrease_reservation+0x320/0x320
[ 213.115368] [<ffffffff810688ae>] worker_thread+0x12e/0x2d0
[ 213.115374] [<ffffffff81068780>] ? manage_workers.isra.26+0x1f0/0x1f0
[ 213.115380] [<ffffffff8106db6e>] kthread+0x8e/0xa0
[ 213.115386] [<ffffffff8184e324>] kernel_thread_helper+0x4/0x10
[ 213.115394] [<ffffffff8184c7bc>] ? retint_restore_args+0x5/0x6
[ 213.115400] [<ffffffff8184e320>] ? gs_change+0x13/0x13
[ 213.115406] Code: 01 15 80 69 bc 00 48 29 d0 48 89 05 7e 69 bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 35 33 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
[ 213.115509] RIP [<ffffffff81448105>] balloon_process+0x385/0x3a0
[ 213.115521] RSP <ffff88012e7f7dc0>
[ 213.126036] ---[ end trace 38b78364333593e7 ]---
[ 213.126061] BUG: unable to handle kernel paging request at fffffffffffffff8
[ 213.126072] IP: [<ffffffff8106e07c>] kthread_data+0xc/0x20
[ 213.126079] PGD 1e0e067 PUD 1e0f067 PMD 0
[ 213.126087] Oops: 0000 [#2] PREEMPT SMP
[ 213.126094] CPU 4
[ 213.126097] Modules linked in: uvcvideo snd_seq_midi snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev joydev hid_generic gpio_ich [last unloaded: scsi_wait_scan]
[ 213.126151]
[ 213.126154] Pid: 1191, comm: kworker/4:1 Tainted: G D 3.5.0 #2 /DX58SO
[ 213.126175] RIP: e030:[<ffffffff8106e07c>] [<ffffffff8106e07c>] kthread_data+0xc/0x20
[ 213.126192] RSP: e02b:ffff88012e7f7a90 EFLAGS: 00010092
[ 213.126203] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000004
[ 213.126212] RDX: ffffffff81fcba40 RSI: 0000000000000004 RDI: ffff88012ec65b00
[ 213.126225] RBP: ffff88012e7f7aa8 R08: 0000000000989680 R09: ffffffff81fcba40
[ 213.126239] R10: ffffffff813b0d60 R11: 0000000000000000 R12: ffff8801406936c0
[ 213.126254] R13: 0000000000000004 R14: ffff88012ec65af0 R15: ffff88012ec65b00
[ 213.126270] FS: 00007f31ea7f7700(0000) GS:ffff880140680000(0000) knlGS:0000000000000000
[ 213.126284] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 213.126296] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
[ 213.126310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 213.126325] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 213.126337] Process kworker/4:1 (pid: 1191, threadinfo ffff88012e7f6000, task ffff88012ec65b00)
[ 213.126354] Stack:
[ 213.126360] ffffffff810698d0 ffff88012e7f7aa8 ffff88012ec65ed8 ffff88012e7f7b18
[ 213.126381] ffffffff8184ad32 ffff88012e7f7fd8 ffff88012ec65b00 ffff88012e7f7fd8
[ 213.126403] ffff88012e7f7fd8 ffff8801312f94e0 ffff88012ec65b00 ffff88012ec660f0
[ 213.126422] Call Trace:
[ 213.126427] [<ffffffff810698d0>] ? wq_worker_sleeping+0x10/0xa0
[ 213.126435] [<ffffffff8184ad32>] __schedule+0x592/0x7d0
[ 213.126443] [<ffffffff8184b094>] schedule+0x24/0x70
[ 213.126449] [<ffffffff81051582>] do_exit+0x5b2/0x910
[ 213.126457] [<ffffffff8183e941>] ? printk+0x48/0x4a
[ 213.126464] [<ffffffff8100ad02>] ? check_events+0x12/0x20
[ 213.126472] [<ffffffff810175a1>] oops_end+0x71/0xa0
[ 213.126478] [<ffffffff81017713>] die+0x53/0x80
[ 213.126484] [<ffffffff81014418>] do_trap+0xb8/0x160
[ 213.126490] [<ffffffff81014713>] do_invalid_op+0xa3/0xb0
[ 213.126499] [<ffffffff81448105>] ? balloon_process+0x385/0x3a0
[ 213.127254] [<ffffffff81085f52>] ? load_balance+0xd2/0x800
[ 213.127940] [<ffffffff8108116d>] ? cpuacct_charge+0x6d/0xb0
[ 213.128621] [<ffffffff8184e19b>] invalid_op+0x1b/0x20
[ 213.129304] [<ffffffff81448105>] ? balloon_process+0x385/0x3a0
[ 213.129962] [<ffffffff8106752b>] process_one_work+0x12b/0x450
[ 213.130590] [<ffffffff81447d80>] ? decrease_reservation+0x320/0x320
[ 213.131226] [<ffffffff810688ae>] worker_thread+0x12e/0x2d0
[ 213.131856] [<ffffffff81068780>] ? manage_workers.isra.26+0x1f0/0x1f0
[ 213.132482] [<ffffffff8106db6e>] kthread+0x8e/0xa0
[ 213.133099] [<ffffffff8184e324>] kernel_thread_helper+0x4/0x10
[ 213.133718] [<ffffffff8184c7bc>] ? retint_restore_args+0x5/0x6
[ 213.134338] [<ffffffff8184e320>] ? gs_change+0x13/0x13
[ 213.134954] Code: e0 ff ff 01 48 8b 80 38 e0 ff ff a8 08 0f 84 3d ff ff ff e8 97 cf 7d 00 e9 33 ff ff ff 66 90 48 8b 87 80 03 00 00 55 48 89 e5 5d <48> 8b 40 f8 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55
[ 213.135647] RIP [<ffffffff8106e07c>] kthread_data+0xc/0x20
[ 213.136320] RSP <ffff88012e7f7a90>
[ 213.136967] CR2: fffffffffffffff8
[ 213.137610] ---[ end trace 38b78364333593e8 ]---
[ 213.137611] Fixing recursive fault but reboot is needed!

Seems like a ballooning thing. I will try with only a "max" setting, not a range ... stay tuned ;)
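When a trace this long is pasted into a report, it can help to pull out the two facts that identify the crash: the "kernel BUG at" location and the function named on the RIP line. The sketch below does that with two regular expressions; the helper name summarize_oops is hypothetical, and the two sample lines are copied from the dmesg output above.

```python
import re


def summarize_oops(dmesg_text):
    """Extract the BUG location and faulting function from an oops dump.

    Hypothetical helper; it reports only the first BUG in the text.
    """
    bug = re.search(r"kernel BUG at (\S+):(\d+)!", dmesg_text)
    # Match the summary line "RIP [<addr>] symbol+0xoff/0xlen",
    # not the register line that starts with "RIP:".
    rip = re.search(r"RIP\s+\[<[0-9a-f]+>\]\s+(\w+)\+0x", dmesg_text)
    return {
        "file": bug.group(1) if bug else None,
        "line": int(bug.group(2)) if bug else None,
        "function": rip.group(1) if rip else None,
    }


# Two lines copied from the trace in this message.
sample = (
    "[ 213.115071] kernel BUG at drivers/xen/balloon.c:359!\n"
    "[ 213.115509] RIP [<ffffffff81448105>] balloon_process+0x385/0x3a0\n"
)
summary = summarize_oops(sample)
print(summary)
```

For this trace the summary points at balloon_process in drivers/xen/balloon.c, which matches the reporter's reading that the crash is related to ballooning.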
Tobias Geiger
2012-Jul-25 14:32 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
It will take some time for me to re-test with "dom0_mem=4096M" (i.e. w/o a "max" range), because I forgot a "panic=X" option on the Dom0 cmdline, so right now the machine is waiting for me to press the reset button ... :(

I'll post my results ASAP.

Greetings

On 25.07.2012 16:20, Tobias Geiger wrote:
> seems like a ballooning thing - i will try to with only a "max"
> setting, not a range ... stay tuned ;)
Tobias Geiger
2012-Jul-25 17:59 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
The dom0 panic when shutting down the DomU also happens with

dom0_mem=4096M
and also with
dom0_mem=4096M,max:4096M

and both times:
pc:~# cat /etc/xen/xl.conf | grep autoballoon
autoballoon=0

:(

FWIW here is the diff between the dom0 kernel 3.4 and 3.5 configs:

pc:~# diff /tmp/3.4.config /usr/src/3.5/linux-3.5/.config
3c3
< # Linux/x86_64 3.4.0 Kernel Configuration
---
> # Linux/x86_64 3.5.0 Kernel Configuration
12,16d11
< CONFIG_GENERIC_CMOS_UPDATE=y
< CONFIG_CLOCKSOURCE_WATCHDOG=y
< CONFIG_GENERIC_CLOCKEVENTS=y
< CONFIG_ARCH_CLOCKSOURCE_DATA=y
< CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
31d25
< CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
33d26
< CONFIG_GENERIC_TIME_VSYSCALL=y
51d43
< # CONFIG_KTIME_SCALAR is not set
52a45
> CONFIG_ARCH_SUPPORTS_UPROBES=y
55a49
> CONFIG_BUILDTIME_EXTABLE_SORT=y
104a99,113
> CONFIG_CLOCKSOURCE_WATCHDOG=y
> CONFIG_ARCH_CLOCKSOURCE_DATA=y
> CONFIG_GENERIC_TIME_VSYSCALL=y
> CONFIG_GENERIC_CLOCKEVENTS=y
> CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
> CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
> CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
> CONFIG_GENERIC_CMOS_UPDATE=y
>
> #
> # Timers subsystem
> #
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ=y
> CONFIG_HIGH_RES_TIMERS=y
111a121
> CONFIG_RCU_FANOUT_LEAF=16
145d154
< CONFIG_USER_NS=y
188d196
< # CONFIG_PERF_COUNTERS is not set
210a219
> CONFIG_GENERIC_SMP_IDLE_THREAD=y
222a232,233
> CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
> CONFIG_SECCOMP_FILTER=y
322,326d332
< CONFIG_TICK_ONESHOT=y
< CONFIG_NO_HZ=y
< CONFIG_HIGH_RES_TIMERS=y
< CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
< CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
444a451
> CONFIG_CROSS_MEMORY_ATTACH=y
445a453
> CONFIG_FRONTSWAP=y
486a495,496
> # CONFIG_PM_AUTOSLEEP is not set
> # CONFIG_PM_WAKELOCKS is not set
504c514
< CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
---
> # CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
540,543c550,553
< CONFIG_X86_PCC_CPUFREQ=m
< CONFIG_X86_ACPI_CPUFREQ=m
< CONFIG_X86_POWERNOW_K8=m
< CONFIG_X86_SPEEDSTEP_CENTRINO=m
---
> # CONFIG_X86_PCC_CPUFREQ is not set
> # CONFIG_X86_ACPI_CPUFREQ is not set
> # CONFIG_X86_POWERNOW_K8 is not set
> # CONFIG_X86_SPEEDSTEP_CENTRINO is not set
621a632
> CONFIG_X86_DEV_DMA_OPS=y
630a642
> CONFIG_XFRM_ALGO=y
773a786
> CONFIG_NETFILTER_XT_TARGET_HMARK=m
891d903
< CONFIG_IP6_NF_QUEUE=m
952d963
< # CONFIG_ECONET is not set
977a989,990
> CONFIG_NET_SCH_CODEL=m
> CONFIG_NET_SCH_FQ_CODEL=m
1018a1032
> CONFIG_BATMAN_ADV_BLA=y
1026d1039
< CONFIG_HAVE_BPF_JIT=y
1082a1096,1097
> CONFIG_NFC_HCI=m
> # CONFIG_NFC_SHDLC is not set
1090a1106
> CONFIG_HAVE_BPF_JIT=y
1253d1268
< # CONFIG_MTD_UBI_DEBUG is not set
1331c1346,1347
< CONFIG_BMP085=m
---
> CONFIG_BMP085=y
> CONFIG_BMP085_I2C=m
1358a1375
> CONFIG_INTEL_MEI=m
1535a1553
> CONFIG_SBP_TARGET=m
1558a1577
> CONFIG_NET_TEAM_MODE_LOADBALANCE=m
1695c1714
< CONFIG_STMMAC_PLATFORM=m
---
> CONFIG_STMMAC_PLATFORM=y
1712a1732,1734
> CONFIG_NET_VENDOR_WIZNET=y
> # CONFIG_WIZNET_W5100 is not set
> # CONFIG_WIZNET_W5300 is not set
1753d1774
< # CONFIG_TR is not set
1809a1831
> CONFIG_INPUT_MATRIXKMAP=m
1837a1860
> CONFIG_KEYBOARD_LM8333=m
1900a1924
> # CONFIG_INPUT_MC13783_PWRBUTTON is not set
2015d2038
< CONFIG_RAMOOPS=m
2139a2163
> CONFIG_GPIO_ICH=m
2321a2346
> CONFIG_SENSORS_INA2XX=m
2340a2366
> CONFIG_SENSORS_MC13783_ADC=m
2370a2397
> CONFIG_IE6XX_WDT=m
2421c2448
< CONFIG_MFD_CORE=m
---
> CONFIG_MFD_CORE=y
2427a2455
> CONFIG_MFD_LM3533=m
2442a2471
> # CONFIG_MFD_MAX77693 is not set
2447c2476
< CONFIG_MFD_WM8400=m
---
> # CONFIG_MFD_WM8400 is not set
2453a2483,2485
> CONFIG_MFD_MC13783=m
> CONFIG_MFD_MC13XXX=m
> CONFIG_MFD_MC13XXX_I2C=m
2457a2490
> CONFIG_LPC_ICH=y
2464a2498
> # CONFIG_MFD_PALMAS is not set
2472a2507,2509
> CONFIG_REGULATOR_MC13XXX_CORE=m
> CONFIG_REGULATOR_MC13783=m
> CONFIG_REGULATOR_MC13892=m
2485d2521
< CONFIG_REGULATOR_WM8400=m
2683c2719
< CONFIG_VIDEO_EM28XX_RC=y
---
> CONFIG_VIDEO_EM28XX_RC=m
2690d2725
< CONFIG_USB_ET61X251=m
2782a2818,2820
> CONFIG_DRM_AST=m
> # CONFIG_DRM_MGAG200 is not set
> CONFIG_DRM_CIRRUS_QEMU=m
2853a2892,2894
> CONFIG_FB_AUO_K190X=m
> CONFIG_FB_AUO_K1900=m
> CONFIG_FB_AUO_K1901=m
2859a2901
> CONFIG_BACKLIGHT_LM3533=m
2986d3027
< # CONFIG_SND_HDA_ENABLE_REALTEK_QUIRKS is not set
3050a3092
> CONFIG_SND_SOC_CS42L52=m
3058a3101
> CONFIG_SND_SOC_LM49453=m
3079d3121
< CONFIG_SND_SOC_WM8400=m
3117a3160,3161
> CONFIG_SND_SOC_MC13783=m
> CONFIG_SND_SOC_ML26124=m
3118a3163
> CONFIG_SND_SIMPLE_CARD=m
3137,3140d3181
< CONFIG_HID_SUPPORT=y
< CONFIG_HID=y
< # CONFIG_HID_BATTERY_STRENGTH is not set
< # CONFIG_HIDRAW is not set
3143c3184
< # USB Input Devices
---
> # HID support
3145,3147c3186,3189
< CONFIG_USB_HID=y
< # CONFIG_HID_PID is not set
< # CONFIG_USB_HIDDEV is not set
---
> CONFIG_HID=y
> # CONFIG_HID_BATTERY_STRENGTH is not set
> # CONFIG_HIDRAW is not set
> CONFIG_HID_GENERIC=m
3155a3198
> CONFIG_HID_AUREAL=m
3216a3260,3266
>
> #
> # USB HID support
> #
> CONFIG_USB_HID=y
> # CONFIG_HID_PID is not set
> # CONFIG_USB_HIDDEV is not set
3230,3231d3279
< # CONFIG_USB_DEVICEFS is not set
< # CONFIG_USB_DEVICE_CLASS is not set
3269a3318,3321
> CONFIG_USB_CHIPIDEA=m
> # CONFIG_USB_CHIPIDEA_UDC is not set
> # CONFIG_USB_CHIPIDEA_HOST is not set
> # CONFIG_USB_CHIPIDEA_DEBUG is not set
3382a3435
> CONFIG_USB_SERIAL_QT2=m
3408a3462,3466
>
> #
> # USB Physical Layer drivers
> #
> CONFIG_USB_ISP1301=m
3414a3473,3476
>
> #
> # USB Peripheral Controller
> #
3419d3480
< CONFIG_USB_CI13XXX_PCI=m
3443a3505
> CONFIG_USB_GADGET_TARGET=m
3500a3563
> CONFIG_LEDS_LM3533=m
3514a3578
> CONFIG_LEDS_MC13783=m
3530a3595
> CONFIG_LEDS_TRIGGER_TRANSIENT=m
3618a3684
> CONFIG_RTC_DRV_MC13XXX=m
3654a3721
> # CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set
3699,3700d3765
< # CONFIG_USB_SERIAL_QUATECH_USB2 is not set
< # CONFIG_VME_BUS is not set
3702d3766
< # CONFIG_IIO is not set
3737d3800
< CONFIG_INTEL_MEI=m
3760a3824,3833
> CONFIG_IPACK_BUS=m
> CONFIG_BOARD_TPCI200=m
> CONFIG_SERIAL_IPOCTAL=m
> CONFIG_WIMAX_GDM72XX=m
> # CONFIG_WIMAX_GDM72XX_QOS is not set
> # CONFIG_WIMAX_GDM72XX_K_MODE is not set
> # CONFIG_WIMAX_GDM72XX_WIMAX2 is not set
> CONFIG_WIMAX_GDM72XX_USB=y
> # CONFIG_WIMAX_GDM72XX_SDIO is not set
> # CONFIG_WIMAX_GDM72XX_USB_PM is not set
3787c3860
< CONFIG_INTEL_MENLOW=m
---
> # CONFIG_INTEL_MENLOW is not set
3843a3917,3925
> CONFIG_EXTCON=m
>
> #
> # Extcon Device Drivers
> #
> CONFIG_EXTCON_GPIO=m
> # CONFIG_MEMORY is not set
> # CONFIG_IIO is not set
> # CONFIG_VME_BUS is not set
3989d4070
< # CONFIG_UBIFS_FS_XATTR is not set
3993d4073
< # CONFIG_UBIFS_FS_DEBUG is not set
4015a4096
> CONFIG_NFS_V2=y
4087a4169,4179
> CONFIG_NLS_MAC_ROMAN=m
> CONFIG_NLS_MAC_CELTIC=m
> CONFIG_NLS_MAC_CENTEURO=m
> CONFIG_NLS_MAC_CROATIAN=m
> CONFIG_NLS_MAC_CYRILLIC=m
> CONFIG_NLS_MAC_GAELIC=m
> CONFIG_NLS_MAC_GREEK=m
> CONFIG_NLS_MAC_ICELAND=m
> CONFIG_NLS_MAC_INUIT=m
> CONFIG_NLS_MAC_ROMANIAN=m
> CONFIG_NLS_MAC_TURKISH=m
4102a4195
> # CONFIG_READABLE_ASM is not set
4110a4204,4205
> # CONFIG_PANIC_ON_OOPS is not set
> CONFIG_PANIC_ON_OOPS_VALUE=0
4188a4284,4285
> # CONFIG_UPROBE_EVENT is not set
> # CONFIG_PROBE_EVENTS is not set
4393a4491
> CONFIG_HAVE_KVM_MSI=y
4406a4505,4506
> CONFIG_GENERIC_STRNCPY_FROM_USER=y
> CONFIG_GENERIC_STRNLEN_USER=y
4460a4561
> # CONFIG_DDR is not set

On Wednesday, 25 July 2012, 15:43:57, Konrad Rzeszutek Wilk wrote:
> Hmm. Are you using dom0_mem=max parameter? If not, can you try
> that and also disable ballooning in the xm/xl config file pls?
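A plain line-based diff of two kernel configs also reports options that merely moved between sections (CONFIG_TICK_ONESHOT, for example, shows up as both removed and added above). Comparing by option name filters that noise out. The sketch below is one way to do it; the function names parse_config and changed_options are hypothetical, and the sample fragments reuse a few options from the diff above.

```python
import re


def parse_config(text):
    """Map option name to value; '# CONFIG_X is not set' becomes 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def changed_options(old_text, new_text):
    """Return {name: (old, new)} for options whose value differs.

    None means the option does not appear in that config at all.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Small fragments built from options that appear in the diff above.
old_cfg = "CONFIG_TICK_ONESHOT=y\nCONFIG_X86_ACPI_CPUFREQ=m\nCONFIG_USER_NS=y\n"
new_cfg = "# CONFIG_X86_ACPI_CPUFREQ is not set\nCONFIG_FRONTSWAP=y\nCONFIG_TICK_ONESHOT=y\n"

changes = changed_options(old_cfg, new_cfg)
for name, (before, after) in changes.items():
    print(name, before, "->", after)
```

With real input, the two strings would be the contents of /tmp/3.4.config and the 3.5 .config; options that only changed position, such as CONFIG_TICK_ONESHOT here, drop out of the result.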
Konrad Rzeszutek Wilk
2012-Jul-25 18:09 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On Wed, Jul 25, 2012 at 07:59:33PM +0200, Tobias Geiger wrote:
> The dom0 panic when shutting down domu also happens with
>
> dom0_mem=4096M
> and also with
> dom0_mem=4096M,max:4096M
>
> and both times:
> pc:~# cat /etc/xen/xl.conf | grep autoballoon
> autoballoon=0
>
> :(

OK, so the balloon driver is still being activated somehow.

> > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?

^^^ anything there?
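For anyone repeating the two checks discussed here, a minimal sketch follows. It assumes the contents of /etc/xen/xl.conf and the Xen hypervisor command line have already been read into strings; the helper names `autoballoon_setting` and `dom0_mem_is_capped` are hypothetical and not part of any Xen tool.

```python
import re


def autoballoon_setting(xl_conf_text):
    """Return the raw autoballoon value from xl.conf text, or None if it is not set."""
    for line in xl_conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        match = re.fullmatch(r'autoballoon\s*=\s*"?(\w+)"?', line)
        if match:
            return match.group(1)
    return None


def dom0_mem_is_capped(xen_cmdline):
    """True if the hypervisor command line has a dom0_mem= option with a max: part."""
    prefix = "dom0_mem="
    for token in xen_cmdline.split():
        if token.startswith(prefix):
            return any(part.startswith("max:") for part in token[len(prefix):].split(","))
    return False
```

With the settings quoted above, `autoballoon_setting` would return "0" and `dom0_mem_is_capped` would be true only for the `dom0_mem=4096M,max:4096M` variant.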
Konrad Rzeszutek Wilk
2012-Aug-06 16:16 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > Hi!
> >
> > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > stable):
> >
> > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > not recognized within the DomU (HVM Win7 64)
> > Dom0 cmdline is:
> > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > security=apparmor noirqdebug nouveau.msi=1
> >
> > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > USB Controller IDs are not correctly passed through and get a
> > exclamation mark within the win7 device manager ("could not be
> > started").
>
> Ok, but they do get passed in though? As in, QEMU sees them.
> If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> passed in devices do you see them? Meaning lspci shows them?
>
> Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> >
> > 2nd: After DomU shutdown , Dom0 panics (100% reproducable) - sorry
> > that i have no full stacktrace, all i have is a "screenshot" which i
> > uploaded here:
> > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
>
> Ugh, that looks like somebody removed a large chunk of a pagetable.
>
> Hmm. Are you using dom0_mem=max parameter? If not, can you try
> that and also disable ballooning in the xm/xl config file pls?
> >
> > With 3.4 both issues were not there - everything worked perfectly.
> > Tell me which debugging info you need, i may be able to re-install
> > my netconsole to get the full stacktrace (but i had not much luck
> > with netconsole regarding kernel panics - rarely this info gets sent
> > before the "panic"...)

So I am able to reproduce this with a Windows 7 with an ATI 4870 and
an Intel 82574L NIC. The video card still works, but the NIC stopped
working. Same version of hypervisor/toolstack/etc, only change is the
kernel (v3.4.6->v3.5).

Time to get my hands greasy with this..
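When trying to reproduce this on another machine, it helps to know exactly which devices the dom0 command line hands to xen-pciback. The sketch below pulls the addresses out of a `xen-pciback.hide=` option like the one in the original report. The function name `hidden_devices` is invented for illustration, and filling in PCI domain 0000 when none is given is an assumption.

```python
import re


def hidden_devices(kernel_cmdline):
    """Return the PCI addresses listed in xen-pciback.hide=, as DDDD:BB:DD.F strings."""
    for token in kernel_cmdline.split():
        if token.startswith("xen-pciback.hide="):
            spec = token.split("=", 1)[1]
            devices = []
            for addr in re.findall(r"\(([^)]+)\)", spec):
                if addr.count(":") == 1:  # no PCI domain given; assume 0000
                    addr = "0000:" + addr
                devices.append(addr.lower())
            return devices
    return []
```

The resulting list can then be compared by hand with what pciback actually claimed in dom0 (on most setups the bound devices show up under /sys/bus/pci/drivers/pciback/, though that path is an assumption here) and with what lspci reports inside the guest.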
Konrad Rzeszutek Wilk
2012-Aug-20 23:30 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > Hi!
> > >
> > > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > stable):
> > >
> > > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > > not recognized within the DomU (HVM Win7 64)
> > > Dom0 cmdline is:
> > > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > > security=apparmor noirqdebug nouveau.msi=1
> > >
> > > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > > USB Controller IDs are not correctly passed through and get a
> > > exclamation mark within the win7 device manager ("could not be
> > > started").
> >
> > Ok, but they do get passed in though? As in, QEMU sees them.
> > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > passed in devices do you see them? Meaning lspci shows them?
> >
> > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > >
> > > 2nd: After DomU shutdown , Dom0 panics (100% reproducable) - sorry
> > > that i have no full stacktrace, all i have is a "screenshot" which i
> > > uploaded here:
> > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> >
> > Ugh, that looks like somebody removed a large chunk of a pagetable.
> >
> > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > that and also disable ballooning in the xm/xl config file pls?
> > >
> > > With 3.4 both issues were not there - everything worked perfectly.
> > > Tell me which debugging info you need, i may be able to re-install
> > > my netconsole to get the full stacktrace (but i had not much luck
> > > with netconsole regarding kernel panics - rarely this info gets sent
> > > before the "panic"...)
>
> So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> an Intel 82574L NIC. The video card still works, but the NIC stopped
> working. Same version of hypervisor/toolstack/etc, only change is the
> kernel (v3.4.6->v3.5).
>
> Time to get my hands greasy with this..

And it's due to a patch I added in v3.4
(cd9db80e5257682a7f7ab245a2459648b3c8d268), which did not work properly
in v3.4 but got working in v3.5
(977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
work anymore.

Anyhow, for right now just revert cd9db80e5257682a7f7ab245a2459648b3c8d268
and it should work for you.
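For anyone who scripts their dom0 kernel builds, the suggested workaround could be applied roughly like this. It is only a sketch: the helper names `revert_command` and `apply_revert` are invented, it assumes a git checkout of the v3.5 kernel source is available at the path passed in, and whether the revert applies cleanly on a given tree is not something this thread confirms.

```python
import subprocess

# Commit named in the message above as the one to revert.
COMMIT_TO_REVERT = "cd9db80e5257682a7f7ab245a2459648b3c8d268"


def revert_command(commit, tree):
    """Build the git command line that reverts `commit` in the checkout at `tree`."""
    return ["git", "-C", tree, "revert", "--no-edit", commit]


def apply_revert(tree, commit=COMMIT_TO_REVERT):
    """Run the revert; raises CalledProcessError if git reports a conflict or error."""
    subprocess.run(revert_command(commit, tree), check=True)
```

After `apply_revert("/path/to/linux")` succeeds (the path is a placeholder), the kernel still has to be rebuilt and booted as dom0 before retesting passthrough.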
Ren, Yongjie
2012-Aug-21 02:41 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad Rzeszutek Wilk
> Sent: Tuesday, August 21, 2012 7:30 AM
> To: Tobias Geiger
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
>
> On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > > Hi!
> > > >
> > > > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > > stable):
> > > >
> > > > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > > > not recognized within the DomU (HVM Win7 64)
> > > > Dom0 cmdline is:
> > > > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > > > security=apparmor noirqdebug nouveau.msi=1
> > > >
> > > > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > > > USB Controller IDs are not correctly passed through and get a
> > > > exclamation mark within the win7 device manager ("could not be
> > > > started").
> > >
> > > Ok, but they do get passed in though? As in, QEMU sees them.
> > > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > > passed in devices do you see them? Meaning lspci shows them?
> > >
> > > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > > >
> > > > 2nd: After DomU shutdown , Dom0 panics (100% reproducable) - sorry
> > > > that i have no full stacktrace, all i have is a "screenshot" which i
> > > > uploaded here:
> > > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > >
> > > Ugh, that looks like somebody removed a large chunk of a pagetable.
> > >
> > > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > > that and also disable ballooning in the xm/xl config file pls?
> > > >
> > > > With 3.4 both issues were not there - everything worked perfectly.
> > > > Tell me which debugging info you need, i may be able to re-install
> > > > my netconsole to get the full stacktrace (but i had not much luck
> > > > with netconsole regarding kernel panics - rarely this info gets sent
> > > > before the "panic"...)
> >
> > So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> > an Intel 82574L NIC. The video card still works, but the NIC stopped
> > working. Same version of hypervisor/toolstack/etc, only change is the
> > kernel (v3.4.6->v3.5).
> >
> > Time to get my hands greasy with this..
>
> And it's due to a patch I added in v3.4
> (cd9db80e5257682a7f7ab245a2459648b3c8d268), which did not work properly
> in v3.4 but got working in v3.5
> (977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
> work anymore.
>
> Anyhow, for right now just revert cd9db80e5257682a7f7ab245a2459648b3c8d268
> and it should work for you.

Also, our team reported a VT-d bug 2 months ago.
http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
We found "3bb07f1b73ea6313b843807063e183e168c9182a" is the bad commit in the Linux tree.
Linux 3.4.7 works fine, but Linux 3.5 has this issue.
It seems Tobias has the same issue as the one in the bug.
But we didn't hit a Dom0 panic when shutting down the DomU.
Konrad Rzeszutek Wilk
2012-Aug-21 14:23 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On Tue, Aug 21, 2012 at 02:41:36AM +0000, Ren, Yongjie wrote:
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org
> > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad Rzeszutek Wilk
> > Sent: Tuesday, August 21, 2012 7:30 AM
> > To: Tobias Geiger
> > Cc: xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
> >
> > On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > > > Hi!
> > > > >
> > > > > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > > > stable):
> > > > >
> > > > > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > > > > not recognized within the DomU (HVM Win7 64)
> > > > > Dom0 cmdline is:
> > > > > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > > > > security=apparmor noirqdebug nouveau.msi=1
> > > > >
> > > > > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > > > > USB Controller IDs are not correctly passed through and get a
> > > > > exclamation mark within the win7 device manager ("could not be
> > > > > started").
> > > >
> > > > Ok, but they do get passed in though? As in, QEMU sees them.
> > > > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > > > passed in devices do you see them? Meaning lspci shows them?
> > > >
> > > > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > > > >
> > > > > 2nd: After DomU shutdown , Dom0 panics (100% reproducable) - sorry
> > > > > that i have no full stacktrace, all i have is a "screenshot" which i
> > > > > uploaded here:
> > > > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > > >
> > > > Ugh, that looks like somebody removed a large chunk of a pagetable.
> > > >
> > > > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > > > that and also disable ballooning in the xm/xl config file pls?
> > > > >
> > > > > With 3.4 both issues were not there - everything worked perfectly.
> > > > > Tell me which debugging info you need, i may be able to re-install
> > > > > my netconsole to get the full stacktrace (but i had not much luck
> > > > > with netconsole regarding kernel panics - rarely this info gets sent
> > > > > before the "panic"...)
> > >
> > > So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> > > an Intel 82574L NIC. The video card still works, but the NIC stopped
> > > working. Same version of hypervisor/toolstack/etc, only change is the
> > > kernel (v3.4.6->v3.5).
> > >
> > > Time to get my hands greasy with this..
> >
> > And it's due to a patch I added in v3.4
> > (cd9db80e5257682a7f7ab245a2459648b3c8d268), which did not work properly
> > in v3.4 but got working in v3.5
> > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
> > work anymore.
> >
> > Anyhow, for right now just revert cd9db80e5257682a7f7ab245a2459648b3c8d268
> > and it should work for you.
>
> Also, our team reported a VT-d bug 2 months ago.
> http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
> We found "3bb07f1b73ea6313b843807063e183e168c9182a" is the bad commit in the Linux tree.
> Linux 3.4.7 works fine, but Linux 3.5 has this issue.

Oh, I wish I saw that earlier.

> It seems Tobias has the same issue as the one in the bug.
> But we didn't hit a Dom0 panic when shutting down the DomU.

Neither do I - not sure why he sees that.