Ren, Yongjie
2012-Aug-28 08:25 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
> -----Original Message-----
> From: Konrad Rzeszutek [mailto:ketuzsezr@gmail.com] On Behalf Of
> Konrad Rzeszutek Wilk
> Sent: Tuesday, August 21, 2012 10:23 PM
> To: Ren, Yongjie
> Cc: Konrad Rzeszutek Wilk; Tobias Geiger; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> Passthrough?!
>
> On Tue, Aug 21, 2012 at 02:41:36AM +0000, Ren, Yongjie wrote:
> > > -----Original Message-----
> > > From: xen-devel-bounces@lists.xen.org
> > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad Rzeszutek Wilk
> > > Sent: Tuesday, August 21, 2012 7:30 AM
> > > To: Tobias Geiger
> > > Cc: xen-devel@lists.xen.org
> > > Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> > > Passthrough?!
> > >
> > > On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > > > > Hi!
> > > > > >
> > > > > > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > > > > stable):
> > > > > >
> > > > > > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > > > > > not recognized within the DomU (HVM Win7 64)
> > > > > > Dom0 cmdline is:
> > > > > > ro root=LABEL=dom0root
> > > > > > xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > > > > > security=apparmor noirqdebug nouveau.msi=1
> > > > > >
> > > > > > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > > > > > USB Controller IDs are not correctly passed through and get an
> > > > > > exclamation mark within the win7 device manager ("could not be
> > > > > > started").
> > > > >
> > > > > Ok, but they do get passed in though? As in, QEMU sees them.
> > > > > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > > > > passed in devices do you see them? Meaning lspci shows them?
> > > > >
> > > > > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > > > >
> > > > > > 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> > > > > > that i have no full stacktrace, all i have is a "screenshot" which i
> > > > > > uploaded here:
> > > > > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > > > >
> > > > > Ugh, that looks like somebody removed a large chunk of a pagetable.
> > > > >
> > > > > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > > > > that and also disable ballooning in the xm/xl config file pls?
> > > > >
> > > > > > With 3.4 both issues were not there - everything worked perfectly.
> > > > > > Tell me which debugging info you need, i may be able to re-install
> > > > > > my netconsole to get the full stacktrace (but i had not much luck
> > > > > > with netconsole regarding kernel panics - rarely this info gets sent
> > > > > > before the "panic"...)
> > > >
> > > > So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> > > > an Intel 82574L NIC. The video card still works, but the NIC stopped
> > > > working. Same version of hypervisor/toolstack/etc, only change is the
> > > > kernel (v3.4.6->v3.5).
> > > >
> > > > Time to get my hands greasy with this..
> > >
> > > And it's due to a patch I added in v3.4
> > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> > > - which did not work properly in v3.4, but with v3.5 got it working
> > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to not work
> > > anymore.
> > >
> > > Anyhow, for right now just revert
> > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > > and it should work for you.
> > >

Confirmed, after reverting that commit, VT-d will work fine.
Will you fix this and push it to upstream Linux, Konrad?

> > Also, our team reported a VT-d bug 2 months ago.
> > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
>
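The xen-pciback.hide= parameter in the quoted Dom0 command line lists the devices to hide as parenthesised bus:device.function entries. A minimal Python sketch of how that list can be pulled out of a command line, for instance to compare it against what the guest sees, is shown here. The helper name hidden_devices is hypothetical and not part of Xen or the kernel; only the parameter syntax is taken from the thread.

```python
import re

# Hypothetical helper, not part of xen-pciback: extracts the BDF
# (bus:device.function) entries from a "xen-pciback.hide=" parameter.
_HIDE_PARAM = re.compile(r"xen-pciback\.hide=((?:\([0-9a-fA-F:.]+\))+)")
_BDF = re.compile(r"\(([0-9a-fA-F:.]+)\)")


def hidden_devices(cmdline):
    """Return the BDF strings listed in xen-pciback.hide=, in order."""
    match = _HIDE_PARAM.search(cmdline)
    if match is None:
        return []
    return _BDF.findall(match.group(1))


# Command line as quoted in the original report.
cmdline = (
    "ro root=LABEL=dom0root "
    "xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) "
    "security=apparmor noirqdebug nouveau.msi=1"
)

for bdf in hidden_devices(cmdline):
    print(bdf)
```

In the report, the two 08:00.x functions are the GPU that still works, and the four 00:1d.x functions are the USB controllers that fail to start in the guest.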
Konrad Rzeszutek Wilk
2012-Aug-28 13:19 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
> > > > Anyhow, for right now just revert
> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > > > and it should work for you.
> > > >
> Confirmed, after reverting that commit, VT-d will work fine.
> Will you fix this and push it to upstream Linux, Konrad?

Yes, I plan to fix it - though I am not sure exactly how. The reset
functionality works - (too well one could say) - perhaps what I also
need is to enable the device after the reset.

> > > Also, our team reported a VT-d bug 2 months ago.
> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
Konrad Rzeszutek Wilk
2012-Sep-05 18:54 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
> > > > And it's due to a patch I added in v3.4
> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> > > > - which did not work properly in v3.4, but with v3.5 got it working
> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to not work
> > > > anymore.
> > > >
> > > > Anyhow, for right now just revert
> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > > > and it should work for you.
> > > >
> Confirmed, after reverting that commit, VT-d will work fine.
> Will you fix this and push it to upstream Linux, Konrad?
>
> > > Also, our team reported a VT-d bug 2 months ago.
> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824

Can either one of you please test this patch:

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 097e536..425bd0b 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -4,6 +4,8 @@
  * Ryan Wilson <hap9@epoch.ncsc.mil>
  * Chris Bookholt <hap10@epoch.ncsc.mil>
  */
+#define DEBUG 1
+
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/rwsem.h>
@@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref *kref)
 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
+	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
 	__pci_reset_function_locked(psdev->dev);
 	if (pci_load_and_free_saved_state(psdev->dev,
 					  &dev_data->pci_saved_state)) {
 		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
-	} else
+	} else {
+		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
 		pci_restore_state(psdev->dev);
-
+	}
 	/* Disable the device */
 	xen_pcibk_reset_device(psdev->dev);
 
@@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct pci_dev *dev)
 	if (err)
 		goto config_release;
 
-	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
-	__pci_reset_function_locked(dev);
-
 	/* We need the device active to save the state. */
 	dev_dbg(&dev->dev, "save state of device\n");
 	pci_save_state(dev);
 	dev_data->pci_saved_state = pci_store_saved_state(dev);
 	if (!dev_data->pci_saved_state)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
-
+	else {
+		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
+		__pci_reset_function_locked(dev);
+	}
 	/* Now disable the device (this also ensures some private device
 	 * data is setup before we export)
 	 */
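One way to read the reordering in the patch above is that the state must be saved while the device is still active, because whatever gets saved is what the release path restores later. The toy Python model below sketches that reading only. It is an interpretation, not something the thread states, and every name and register value in it (ToyDevice, ACTIVE, AFTER_RESET) is hypothetical rather than a kernel API or real hardware data.

```python
# Toy model only: a "device" is a dict of config registers, and a
# function-level reset (FLR) clears them. None of these names are kernel APIs.
ACTIVE = {"command": 0x0007, "bar0": 0xFBD00000}
AFTER_RESET = {"command": 0x0000, "bar0": 0x00000000}


class ToyDevice:
    def __init__(self):
        self.config = dict(ACTIVE)
        self.saved = None

    def reset(self):
        # Stand-in for __pci_reset_function_locked(): live state is wiped.
        self.config = dict(AFTER_RESET)

    def save_state(self):
        # Stand-in for pci_save_state() + pci_store_saved_state().
        self.saved = dict(self.config)

    def restore_state(self):
        # Stand-in for pci_load_and_free_saved_state() + pci_restore_state().
        self.config = dict(self.saved)


def init_reset_then_save():
    """Ordering before the patch: reset first, then save."""
    dev = ToyDevice()
    dev.reset()
    dev.save_state()
    return dev


def init_save_then_reset():
    """Ordering proposed in the patch: save first, then reset."""
    dev = ToyDevice()
    dev.save_state()
    dev.reset()
    return dev


def release(dev):
    """Release path modelled on pcistub_device_release(): reset, then restore."""
    dev.reset()
    dev.restore_state()
    return dev.config


print(release(init_reset_then_save()))   # restores the post-reset snapshot
print(release(init_save_then_reset()))   # restores the active snapshot
```

In this model, the old ordering can only ever restore the wiped snapshot, while the new ordering restores the configuration the device had before pciback touched it.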
Tobias Geiger
2012-Sep-06 11:28 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
Hello Konrad,

the patch helps regarding the USB-PCIController-Passthrough - this
works now in DomU.

but i still get the Dom0 crash when shutting down DomU:

Sep 6 13:26:19 pc kernel: [ 361.011514] xen-blkback:backend/vbd/1/832: prepare for reconnect
Sep 6 13:26:20 pc kernel: [ 361.876395] xen-blkback:backend/vbd/1/768: prepare for reconnect
Sep 6 13:26:21 pc kernel: [ 362.682152] br0: port 3(vif1.0) entered disabled state
Sep 6 13:26:21 pc kernel: [ 362.682267] br0: port 3(vif1.0) entered disabled state
Sep 6 13:26:24 pc kernel: [ 365.541386] ------------[ cut here ]------------
Sep 6 13:26:24 pc kernel: [ 365.541411] invalid opcode: 0000 [#1] PREEMPT SMP
Sep 6 13:26:24 pc kernel: [ 365.541423] CPU 2
Sep 6 13:26:24 pc kernel: [ 365.541427] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
Sep 6 13:26:24 pc kernel: [ 365.541474]
Sep 6 13:26:24 pc kernel: [ 365.541477] Pid: 1208, comm: kworker/2:1 Not tainted 3.5.0 #3 /DX58SO
Sep 6 13:26:24 pc kernel: [ 365.541491] RIP: e030:[<ffffffff81447f95>] [<ffffffff81447f95>] balloon_process+0x385/0x3a0
Sep 6 13:26:24 pc kernel: [ 365.541507] RSP: e02b:ffff88012e7abdc0 EFLAGS: 00010213
Sep 6 13:26:24 pc kernel: [ 365.541515] RAX: 0000000220be7000 RBX: 0000000000000000 RCX: 0000000000000008
Sep 6 13:26:24 pc kernel: [ 365.541523] RDX: ffff88010d99a000 RSI: 00000000000001df RDI: 000000000020efdf
Sep 6 13:26:24 pc kernel: [ 365.541532] RBP: ffff88012e7abe20 R08: ffff88014064e140 R09: 00000000fffffffe
Sep 6 13:26:24 pc kernel: [ 365.541540] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
Sep 6 13:26:24 pc kernel: [ 365.541548] R13: 0000000000000001 R14: 000000000020efdf R15: ffffea00083bf7c0
Sep 6 13:26:24 pc kernel: [ 365.541561] FS: 00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
Sep 6 13:26:24 pc kernel: [ 365.541571] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 6 13:26:24 pc kernel: [ 365.541578] CR2: 00007f79d2d6ce02 CR3: 0000000001e0c000 CR4: 0000000000002660
Sep 6 13:26:24 pc kernel: [ 365.541587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 6 13:26:24 pc kernel: [ 365.541596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 6 13:26:24 pc kernel: [ 365.541604] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
Sep 6 13:26:24 pc kernel: [ 365.541613] Stack:
Sep 6 13:26:24 pc kernel: [ 365.541618] 000000000006877b 0000000000000001 ffffffff8200ea80 0000000000000001
Sep 6 13:26:24 pc kernel: [ 365.541649] 0000000000000000 0000000000007ff0 ffff88012e7abe00 ffff8801302eee00
Sep 6 13:26:24 pc kernel: [ 365.541664] ffff880140657000 ffff88014064e140 0000000000000000 ffffffff81e587c0
Sep 6 13:26:24 pc kernel: [ 365.541679] Call Trace:
Sep 6 13:26:24 pc kernel: [ 365.541688] [<ffffffff8106753b>] process_one_work+0x12b/0x450
Sep 6 13:26:24 pc kernel: [ 365.541697] [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
Sep 6 13:26:24 pc kernel: [ 365.541706] [<ffffffff810688be>] worker_thread+0x12e/0x2d0
Sep 6 13:26:24 pc kernel: [ 365.541715] [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
Sep 6 13:26:24 pc kernel: [ 365.541725] [<ffffffff8106db7e>] kthread+0x8e/0xa0
Sep 6 13:26:24 pc kernel: [ 365.541735] [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
Sep 6 13:26:24 pc kernel: [ 365.541745] [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
Sep 6 13:26:24 pc kernel: [ 365.541754] [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
Sep 6 13:26:24 pc kernel: [ 365.541760] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
Sep 6 13:26:24 pc kernel: [ 365.541898] RSP <ffff88012e7abdc0>
Sep 6 13:26:24 pc kernel: [ 365.565054] ---[ end trace 25eb9ce0cc61c3a1 ]---
Sep 6 13:26:24 pc kernel: [ 365.565101] PGD 1e0e067 PUD 1e0f067 PMD 0
Sep 6 13:26:24 pc kernel: [ 365.565108] Oops: 0000 [#2] PREEMPT SMP
Sep 6 13:26:24 pc kernel: [ 365.565115] CPU 2
Sep 6 13:26:24 pc kernel: [ 365.565118] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
Sep 6 13:26:24 pc kernel: [ 365.565153]
Sep 6 13:26:24 pc kernel: [ 365.565156] Pid: 1208, comm: kworker/2:1 Tainted: G D 3.5.0 #3 /DX58SO
Sep 6 13:26:24 pc kernel: [ 365.565176] RIP: e030:[<ffffffff8106e08c>] [<ffffffff8106e08c>] kthread_data+0xc/0x20
Sep 6 13:26:24 pc kernel: [ 365.565194] RSP: e02b:ffff88012e7aba90 EFLAGS: 00010092
Sep 6 13:26:24 pc kernel: [ 365.565205] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
Sep 6 13:26:24 pc kernel: [ 365.565219] RDX: ffffffff81fcba40 RSI: 0000000000000002 RDI: ffff88013101c440
Sep 6 13:26:24 pc kernel: [ 365.565233] RBP: ffff88012e7abaa8 R08: 0000000000989680 R09: ffffffff81fcba40
Sep 6 13:26:24 pc kernel: [ 365.565248] R10: ffffffff813b0c00 R11: 0000000000000000 R12: ffff8801406536c0
Sep 6 13:26:24 pc kernel: [ 365.565262] R13: 0000000000000002 R14: ffff88013101c430 R15: ffff88013101c440
Sep 6 13:26:24 pc kernel: [ 365.565280] FS: 00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
Sep 6 13:26:24 pc kernel: [ 365.565293] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 6 13:26:24 pc kernel: [ 365.565303] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
Sep 6 13:26:24 pc kernel: [ 365.565318] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 6 13:26:24 pc kernel: [ 365.565332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 6 13:26:24 pc kernel: [ 365.565349] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
Sep 6 13:26:24 pc kernel: [ 365.565362] Stack:
Sep 6 13:26:24 pc kernel: [ 365.565367] ffffffff810698e0 ffff88012e7abaa8 ffff88013101c818 ffff88012e7abb18
Sep 6 13:26:24 pc kernel: [ 365.565389] ffffffff8184ae02 ffff88012e7abfd8 ffff88013101c440 ffff88012e7abfd8
Sep 6 13:26:24 pc kernel: [ 365.565410] ffff88012e7abfd8 ffff88012d8840c0 ffff88013101c440 ffff88013101ca30

Perhaps this stacktrace helps...

Thanks!
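The RIP and Call Trace entries in the log above use the form symbol+0xOFFSET/0xSIZE, where the offset is how far into the function the address lies and the size is the length of the function. A small Python sketch for pulling those fields out of a trace line follows. The helper name parse_symbol is hypothetical, and the sample line is the first RIP entry from the log.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form used in oops and call-trace lines.
_SYMBOL = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_symbol(line):
    """Return (symbol, offset, size) from a trace line, or None if absent."""
    match = _SYMBOL.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


rip_line = (
    "RIP: e030:[<ffffffff81447f95>] [<ffffffff81447f95>] "
    "balloon_process+0x385/0x3a0"
)
symbol, offset, size = parse_symbol(rip_line)
print(f"{symbol}: faulting instruction at byte {offset} of {size}")
```

For the first oops this points at balloon_process, which is the function that runs the Xen balloon driver's worker; the second oops in kthread_data happens while the kernel is tearing down the same worker thread.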
Tobias Geiger
2012-Sep-06 11:32 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
FYI: This Dom0 crash only happens when i shut down the DomU from within
the DomU, meaning when i choose "Start - Shutdown" within Windows 7.
The crash does NOT happen when i do "xl shutdown domu" ... ?! :)

Greetings
Tobias
Tobias Geiger
2012-Sep-06 11:46 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
me again :)

it seems the crash is not always a "fatal one":

[ 247.080617] vif vif-2-0: 2 reading script
[ 247.083519] br0: port 4(vif2.0) entered disabled state
[ 247.084144] br0: port 4(vif2.0) entered disabled state
[ 250.700029] ------------[ cut here ]------------
[ 250.700046] kernel BUG at drivers/xen/balloon.c:359!
[ 250.700059] invalid opcode: 0000 [#1] PREEMPT SMP
[ 250.700071] CPU 4
[ 250.700075] Modules linked in: joydev hid_generic uvcvideo snd_usb_audio snd_seq_midi snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich [last unloaded: scsi_wait_scan]
[ 250.700122]
[ 250.700125] Pid: 23, comm: kworker/4:0 Not tainted 3.5.0 #3 /DX58SO
[ 250.700139] RIP: e030:[<ffffffff81447f95>] [<ffffffff81447f95>] balloon_process+0x385/0x3a0
[ 250.700158] RSP: e02b:ffff8801317b9dc0 EFLAGS: 00010213
[ 250.700162] RAX: 000000021f895000 RBX: 0000000000000000 RCX: 0000000000000002
[ 250.700167] RDX: ffffffff82027000 RSI: 0000000000000137 RDI: 00000000000a2337
[ 250.700172] RBP: ffff8801317b9e20 R08: ffff88014068e140 R09: 00000000fffffffc
[ 250.700180] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
[ 250.700185] R13: 0000000000000001 R14: 00000000000a2337 R15: ffffea000288cdc0
[ 250.700192] FS: 00007fb82ee14700(0000) GS:ffff880140680000(0000) knlGS:0000000000000000
[ 250.700198] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 250.700202] CR2: 00007fb82e7b39a6 CR3: 0000000001e0c000 CR4: 0000000000002660
[ 250.700207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 250.700213] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 250.700218] Process kworker/4:0 (pid: 23, threadinfo ffff8801317b8000, task ffff88013178db00)
[ 250.700223] Stack:
[ 250.700225] 000000000006aa7b 0000000000000001 ffffffff8200ea80 0000000000000001
[ 250.700293] 0000000000000000 0000000000007ff0 ffff8801317b9e00 ffff880131796400
[ 250.700301] ffff880140697000 ffff88014068e140 0000000000000000 ffffffff81e587c0
[ 250.700311] Call Trace:
[ 250.700317] [<ffffffff8106753b>] process_one_work+0x12b/0x450
[ 250.700322] [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
[ 250.700328] [<ffffffff810688be>] worker_thread+0x12e/0x2d0
[ 250.700334] [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
[ 250.700340] [<ffffffff8106db7e>] kthread+0x8e/0xa0
[ 250.700346] [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
[ 250.700353] [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
[ 250.700358] [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
[ 250.700362] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
[ 250.700471] RIP [<ffffffff81447f95>] balloon_process+0x385/0x3a0
[ 250.700482] RSP <ffff8801317b9dc0>
[ 250.733955] ---[ end trace a5e5187e8ed6c1ff ]---
[ 250.733982] BUG: unable to handle kernel paging request at fffffffffffffff8
[ 250.733992] IP: [<ffffffff8106e08c>] kthread_data+0xc/0x20
[ 250.733999] PGD 1e0e067 PUD 1e0f067 PMD 0
[ 250.734006] Oops: 0000 [#2] PREEMPT SMP
[ 250.734013] CPU 4
[ 250.734016] Modules linked in: joydev hid_generic uvcvideo snd_usb_audio snd_seq_midi snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich [last unloaded: scsi_wait_scan]
[ 250.734071]
[ 250.734073] Pid: 23, comm: kworker/4:0 Tainted: G D 3.5.0 #3 /DX58SO
[ 250.734095] RIP: e030:[<ffffffff8106e08c>] [<ffffffff8106e08c>] kthread_data+0xc/0x20
[ 250.734111] RSP: e02b:ffff8801317b9a90 EFLAGS: 00010092
[ 250.734122] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000004
[ 250.734137] RDX: ffffffff81fcba40 RSI: 0000000000000004 RDI: ffff88013178db00
[ 250.734151] RBP: ffff8801317b9aa8 R08: 0000000000989680 R09: ffffffff81fcba40
[ 250.734166] R10: ffffffff8104960a R11: 0000000000000000 R12: ffff8801406936c0
[ 250.734178] R13: 0000000000000004 R14: ffff88013178daf0 R15: ffff88013178db00
[ 250.734196] FS: 00007fb82ee14700(0000) GS:ffff880140680000(0000) knlGS:0000000000000000
[ 250.734202] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 250.734209] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
[ 250.734222] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 250.734235] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 250.734249] Process kworker/4:0 (pid: 23, threadinfo ffff8801317b8000, task ffff88013178db00)
[ 250.734266] Stack:
[ 250.734271] ffffffff810698e0 ffff8801317b9aa8 ffff88013178ded8 ffff8801317b9b18
[ 250.734292] ffffffff8184ae02 ffff8801317b9fd8 ffff88013178db00 ffff8801317b9fd8
[ 250.734313] ffff8801317b9fd8 ffff8801334796c0 ffff88013178db00 ffff8801317b9ae8
[ 250.734979] Call Trace:
[ 250.735572] [<ffffffff810698e0>] ? wq_worker_sleeping+0x10/0xa0
[ 250.736179] [<ffffffff8184ae02>] __schedule+0x592/0x7d0
[ 250.736783] [<ffffffff8184b164>] schedule+0x24/0x70
[ 250.737373] [<ffffffff81051592>] do_exit+0x5b2/0x910
[ 250.737937] [<ffffffff8183ea1e>] ? printk+0x48/0x4a
[ 250.738498] [<ffffffff8100ace2>] ? check_events+0x12/0x20
[ 250.739053] [<ffffffff81017581>] oops_end+0x71/0xa0
[ 250.739596] [<ffffffff810176f3>] die+0x53/0x80
[ 250.740134] [<ffffffff810143f8>] do_trap+0xb8/0x160
[ 250.740668] [<ffffffff810146f3>] do_invalid_op+0xa3/0xb0
[ 250.741203] [<ffffffff81447f95>] ? balloon_process+0x385/0x3a0
[ 250.741737] [<ffffffff81085f52>] ? load_balance+0xd2/0x800
[ 250.742267] [<ffffffff81006276>] ? xen_flush_tlb+0xd6/0x2a0
[ 250.742803] [<ffffffff8108117d>] ? cpuacct_charge+0x6d/0xb0
[ 250.743332] [<ffffffff8184e25b>] invalid_op+0x1b/0x20
[ 250.743855] [<ffffffff81447f95>] ? balloon_process+0x385/0x3a0
[ 250.744374] [<ffffffff8106753b>] process_one_work+0x12b/0x450
[ 250.744897] [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
[ 250.745426] [<ffffffff810688be>] worker_thread+0x12e/0x2d0
[ 250.745942] [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
[ 250.746457] [<ffffffff8106db7e>] kthread+0x8e/0xa0
[ 250.746969] [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
[ 250.747480] [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
[ 250.747990] [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
[ 250.748487] Code: e0 ff ff 01 48 8b 80 38 e0 ff ff a8 08 0f 84 3d ff ff ff e8 57 d0 7d 00 e9 33 ff ff ff 66 90 48 8b 87 80 03 00 00 55 48 89 e5 5d <48> 8b 40 f8 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55
[ 250.749575] RIP [<ffffffff8106e08c>] kthread_data+0xc/0x20
[ 250.750103] RSP <ffff8801317b9a90>
[ 250.750627] CR2: fffffffffffffff8
[ 250.751151] ---[ end trace a5e5187e8ed6c200 ]---
[ 250.751152] Fixing recursive fault but reboot is needed!
[ 311.042233] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=60011 jiffies)
[ 311.042237] INFO: Stall ended before state dump start
[ 491.279642] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=240249 jiffies)
[ 491.279646] INFO: Stall ended before state dump start
[ 671.670546] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=420638 jiffies)
[ 671.670550] INFO: Stall ended before state dump start
[ 763.240862] INFO: rcu_bh detected stalls on CPUs/tasks: { 1 4} (detected by 5, t=63547 jiffies)
[ 763.240867] INFO: Stall ended before state dump start
[ 853.438186] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=602410 jiffies)
[ 853.438190] INFO: Stall ended before state dump start
[ 943.632087] INFO: rcu_bh detected stalls on CPUs/tasks: { 1 4} (detected by 0, t=243935 jiffies)
[ 943.632092] INFO: Stall ended before state dump start
[ 1033.828726] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=782798 jiffies)
[ 1033.828729] INFO: Stall ended before state dump start

Now Dom0 still reacts, but is so sluggish that it is mostly unusable...
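The RCU stall messages in the log above report the stall length in jiffies. The Python sketch below converts one of them to seconds and subtracts it from the log timestamp to estimate when the stall began. HZ=1000 is an assumption: the thread never states the kernel's CONFIG_HZ, and the value is only suggested by how the timestamps line up. The helper name stall_seconds is hypothetical.

```python
import re

# Assumption: HZ=1000. The thread does not state the kernel's CONFIG_HZ;
# the value is only suggested by the timestamps in the log.
ASSUMED_HZ = 1000

_STALL = re.compile(
    r"\[\s*(?P<stamp>\d+\.\d+)\] INFO: (?P<flavor>rcu_\w+) detected stalls "
    r".*t=(?P<jiffies>\d+) jiffies"
)


def stall_seconds(line, hz=ASSUMED_HZ):
    """Return (log timestamp, RCU flavor, stall length in seconds), or None."""
    match = _STALL.search(line)
    if match is None:
        return None
    return (
        float(match.group("stamp")),
        match.group("flavor"),
        int(match.group("jiffies")) / hz,
    )


line = (
    "[ 311.042233] INFO: rcu_preempt detected stalls on CPUs/tasks: "
    "{ 4} (detected by 7, t=60011 jiffies)"
)
stamp, flavor, seconds = stall_seconds(line)
print(f"{flavor}: stalled for {seconds:.1f}s, so since about {stamp - seconds:.1f}s")
```

Under that assumption the first stall would have started at roughly 251 s, just after the second oops on CPU 4 at about 250.7 s, which would fit CPU 4 being the one that never recovers.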
Konrad Rzeszutek Wilk
2012-Sep-06 13:05 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On Thu, Sep 06, 2012 at 01:28:00PM +0200, Tobias Geiger wrote:
> Hello Konrad,
>
> the patch helps regarding the USB-PCIController-Passthrough - this
> works now in DomU.

Good. Can I put your Reported-by and Tested-by tags on it?

>
> but i still get the Dom0 crash when shutting down DomU:

That is a different issue. Take a look at the
" dom0 linux 3.6.0-rc4, crash due to ballooning althoug dom0_mem=X, max:X set "
thread please.
Tobias Geiger
2012-Sep-06 13:24 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On 06.09.2012 15:05, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 06, 2012 at 01:28:00PM +0200, Tobias Geiger wrote:
>> Hello Konrad,
>>
>> the patch helps regarding the USB-PCIController-Passthrough - this
>> works now in DomU.
>
> Good. Can I put your Reported-by and Tested-by tags on it?

Of course. Thanks for the fix!

>>
>> but i still get the Dom0 crash when shutting down DomU:
>
> That is a different issue. Take a look at the
> " dom0 linux 3.6.0-rc4, crash due to ballooning althoug dom0_mem=X,
> max:X set "
> thread please.

OK, will do!

Greetings
Tobias
Ren, Yongjie
2012-Sep-07 02:08 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
> -----Original Message-----
> From: Tobias Geiger [mailto:tobias.geiger@vido.info]
> Sent: Thursday, September 06, 2012 7:28 PM
> To: Konrad Rzeszutek Wilk
> Cc: Ren, Yongjie; Konrad Rzeszutek Wilk; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> Passthrough?!
>
> Hello Konrad,
>
> the patch helps regarding the USB-PCIController-Passthrough - this
> works now in DomU.
>
Hi Tobias,
In my testing, this patch doesn't work for a NIC pass-through.
Could you try it with a NIC pass-through?

> but i still get the Dom0 crash when shutting down DomU:
>
> Sep 6 13:26:19 pc kernel: [ 361.011514] xen-blkback:backend/vbd/1/832: prepare for reconnect
> Sep 6 13:26:20 pc kernel: [ 361.876395] xen-blkback:backend/vbd/1/768: prepare for reconnect
> Sep 6 13:26:21 pc kernel: [ 362.682152] br0: port 3(vif1.0) entered disabled state
> Sep 6 13:26:21 pc kernel: [ 362.682267] br0: port 3(vif1.0) entered disabled state
> Sep 6 13:26:24 pc kernel: [ 365.541386] ------------[ cut here ]------------
> Sep 6 13:26:24 pc kernel: [ 365.541411] invalid opcode: 0000 [#1] PREEMPT SMP
> Sep 6 13:26:24 pc kernel: [ 365.541423] CPU 2
> Sep 6 13:26:24 pc kernel: [ 365.541427] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
> Sep 6 13:26:24 pc kernel: [ 365.541474]
> Sep 6 13:26:24 pc kernel: [ 365.541477] Pid: 1208, comm: kworker/2:1 Not tainted 3.5.0 #3 /DX58SO
> Sep 6 13:26:24 pc kernel: [ 365.541491] RIP: e030:[<ffffffff81447f95>] [<ffffffff81447f95>] balloon_process+0x385/0x3a0
> Sep 6 13:26:24 pc kernel: [ 365.541507] RSP: e02b:ffff88012e7abdc0 EFLAGS: 00010213
> Sep 6 13:26:24 pc kernel: [ 365.541515] RAX: 0000000220be7000 RBX: 0000000000000000 RCX: 0000000000000008
> Sep 6 13:26:24 pc kernel: [ 365.541523] RDX: ffff88010d99a000 RSI: 00000000000001df RDI: 000000000020efdf
> Sep 6 13:26:24 pc kernel: [ 365.541532] RBP: ffff88012e7abe20 R08: ffff88014064e140 R09: 00000000fffffffe
> Sep 6 13:26:24 pc kernel: [ 365.541540] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
> Sep 6 13:26:24 pc kernel: [ 365.541548] R13: 0000000000000001 R14: 000000000020efdf R15: ffffea00083bf7c0
> Sep 6 13:26:24 pc kernel: [ 365.541561] FS: 00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
> Sep 6 13:26:24 pc kernel: [ 365.541571] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> Sep 6 13:26:24 pc kernel: [ 365.541578] CR2: 00007f79d2d6ce02 CR3: 0000000001e0c000 CR4: 0000000000002660
> Sep 6 13:26:24 pc kernel: [ 365.541587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Sep 6 13:26:24 pc kernel: [ 365.541596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Sep 6 13:26:24 pc kernel: [ 365.541604] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
> Sep 6 13:26:24 pc kernel: [ 365.541613] Stack:
> Sep 6 13:26:24 pc kernel: [ 365.541618] 000000000006877b 0000000000000001 ffffffff8200ea80 0000000000000001
> Sep 6 13:26:24 pc kernel: [ 365.541649] 0000000000000000 0000000000007ff0 ffff88012e7abe00 ffff8801302eee00
> Sep 6 13:26:24 pc kernel: [ 365.541664] ffff880140657000 ffff88014064e140 0000000000000000 ffffffff81e587c0
> Sep 6 13:26:24 pc kernel: [ 365.541679] Call Trace:
> Sep 6 13:26:24 pc kernel: [ 365.541688] [<ffffffff8106753b>] process_one_work+0x12b/0x450
> Sep 6 13:26:24 pc kernel: [ 365.541697] [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
> Sep 6 13:26:24 pc kernel: [ 365.541706] [<ffffffff810688be>] worker_thread+0x12e/0x2d0
> Sep 6 13:26:24 pc kernel: [ 365.541715] [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
> Sep 6 13:26:24 pc kernel: [ 365.541725] [<ffffffff8106db7e>] kthread+0x8e/0xa0
> Sep 6 13:26:24 pc kernel: [ 365.541735] [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
> Sep 6 13:26:24 pc kernel: [ 365.541745] [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
> Sep 6 13:26:24 pc kernel: [ 365.541754] [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
> Sep 6 13:26:24 pc kernel: [ 365.541760] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
> Sep 6 13:26:24 pc kernel: [ 365.541898] RSP <ffff88012e7abdc0>
> Sep 6 13:26:24 pc kernel: [ 365.565054] ---[ end trace 25eb9ce0cc61c3a1 ]---
> Sep 6 13:26:24 pc kernel: [ 365.565101] PGD 1e0e067 PUD 1e0f067 PMD 0
> Sep 6 13:26:24 pc kernel: [ 365.565108] Oops: 0000 [#2] PREEMPT SMP
> Sep 6 13:26:24 pc kernel: [ 365.565115] CPU 2
> Sep 6 13:26:24 pc kernel: [ 365.565118] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
> Sep 6 13:26:24 pc kernel: [ 365.565153]
> Sep 6 13:26:24 pc kernel: [ 365.565156] Pid: 1208, comm: kworker/2:1 Tainted: G D 3.5.0 #3 /DX58SO
> Sep 6 13:26:24 pc kernel: [ 365.565176] RIP: e030:[<ffffffff8106e08c>] [<ffffffff8106e08c>] kthread_data+0xc/0x20
> Sep 6 13:26:24 pc kernel: [ 365.565194] RSP: e02b:ffff88012e7aba90 EFLAGS: 00010092
> Sep 6 13:26:24 pc kernel: [ 365.565205] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
> Sep 6 13:26:24 pc kernel: [ 365.565219] RDX: ffffffff81fcba40 RSI: 0000000000000002 RDI: ffff88013101c440
> Sep 6 13:26:24 pc kernel: [ 365.565233] RBP: ffff88012e7abaa8 R08: 0000000000989680 R09: ffffffff81fcba40
> Sep 6 13:26:24 pc kernel: [ 365.565248] R10: ffffffff813b0c00 R11: 0000000000000000 R12: ffff8801406536c0
> Sep 6 13:26:24 pc kernel: [ 365.565262] R13: 0000000000000002 R14: ffff88013101c430 R15: ffff88013101c440
> Sep 6 13:26:24 pc kernel: [ 365.565280] FS: 00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
> Sep 6 13:26:24 pc kernel: [ 365.565293] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> Sep 6 13:26:24 pc kernel: [ 365.565303] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
> Sep 6 13:26:24 pc kernel: [ 365.565318] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Sep 6 13:26:24 pc kernel: [ 365.565332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Sep 6 13:26:24 pc kernel: [ 365.565349] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
> Sep 6 13:26:24 pc kernel: [ 365.565362] Stack:
> Sep 6 13:26:24 pc kernel: [ 365.565367] ffffffff810698e0 ffff88012e7abaa8 ffff88013101c818 ffff88012e7abb18
> Sep 6 13:26:24 pc kernel: [ 365.565389] ffffffff8184ae02 ffff88012e7abfd8 ffff88013101c440 ffff88012e7abfd8
> Sep 6 13:26:24 pc kernel: [ 365.565410] ffff88012e7abfd8 ffff88012d8840c0 ffff88013101c440 ffff88013101ca30
>
> Perhaps this stacktrace helps...
>
> Thanks!
>
> On 05.09.2012 20:54, Konrad Rzeszutek Wilk wrote:
> >> > > > And it's due to a patch I added in v3.4
> >> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> >> > > > - which did not work properly in v3.4, but with v3.5 got it working
> >> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to not
> >> > > > work anymore.
> >> > > >
> >> > > > Anyhow, for right now just revert
> >> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> >> > > > and it should work for you.
> >> > > >
> >> Confirmed, after reverting that commit, VT-d will work fine.
> >> Will you fix this and push it to upstream Linux, Konrad?
> >>
> >> > > Also, our team reported a VT-d bug 2 months ago.
> >> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
> >> > >
> >
> > Can either one of you please test this patch, please:
> >
> >
> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> > index 097e536..425bd0b 100644
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> > @@ -4,6 +4,8 @@
> >   * Ryan Wilson <hap9@epoch.ncsc.mil>
> >   * Chris Bookholt <hap10@epoch.ncsc.mil>
> >   */
> > +#define DEBUG 1
> > +
> >  #include <linux/module.h>
> >  #include <linux/init.h>
> >  #include <linux/rwsem.h>
> > @@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref *kref)
> >      /* Call the reset function which does not take lock as this
> >       * is called from "unbind" which takes a device_lock mutex.
> >       */
> > +    dev_dbg(&psdev->dev->dev, "FLR locked..\n");
> >      __pci_reset_function_locked(psdev->dev);
> >      if (pci_load_and_free_saved_state(psdev->dev,
> >                                        &dev_data->pci_saved_state)) {
> >          dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
> > -    } else
> > +    } else {
> > +        dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
> >          pci_restore_state(psdev->dev);
> > -
> > +    }
> >      /* Disable the device */
> >      xen_pcibk_reset_device(psdev->dev);
> >
> > @@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct pci_dev *dev)
> >      if (err)
> >          goto config_release;
> >
> > -    dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> > -    __pci_reset_function_locked(dev);
> > -
> >      /* We need the device active to save the state. */
> >      dev_dbg(&dev->dev, "save state of device\n");
> >      pci_save_state(dev);
> >      dev_data->pci_saved_state = pci_store_saved_state(dev);
> >      if (!dev_data->pci_saved_state)
> >          dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> > -
> > +    else {
> > +        dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> > +        __pci_reset_function_locked(dev);
> > +    }
> >      /* Now disable the device (this also ensures some private device
> >       * data is setup before we export)
> >       */
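
The second hunk of the patch quoted above changes the order of two steps in pcistub_init_device(): the device state is saved first, and the reset only runs if the saved state could be stored. The following is a minimal user-space sketch of that ordering only. The struct fake_dev type and the fake_* helpers are hypothetical stand-ins invented for illustration, not kernel APIs, and the rest of the function (disabling the device, error paths) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel type. */
struct fake_dev {
    bool state_saved; /* set once the (simulated) config state is stored */
    bool was_reset;   /* set once the (simulated) reset has run */
    bool save_fails;  /* knob to simulate a failing save step */
};

/* Simulates pci_save_state() followed by pci_store_saved_state(). */
static bool fake_save_state(struct fake_dev *d)
{
    if (d->save_fails)
        return false;
    d->state_saved = true;
    return true;
}

/* Simulates __pci_reset_function_locked(). */
static void fake_reset(struct fake_dev *d)
{
    d->was_reset = true;
}

/* Ordering after the patch: save first, reset only if the save worked. */
static void init_device_patched(struct fake_dev *d)
{
    assert(d != NULL);

    if (!fake_save_state(d)) {
        fprintf(stderr, "Could not store PCI conf saved state!\n");
        return; /* the sketch stops here; the real function carries on */
    }
    fake_reset(d);
}
```

In this model a device whose state cannot be saved is never reset, which is the behavioural difference from the pre-patch code, where the reset ran unconditionally before the save.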
Tobias Geiger
2012-Sep-07 10:37 UTC
Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
On 07.09.2012 04:08, Ren, Yongjie wrote:
>> Hello Konrad,
>>
>> the patch helps regarding the USB-PCIController-Passthrough - this
>> works now in DomU.
>>
> Hi Tobias,
> In my testing, this patch can't work for a NIC pass-through.
> Could you have a try with a NIC pass-through?

Hi!

Unfortunately not - I have no physical access to the machine until the
middle of next week, and as soon as I try to pass through the only NIC
in this machine, my remote access will die... :(

Perhaps someone else can test NIC passthrough? If not, I'll try it ASAP
next week!

Greetings
Tobias
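
A note on reading the first oops reported in this thread ("invalid opcode: 0000 [#1]" in balloon_process): in the "Code:" line the byte in angle brackets marks the faulting instruction, and there it is 0f followed by 0b. That byte pair is the x86 ud2 instruction, which the kernel's BUG()/BUG_ON() macros emit on x86, so the crash is most likely a deliberate assertion in the balloon driver rather than a stray jump. Below is a small user-space sketch that checks a "Code:" line for this pattern. The function name faulting_insn_is_ud2 is made up for illustration, and the parser assumes the usual space-separated hex byte layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns true when the byte marked <xx> in an oops "Code:" line,
 * together with the byte after it, forms 0f 0b (x86 ud2).
 */
static bool faulting_insn_is_ud2(const char *code_line)
{
    assert(code_line != NULL);

    /* The faulting byte is the one wrapped in angle brackets. */
    const char *open = strchr(code_line, '<');
    if (open == NULL)
        return false;

    char *end;
    unsigned long first = strtoul(open + 1, &end, 16);
    if (end == open + 1 || *end != '>')
        return false;

    /* strtoul skips the blank between "<0f>" and the next byte. */
    const char *next = end + 1;
    unsigned long second = strtoul(next, &end, 16);
    if (end == next)
        return false;

    return first == 0x0f && second == 0x0b;
}
```

Applied to the trace in this thread, the marked bytes "<0f> 0b" match. The second oops (in kthread_data) is a follow-on fault in the same kworker after the first one, so the balloon_process assertion is the one worth chasing.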