I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don't work, and I get the following error messages from the DomU kernel during boot:

[    1.783658] Using IPI No-Shortcut mode
[   11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
[   11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)
[   11.880079] XENBUS: Timeout connecting to device: device/vif/0 (state 0)
[   11.880146] md: Waiting for all devices to be available before autodetect

The DomU VM runs linux version 2.6.30.1 and has worked perfectly on other systems running a 2.6.18 kernel under Xen 3.4.

Any ideas?

thanks,

Anthony Wright.
On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote:
> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don't work and I get the following error messages from the DomU kernel during boot:
>
> [    1.783658] Using IPI No-Shortcut mode
> [   11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
> [   11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)
> [   11.880079] XENBUS: Timeout connecting to device: device/vif/0 (state 0)
> [   11.880146] md: Waiting for all devices to be available before autodetect
>
> The DomU VM runs linux version 2.6.30.1 and has worked perfectly on other systems running a 2.6.18 kernel under Xen 3.4.
>
> Any ideas?
>

You should post your domU config file. Maybe the problem is some syntax change from Xen 3.4 to Xen 4.1, or from the 2.6.18 kernel to the 3.0 kernel.

Thanks,
Todd

--
Todd Deshane
http://www.linkedin.com/in/deshantm
http://www.xen.org/products/cloudxen.html
http://runningxen.com/
Anthony Wright
2011-Jul-28 15:36 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
On 28/07/2011 16:01, Todd Deshane wrote:
> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote:
>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don't work and I get the following error messages from the DomU kernel during boot:
>>
>> [    1.783658] Using IPI No-Shortcut mode
>> [   11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
>> [   11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)
>> [   11.880079] XENBUS: Timeout connecting to device: device/vif/0 (state 0)
>> [   11.880146] md: Waiting for all devices to be available before autodetect
>>
>> The DomU VM runs linux version 2.6.30.1 and has worked perfectly on other systems running a 2.6.18 kernel under Xen 3.4.
>>
>> Any ideas?
>>
> You should post your domU config file. Maybe the problem is some
> syntax change from Xen 3.4 to Xen 4.1 or from 2.6.18 kernel to 3.0
> kernel.
>
> Thanks,
> Todd

I've attached the Dom0 & the DomU kernel configs. Dom0 is running linux 3.0, DomU is running linux 2.6.30.1.

I'm somewhat surprised that the DomU kernel should be important, since I know it runs under Xen 3.4. Does this mean that to upgrade from Xen 3.4 to 4.1 I've also got to upgrade all my VMs?

thanks,

Anthony.
On Thu, Jul 28, 2011 at 3:36 PM, Anthony Wright <anthony@overnetdata.com> wrote:> On 28/07/2011 16:01, Todd Deshane wrote: >> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote: >>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don''t work and I get the following error messages from the DomU kernel during boot: >>> >>> [ 1.783658] Using IPI No-Shortcut mode >>> [ 11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3) >>> [ 11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3) >>> [ 11.880079] XENBUS: Timeout connecting to device: device/vif/0 (state 0) >>> [ 11.880146] md: Waiting for all devices to be available before autodetect >>> >>> The DomU VM runs linux version 2.6.30.1 and has worked perfectly on other systems running a 2.6.18 kernel under Xen 3.4. >>> >>> Any ideas? >>> >> You should post your domU config file. Maybe the problem is some >> syntax change from Xen 3.4 to Xen 4.1 or from 2.6.18 kernel to 3.0 >> kernel. >> >> Thanks, >> Todd > I''ve attached the Dom0 & the DomU kernel configs. Dom0 is running linux > 3.0, DomU is running linux 2.6.30.1.I meant the domU guest configuration file (the xm/xl one). I meant that you might be specifying the disk line incorrectly.> > I''m somewhat surprised that the DomU kernel should be important since I > know it runs under Xen 3.4. Does this mean that to upgrade from Xen 3.4 > to 4.1 I''ve also got to upgrade all my VMs?-- Todd Deshane http://www.linkedin.com/in/deshantm http://www.xen.org/products/cloudxen.html http://runningxen.com/ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Jul-28 16:00 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
On 28/07/2011 16:46, Todd Deshane wrote:> On Thu, Jul 28, 2011 at 3:36 PM, Anthony Wright <anthony@overnetdata.com> wrote: >> On 28/07/2011 16:01, Todd Deshane wrote: >>> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote: >>>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don''t work and I get the following error messages from the DomU kernel during boot: >>>> >>>> [ 1.783658] Using IPI No-Shortcut mode >>>> [ 11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3) >>>> [ 11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3) >>>> [ 11.880079] XENBUS: Timeout connecting to device: device/vif/0 (state 0) >>>> [ 11.880146] md: Waiting for all devices to be available before autodetect >>>> >>>> The DomU VM runs linux version 2.6.30.1 and has worked perfectly on other systems running a 2.6.18 kernel under Xen 3.4. >>>> >>>> Any ideas? >>>> >>> You should post your domU config file. Maybe the problem is some >>> syntax change from Xen 3.4 to Xen 4.1 or from 2.6.18 kernel to 3.0 >>> kernel. >>> >>> Thanks, >>> Todd >> I''ve attached the Dom0 & the DomU kernel configs. Dom0 is running linux >> 3.0, DomU is running linux 2.6.30.1. > I meant the domU guest configuration file (the xm/xl one). I meant > that you might be specifying the disk line incorrectly.I''ve attached the startup config. I''ve tried more RAM, but that doesn''t help, and I can''t find anything useful in the xen logs. thanks, Anthony. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
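The startup config itself went out as an attachment and is not preserved in the archive. For readers following the thread, here is a rough sketch of the kind of xm/xl PV guest config being discussed; the disk entries match the fragment quoted back later in the thread, while the name, kernel path, memory and vif settings are assumptions:

    # sketch only - disk lines from the thread, everything else assumed
    name   = "XenFileServer"
    kernel = "/path/to/domU/vmlinuz-2.6.30.1"
    memory = 256
    vif    = [ '' ]            # one vif with default settings
    disk   = [ 'tap:aio:/workspace/agent/appliances/XenFileServer-3.18/rootfs,xvda1,r'
             , 'tap:aio:/var/agent/running/1/sysconfig,xvda2,r'
             , 'phy:/dev/Master/Workspace-1,xvdb1,w'
             , 'phy:/dev/Master/Filesystem,xvdc1,w'
             ]
    root   = "/dev/xvda1 ro"   # assumed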
On Thu, 2011-07-28 at 11:36 -0400, Anthony Wright wrote:
> I'm somewhat surprised that the DomU kernel should be important since
> I know it runs under Xen 3.4. Does this mean that to upgrade from Xen
> 3.4 to 4.1 I've also got to upgrade all my VMs?

The domU ABI is stable so this should never be necessary (of course bugs do happen).

Ian.
Anthony Wright
2011-Jul-29 07:53 UTC
[Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
I've just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with the vga-support patch backported). I still can't get my DomUs to work, because the phy disks and vifs time out in the DomU. Looking through my logs this morning, I'm getting a consistent kernel bug report with xen mentioned at the top of the stack trace and vifdisconnect on its third line, so I suspect it's related to the problem I'm seeing. I don't remember seeing the stack trace with 4.1.0 xen, but it's possible I missed it.

I've had the report on two consecutive boots and attach the message log from both, plus the Dom0 kernel config.

Anthony.

On 28/07/2011 17:28, Ian Campbell wrote:
> On Thu, 2011-07-28 at 11:36 -0400, Anthony Wright wrote:
>> I'm somewhat surprised that the DomU kernel should be important since
>> I know it runs under Xen 3.4. Does this mean that to upgrade from Xen
>> 3.4 to 4.1 I've also got to upgrade all my VMs?
> The domU ABI is stable so this should never be necessary (of course bugs
> do happen).
>
> Ian.
>
Anthony Wright
2011-Jul-29 15:48 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU (only on certain hardware)
Ok, the plot thickens...

I have installed virtually identical systems on two physical machines - identical (and I mean identical) xen, dom0 and domU, with possibly a slightly different configuration for the domU. They're as close as I can get without imaging the disk. On both machines Dom0 starts normally, but on one physical machine the DomU starts correctly, seeing the two tap:aio disks, the two phy disks and the vif, while on the other physical machine the DomU only sees the two tap:aio disks.

The two pieces of hardware are quite different: one is a 32 bit processor from 2000/2001 (this is the one that works), the other is a much more modern 64 bit processor about a year old.

On 28/07/2011 17:28, Ian Campbell wrote:
> On Thu, 2011-07-28 at 11:36 -0400, Anthony Wright wrote:
>> I'm somewhat surprised that the DomU kernel should be important since
>> I know it runs under Xen 3.4. Does this mean that to upgrade from Xen
>> 3.4 to 4.1 I've also got to upgrade all my VMs?
> The domU ABI is stable so this should never be necessary (of course bugs
> do happen).
>
> Ian.
>
Konrad Rzeszutek Wilk
2011-Jul-29 15:55 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
On Thu, Jul 28, 2011 at 05:00:13PM +0100, Anthony Wright wrote:
> On 28/07/2011 16:46, Todd Deshane wrote:
> > On Thu, Jul 28, 2011 at 3:36 PM, Anthony Wright <anthony@overnetdata.com> wrote:
> >> On 28/07/2011 16:01, Todd Deshane wrote:
> >>> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote:
> >>>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don't work and I get the following error messages from the DomU kernel during boot:
> >>>>
> >>>> [    1.783658] Using IPI No-Shortcut mode
> >>>> [   11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
> >>>> [   11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)

What device does that correspond to (hint: run xl block-list or xm block-list)?

> disk = [
>   'tap:aio:/workspace/agent/appliances/XenFileServer-3.18/rootfs,xvda1,r'
>   ,'tap:aio:/var/agent/running/1/sysconfig,xvda2,r'
>   ,'phy:/dev/Master/Workspace-1,xvdb1,w'
>   ,'phy:/dev/Master/Filesystem,xvdc1,w'
Konrad Rzeszutek Wilk
2011-Jul-29 16:06 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU (only on certain hardware)
On Fri, Jul 29, 2011 at 04:48:15PM +0100, Anthony Wright wrote:
> Ok, the plot thickens...

Please don't top post.

>
> I have installed virtually identical systems on two physical machines -
> identical (and I mean identical) xen, dom0, domU with possibly a

md5sum match?

> slightly different configuration for the domU. They're as close as I can
> get without imaging the disk. On both machines Dom0 starts normally, but
> on one physical machine the DomU starts correctly seeing the two tap:aio
> disks, the two phy disks and the vif, while on the other physical
> machine the DomU only sees the two tap:aio disks.
>
> The two pieces of hardware are quite different one is a 32 bit processor
> from 2000/2001 (this is the one that works), the other is a much more
> modern 64 bit processor about a year old.
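A concrete way to do the comparison being asked for; the paths below are examples only, the point is simply to checksum the same Xen, dom0 kernel and domU images on both boxes and compare the output:

    # run on each machine and compare the results (example paths)
    md5sum /boot/xen-4.1.gz /boot/vmlinuz-3.0 /path/to/domU/vmlinuz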
Anthony Wright
2011-Jul-29 18:40 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
On 29/07/2011 16:55, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 28, 2011 at 05:00:13PM +0100, Anthony Wright wrote:
>> On 28/07/2011 16:46, Todd Deshane wrote:
>>> On Thu, Jul 28, 2011 at 3:36 PM, Anthony Wright <anthony@overnetdata.com> wrote:
>>>> On 28/07/2011 16:01, Todd Deshane wrote:
>>>>> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote:
>>>>>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don't work and I get the following error messages from the DomU kernel during boot:
>>>>>>
>>>>>> [    1.783658] Using IPI No-Shortcut mode
>>>>>> [   11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
>>>>>> [   11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)
> What device does that correspond to (hint: run xl block-list or xm block-list)?
>

The output from block-list is:

Vdev   BE  handle  state  evt-ch  ring-ref  BE-path
51729  0   764     3      10      10        /local/domain/0/backend/vbd/764/51729
51745  0   764     3      11      11        /local/domain/0/backend/vbd/764/51745
51713  0   764     4      8       8         /local/domain/0/backend/qdisk/764/51713
51714  0   764     4      9       9         /local/domain/0/backend/qdisk/764/51714

The two vbds map to two LVM logical volumes in two different volume groups.

On 29/07/2011 17:06, Konrad Rzeszutek Wilk wrote:
>> > I have installed virtually identical systems on two physical machines -
>> > identical (and I mean identical) xen, dom0, domU with possibly a
> md5sum match?

Yes - md5sum match on all the key components, i.e. xen, dom0 kernel, 99.9% of the root filesystem, the domU kernel & 99.9% of the domU filesystem. Where there isn't a precise match is on some of the config files. I don't think these should have any effect, but I will have a go at mirroring the disks (I can't swap disks since one is SATA & the other IDE).

I also was having problems with the vif device, and got a kernel bug report that could potentially relate to it. I've attached two syslogs.

Anthony.
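As an aside, the Vdev numbers in that output map straight back to guest device names under the standard xvd numbering (block major 202, minor = disk << 4 | partition); a small sketch of the arithmetic:

    vdev=51729
    echo $(( vdev >> 8 ))            # 202 -> the xvd block major
    echo $(( (vdev & 0xff) >> 4 ))   # 1   -> second disk, i.e. xvdb
    echo $(( vdev & 0xf ))           # 1   -> partition 1

So 51729 is xvdb1 and 51745 is xvdc1, the two phy: disks from the config, while 51713 and 51714 are xvda1 and xvda2, the two tap:aio disks that come up fine.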
Konrad Rzeszutek Wilk
2011-Jul-29 20:01 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
[Ian, I copied you on this b/c of the netbk issue - read on]> >>>>> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote: > >>>>>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don''t work and I get the following error messages from the DomU kernel during boot: > >>>>>> > >>>>>> [ 1.783658] Using IPI No-Shortcut mode > >>>>>> [ 11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3) > >>>>>> [ 11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)Hm, which version of DomU were these? I wonder if this is related to the ''feature-barrier'' that is not supported with 3.0. Do you see anything in the DomU about the disks? or xen-blkfront? Can you run the guests with ''initcall_debug loglevel=8 debug'' to see if if the blkfront is actually running on those disks. Any idea where the source for those DomU''s is? If it is an issue with ''feature-barrier'' it looks like it can''t handle not having that option visible which it should.> > What device does that correspond to (hint: run xl block-list or xm block-list)? > > > The output from block-list is: > > Vdev BE handle state evt-ch ring-ref BE-path > 51729 0 764 3 10 10 /local/domain/0/backend/vbd/764/51729 > 51745 0 764 3 11 11 /local/domain/0/backend/vbd/764/51745 > 51713 0 764 4 8 8 > /local/domain/0/backend/qdisk/764/51713 > 51714 0 764 4 9 9 > /local/domain/0/backend/qdisk/764/51714 > > The two vbds map to two LVM logical volumes in two different volume groups.qdisk.. ok so it does swap over to QEMU internal AIO path. From the output it looks like the ones that hang are the ''phy'' types? Is that right?> > On 29/07/2011 17:06, Konrad Rzeszutek Wilk wrote: > >> > I have installed virtually identical systems on two physical machines - > >> > identical (and I mean identical) xen, dom0, domU with possibly a > > md5sum match? > Yes - md5sum match on all the key components, i.e. xen, dom0 kernel, > 99.9% of the root filesystem, the domU kernel & 99.9% of the domU > filesystem. Where there isn''t a precise match is on some of the config > files. I don''t think these should have any effect, but I will have a go > at mirroring the disks (I can''t swap disks since one is SATA & the other > IDE). > > I also was having problems with the vif device, and got a kernel bug > report that could potentially relate to it. I''ve attached two syslogs.Yeah, that is bad. I actually see a similar issue if I kill forcibly the guests. I hadn''t yet narrowed it down - .. you are looking to be using 4.1.. But not 4.1.1 right? Can you describe to me how you get the netbk crash?> 2011 Jul 29 07:02:10 kernel: [ 33.242680] vbd vbd-1-51745: 1 mapping ring-ref 11 port 11 > 2011 Jul 29 07:02:10 kernel: [ 33.253038] vif vif-1-0: vif1.0: failed to map tx ring. err=-12 status=-1 > 2011 Jul 29 07:02:10 kernel: [ 33.253065] vif vif-1-0: 1 mapping shared-frames 768/769 port 12 > 2011 Jul 29 07:02:43 kernel: [ 66.103514] vif vif-1-0: 2 reading script > 2011 Jul 29 07:02:43 kernel: [ 66.106265] br-internal: port 1(vif1.0) entering disabled state > 2011 Jul 29 07:02:43 kernel: [ 66.106309] libfcoe_device_notification: NETDEV_UNREGISTER vif1.0 > 2011 Jul 29 07:02:43 kernel: [ 66.106333] br-internal: port 1(vif1.0) entering disabled state > 2011 Jul 29 07:02:43 kernel: [ 66.106372] br-internal: mixed no checksumming and other settings. 
> 2011 Jul 29 07:02:43 kernel: [ 66.114097] ------------[ cut here ]------------ > 2011 Jul 29 07:02:43 kernel: [ 66.114878] kernel BUG at mm/vmalloc.c:2164! > 2011 Jul 29 07:02:43 kernel: [ 66.115058] invalid opcode: 0000 [#1] SMP > 2011 Jul 29 07:02:43 kernel: [ 66.115376] Modules linked in: > 2011 Jul 29 07:02:43 kernel: [ 66.115376] > 2011 Jul 29 07:02:43 kernel: [ 66.115376] Pid: 20, comm: xenwatch Not tainted 3.0.0 #1 MSI MS-7309/MS-7309 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] EIP: 0061:[<c0494bff>] EFLAGS: 00010203 CPU: 1 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] EIP is at free_vm_area+0xf/0x19 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] EAX: 00000000 EBX: cf866480 ECX: 00000018 EDX: 00000000 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] ESI: cfa06800 EDI: d076c400 EBP: cfa06c00 ESP: d0ce7eb4 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] Process xenwatch (pid: 20, ti=d0ce6000 task=d0c55140 task.ti=d0ce6000) > 2011 Jul 29 07:02:43 kernel: [ 66.115376] Stack: > 2011 Jul 29 07:02:43 kernel: [ 66.115376] cfa06c00 c09e87aa fffc6e63 c0c4bd65 d0ce7ecc cfa06844 d0ce7ecc d0ce7ecc > 2011 Jul 29 07:02:43 kernel: [ 66.115376] cfa06c00 cfa06800 d076c400 cfa06c94 c09eace0 d04cd380 00000000 fffffffe > 2011 Jul 29 07:02:43 kernel: [ 66.115376] d0ce7f9c c061fe74 d04cd2e0 d076c420 d076c400 d0ce7f9c c09e9f8c d076c400 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] Call Trace: > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09e87aa>] ? xen_netbk_unmap_frontend_rings+0xbf/0xd3 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0c4bd65>] ? netdev_run_todo+0x1b7/0x1cc > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09eace0>] ? xenvif_disconnect+0xd0/0xe4 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c061fe74>] ? xenbus_rm+0x37/0x3e > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09e9f8c>] ? netback_remove+0x40/0x5d > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c062075d>] ? xenbus_dev_remove+0x2c/0x3d > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06620e6>] ? __device_release_driver+0x42/0x79 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06621ac>] ? device_release_driver+0xf/0x17 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0661818>] ? bus_remove_device+0x75/0x84 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0660693>] ? device_del+0xe6/0x125 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06606da>] ? device_unregister+0x8/0x10 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06205f0>] ? xenbus_dev_changed+0x71/0x129 > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0405394>] ? check_events+0x8/0xc > 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c061f711>] ? xenwatch_thread+0xeb/0x113 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c044792c>] ? wake_up_bit+0x53/0x53 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c061f626>] ? xenbus_thread+0x1cc/0x1cc > 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c0447616>] ? kthread+0x63/0x68 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c04475b3>] ? kthread_worker_fn+0x122/0x122 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c0e0f036>] ? 
kernel_thread_helper+0x6/0x10 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] Code: c1 00 00 00 01 89 f0 e8 a1 ff ff ff 81 6b 08 00 10 00 00 eb 02 31 db 89 d8 5b 5e c3 53 89 c3 8b 40 04 e8 9b ff ff ff 39 d8 74 04 <0f> 0b eb fe 5b e9 73 95 00 00 57 89 d7 56 31 f6 53 89 c3 eb 09 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] EIP: [<c0494bff>] free_vm_area+0xf/0x19 SS:ESP 0069:d0ce7eb4 > 2011 Jul 29 07:02:43 kernel: [ 66.129624] ---[ end trace 7bb110af96f32256 ]---_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
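For anyone trying to reproduce this, the debug options Konrad suggests go on the guest kernel command line, which for a PV domU is most easily done from the guest config; a minimal sketch (only the three options come from the suggestion above, anything else on the line is whatever the guest already uses):

    # domU config file - appended to the guest kernel command line
    extra = "initcall_debug loglevel=8 debug"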
Anthony Wright
2011-Jul-30 17:05 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
On 29/07/2011 21:01, Konrad Rzeszutek Wilk wrote:
> [Ian, I copied you on this b/c of the netbk issue - read on]
>
>>>>>>> On Thu, Jul 28, 2011 at 7:24 AM, Anthony Wright <anthony@overnetdata.com> wrote:
>>>>>>>> I have a 32 bit 3.0 Dom0 kernel running Xen 4.1. I am trying to run a 32 bit PV DomU with two tap:aio disks, two phy disks & 1 vif. The two tap:aio disks are working fine, but the phy disks and the vif don't work and I get the following error messages from the DomU kernel during boot:
>>>>>>>>
>>>>>>>> [    1.783658] Using IPI No-Shortcut mode
>>>>>>>> [   11.880061] XENBUS: Timeout connecting to device: device/vbd/51729 (state 3)
>>>>>>>> [   11.880072] XENBUS: Timeout connecting to device: device/vbd/51745 (state 3)
> Hm, which version of DomU were these? I wonder if this is related to the 'feature-barrier'
> that is not supported with 3.0. Do you see anything in the DomU about the disks?
> or xen-blkfront? Can you run the guests with 'initcall_debug loglevel=8 debug' to see
> if the blkfront is actually running on those disks.

I have attached the domU console output with these options set.

I have also spent a fair amount of time trying to narrow down the conditions that cause it, with lots of hardware switching & disk imaging. The conclusion I came to is that it's not hardware related; there's a subtle interaction going on with LVM that's causing the problem, but I'm struggling to work out how to narrow it down any further than that.

I started with a setup that works (Machine 1 with HDD 1, IDE) and a setup that didn't (Machine 2 with HDD 2, SATA). Machine 2 has an IDE port, so I unplugged HDD 2 and put HDD 1 in Machine 2, and that setup worked, thus excluding most of the hardware. Next I imaged HDD 3 (SATA) from HDD 1 (IDE), unplugged HDD 1 and put HDD 3 in Machine 2, and that setup also worked, thus excluding an IDE/SATA issue and giving me a disk I could safely play with.

The disks are organised into two partitions: partition 1 is for Dom0, and partition 2 is an LVM volume group used for the DomUs. One LV (called Main) in this volume group is used by Dom0 to hold the DomU kernels, config information and other static data & executables; the rest of the VG is issued as LVs to the various DomUs as needed, with a fair amount of free space left in the VG. I took the Main LV from HDD 2 (didn't work) and imaged it onto HDD 3, and by judicious LV renaming booted against this image - and the setup failed. Great, I thought - it looks like a very subtle config issue. Next I created a third LV, this time imaged from the Main LV that worked, giving me three Main LVs (I called them Main-Works, Main-Broken & Main-Testing), and I simply used lvrename to select the one I wanted as active. However, now I couldn't get the setup to work with any of these three Main LVs, including the one that originally worked. Removing the LVs I had recently created and going back to the original Main LV, the setup started working again.

I'm going to try an up to date version of LVM (the one I'm using is a little out of date) and see if that makes any difference, but the version I have at the moment has worked without problem in the past.

> Any idea where the source for those DomU's is? If it is an issue with 'feature-barrier'
> it looks like it can't handle not having that option visible which it should.
>

We build the DomUs with a tightly controlled internal build system, so I have a full manifest for the DomU.

>>> What device does that correspond to (hint: run xl block-list or xm block-list)?
>>> >> The output from block-list is: >> >> Vdev BE handle state evt-ch ring-ref BE-path >> 51729 0 764 3 10 10 /local/domain/0/backend/vbd/764/51729 >> 51745 0 764 3 11 11 /local/domain/0/backend/vbd/764/51745 >> 51713 0 764 4 8 8 >> /local/domain/0/backend/qdisk/764/51713 >> 51714 0 764 4 9 9 >> /local/domain/0/backend/qdisk/764/51714 >> >> The two vbds map to two LVM logical volumes in two different volume groups. > qdisk.. ok so it does swap over to QEMU internal AIO path. From the output it looks > like the ones that hang are the ''phy'' types? Is that right? >The ones that hang are phy and are the first two, with vdev numbers of 51729 & 51745.>> On 29/07/2011 17:06, Konrad Rzeszutek Wilk wrote: >>>>> I have installed virtually identical systems on two physical machines - >>>>> identical (and I mean identical) xen, dom0, domU with possibly a >>> md5sum match? >> Yes - md5sum match on all the key components, i.e. xen, dom0 kernel, >> 99.9% of the root filesystem, the domU kernel & 99.9% of the domU >> filesystem. Where there isn''t a precise match is on some of the config >> files. I don''t think these should have any effect, but I will have a go >> at mirroring the disks (I can''t swap disks since one is SATA & the other >> IDE). >> >> I also was having problems with the vif device, and got a kernel bug >> report that could potentially relate to it. I''ve attached two syslogs. > Yeah, that is bad. I actually see a similar issue if I kill forcibly the guests. > I hadn''t yet narrowed it down - .. you are looking to be using 4.1.. But not > 4.1.1 right?I started with 4.1.0, but upgraded to 4.1.1 in the hope that it might fix the problem. The vif timeouts have happened with both versions, but I think the kernel errors have only been happening since I upgraded to xen 4.1.1, however I''m not sure. I''ve also had a number of kernel Oops in place of the kernel errors as well.> Can you describe to me how you get the netbk crash?The DomU when it realises it has a problem with one of it''s disks issues a warning message and then shuts itself down. The netbk crash happens partway through that shutdown process, but not when the DomU is touching the network (as far as I know) - it''s issuing SIG KILLs to all processes. It''s always at the same point in the shutdown process, but the shutdown process pauses at that point for quite a while and since it doesn''t touch the network, I''m not convinced it''s triggered by something that DomU is doing. The netbk crash only happens the first time the DomU starts up & shuts down, it doesn''t happen on subsequent DomU startup/shutdown cycles. It also doesn''t happen if the disks work correctly. I do have a setup that consistently produces it.>> 2011 Jul 29 07:02:10 kernel: [ 33.242680] vbd vbd-1-51745: 1 mapping ring-ref 11 port 11 >> 2011 Jul 29 07:02:10 kernel: [ 33.253038] vif vif-1-0: vif1.0: failed to map tx ring. err=-12 status=-1 >> 2011 Jul 29 07:02:10 kernel: [ 33.253065] vif vif-1-0: 1 mapping shared-frames 768/769 port 12 >> 2011 Jul 29 07:02:43 kernel: [ 66.103514] vif vif-1-0: 2 reading script >> 2011 Jul 29 07:02:43 kernel: [ 66.106265] br-internal: port 1(vif1.0) entering disabled state >> 2011 Jul 29 07:02:43 kernel: [ 66.106309] libfcoe_device_notification: NETDEV_UNREGISTER vif1.0 >> 2011 Jul 29 07:02:43 kernel: [ 66.106333] br-internal: port 1(vif1.0) entering disabled state >> 2011 Jul 29 07:02:43 kernel: [ 66.106372] br-internal: mixed no checksumming and other settings. 
>> 2011 Jul 29 07:02:43 kernel: [ 66.114097] ------------[ cut here ]------------ >> 2011 Jul 29 07:02:43 kernel: [ 66.114878] kernel BUG at mm/vmalloc.c:2164! >> 2011 Jul 29 07:02:43 kernel: [ 66.115058] invalid opcode: 0000 [#1] SMP >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Modules linked in: >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Pid: 20, comm: xenwatch Not tainted 3.0.0 #1 MSI MS-7309/MS-7309 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] EIP: 0061:[<c0494bff>] EFLAGS: 00010203 CPU: 1 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] EIP is at free_vm_area+0xf/0x19 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] EAX: 00000000 EBX: cf866480 ECX: 00000018 EDX: 00000000 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] ESI: cfa06800 EDI: d076c400 EBP: cfa06c00 ESP: d0ce7eb4 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Process xenwatch (pid: 20, ti=d0ce6000 task=d0c55140 task.ti=d0ce6000) >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Stack: >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] cfa06c00 c09e87aa fffc6e63 c0c4bd65 d0ce7ecc cfa06844 d0ce7ecc d0ce7ecc >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] cfa06c00 cfa06800 d076c400 cfa06c94 c09eace0 d04cd380 00000000 fffffffe >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] d0ce7f9c c061fe74 d04cd2e0 d076c420 d076c400 d0ce7f9c c09e9f8c d076c400 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] Call Trace: >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09e87aa>] ? xen_netbk_unmap_frontend_rings+0xbf/0xd3 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0c4bd65>] ? netdev_run_todo+0x1b7/0x1cc >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09eace0>] ? xenvif_disconnect+0xd0/0xe4 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c061fe74>] ? xenbus_rm+0x37/0x3e >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c09e9f8c>] ? netback_remove+0x40/0x5d >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c062075d>] ? xenbus_dev_remove+0x2c/0x3d >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06620e6>] ? __device_release_driver+0x42/0x79 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06621ac>] ? device_release_driver+0xf/0x17 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0661818>] ? bus_remove_device+0x75/0x84 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0660693>] ? device_del+0xe6/0x125 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06606da>] ? device_unregister+0x8/0x10 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c06205f0>] ? xenbus_dev_changed+0x71/0x129 >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c0405394>] ? check_events+0x8/0xc >> 2011 Jul 29 07:02:43 kernel: [ 66.115376] [<c061f711>] ? xenwatch_thread+0xeb/0x113 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c044792c>] ? wake_up_bit+0x53/0x53 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c061f626>] ? xenbus_thread+0x1cc/0x1cc >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c0447616>] ? kthread+0x63/0x68 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c04475b3>] ? kthread_worker_fn+0x122/0x122 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] [<c0e0f036>] ? 
kernel_thread_helper+0x6/0x10 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] Code: c1 00 00 00 01 89 f0 e8 a1 ff ff ff 81 6b 08 00 10 00 00 eb 02 31 db 89 d8 5b 5e c3 53 89 c3 8b 40 04 e8 9b ff ff ff 39 d8 74 04 <0f> 0b eb fe 5b e9 73 95 00 00 57 89 d7 56 31 f6 53 89 c3 eb 09 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] EIP: [<c0494bff>] free_vm_area+0xf/0x19 SS:ESP 0069:d0ce7eb4 >> 2011 Jul 29 07:02:43 kernel: [ 66.129624] ---[ end trace 7bb110af96f32256 ]---_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-01 11:03 UTC
Re: [Xen-devel] phy disks and vifs timing out in DomU
> > Hm, which version of DomU were these? I wonder if this is related to the 'feature-barrier' that is not supported with 3.0. Do you see anything in the DomU about the disks? or xen-blkfront? Can you run the guests with 'initcall_debug loglevel=8 debug' to see if the blkfront is actually running on those disks.
>
> I have attached the domU console output with these options set.
>
> I have also spent a fair amount of time trying to narrow down the conditions that cause it, with lots of hardware switching & disk imaging. The conclusion I came to was that it's not hardware related, but there's a subtle interaction going on with LVM that's causing the problem, but I'm struggling to work out how to narrow it down any further than that.
>
> I started with a setup that works: Machine 1 with HDD 1 (IDE), and a setup that didn't: Machine 2 with HDD 2 (SATA). Machine 2 has an IDE port so I unplugged HDD 2 and put HDD 1 in Machine 2 and that setup worked, thus excluding most of the hardware. Next I imaged HDD 3 (SATA) from HDD 1 (IDE), unplugged HDD 1 and put HDD 3 in Machine 2, and that setup worked, thus excluding an IDE/SATA issue, and giving me a disk I could safely play with. The disks are organised into two partitions: partition 1 is for Dom0, partition 2 is an LVM volume group used for the DomUs. One LV (called Main) in this volume group is used by Dom0 to hold the DomU kernels, config information and other static data & executables; the rest of the VG is issued as LVs to the various DomUs as needed, with a fair amount of free space left in the VG. I took the Main LV from HDD 2 (didn't work) and imaged it onto HDD 3, and by judicious LV renaming booted against this image, and the setup failed - great, I thought, it looks like a very subtle config issue. Next I created a third LV, this time imaged from the Main LV that worked, giving me three Main LVs (I called them Main-Works, Main-Broken & Main-Testing), and I simply used lvrename to select the one I wanted as active. However now I couldn't get the setup to work with any of these three Main LVs, including the one that originally worked. Removing the LVs I had recently created, and going back to the original Main LV, the setup started working again.
>
> I'm going to try an up to date version of LVM (the one I'm using is a little out of date), and see if that makes any difference, but the version I have at the moment has worked without problem in the past.

I've managed to isolate it a little tighter, but it's very strange. I've also updated to the latest version of LVM, but it makes no difference.

I have a system with two partitions, the second of which is an LVM volume group. I have a VM which has one vif, two tap:aio disks based on files in an LV within the volume group, and two phy disks based on LVs within the volume group. I have managed to get to the situation where I can boot the physical machine and the VM starts correctly. If however I create a new LV, of any size and with any name, and restart the physical machine, the VM fails to start correctly: the two phy disks time out, the vif times out, and I get a kernel bug 90% of the time and a kernel oops 10% of the time. If I remove this new LV and reboot the physical machine, the VM starts correctly again.

There is no reason within my code that would cause the new LV to have an effect on the VM, but somehow it does.

Anthony.
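To make the reproduction steps concrete, something along these lines appears to be all it takes; a sketch based on the description above, with the volume group name taken from the disk config quoted earlier and an arbitrary test LV:

    # domU currently boots fine
    lvcreate -L 1G -n ScratchTest Master   # any name, any size, in the same VG
    reboot                                 # reboot the physical machine
    # -> the domU's two phy: disks and the vif time out, usually followed by the dom0 kernel BUG
    lvremove /dev/Master/ScratchTest
    reboot
    # -> the domU starts correctly again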
Konrad Rzeszutek Wilk
2011-Aug-03 15:28 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote:> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > the vga-support patch backported). I can''t get my DomU''s to work due to > the phy disks and vifs timing out in DomU and looking through my logs > this morning I''m getting a consistent kernel bug report with xen > mentioned at the top of the stack trace and vifdisconnect mentioned onYikes! Ian any ideas what to try? Anthony, can you compile the kernel with debug=y and when this happens see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which should dump the grants in use.. that might help a bit.> 2011 Jul 29 07:18:50 kernel: [ 33.213500] vif vif-1-0: vif1.0: failed to map tx ring. err=-12 status=-1 > 2011 Jul 29 07:18:50 kernel: [ 33.213516] vif vif-1-0: 1 mapping shared-frames 768/769 port 12 > 2011 Jul 29 07:19:01 /usr/sbin/cron[3719]: (root) CMD (/usr/monitor/monitor) > 2011 Jul 29 07:19:23 kernel: [ 66.043164] vif vif-1-0: 2 reading script > 2011 Jul 29 07:19:23 kernel: [ 66.045984] br-internal: port 1(vif1.0) entering disabled state > 2011 Jul 29 07:19:23 kernel: [ 66.046044] libfcoe_device_notification: NETDEV_UNREGISTER vif1.0 > 2011 Jul 29 07:19:23 kernel: [ 66.046082] br-internal: port 1(vif1.0) entering disabled state > 2011 Jul 29 07:19:23 kernel: [ 66.046279] br-internal: mixed no checksumming and other settings. > 2011 Jul 29 07:19:23 kernel: [ 66.050077] ------------[ cut here ]------------ > 2011 Jul 29 07:19:23 kernel: [ 66.050858] kernel BUG at mm/vmalloc.c:2164! > 2011 Jul 29 07:19:23 kernel: [ 66.051034] invalid opcode: 0000 [#1] SMP > 2011 Jul 29 07:19:23 kernel: [ 66.051034] Modules linked in: > 2011 Jul 29 07:19:23 kernel: [ 66.051034] > 2011 Jul 29 07:19:23 kernel: [ 66.051034] Pid: 20, comm: xenwatch Not tainted 3.0.0 #1 MSI MS-7309/MS-7309 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] EIP: 0061:[<c0494bff>] EFLAGS: 00010207 CPU: 1 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] EIP is at free_vm_area+0xf/0x19 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] EAX: 00000000 EBX: d0799700 ECX: 00000018 EDX: 00000000 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] ESI: cf9e5800 EDI: d051a600 EBP: cf9e5c00 ESP: d0ce7eb4 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] Process xenwatch (pid: 20, ti=d0ce6000 task=d0c55140 task.ti=d0ce6000) > 2011 Jul 29 07:19:23 kernel: [ 66.051034] Stack: > 2011 Jul 29 07:19:23 kernel: [ 66.051034] cf9e5c00 c09e87aa fffc6e23 c0c4bd65 d0ce7ecc cf9e5844 d0ce7ecc d0ce7ecc > 2011 Jul 29 07:19:23 kernel: [ 66.051034] cf9e5c00 cf9e5800 d051a600 cf9e5c94 c09eace0 cffdbfe0 00000000 fffffffe > 2011 Jul 29 07:19:23 kernel: [ 66.051034] d0ce7f9c c061fe74 cffdbe60 d051a620 d051a600 d0ce7f9c c09e9f8c d051a600 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] Call Trace: > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c09e87aa>] ? xen_netbk_unmap_frontend_rings+0xbf/0xd3 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0c4bd65>] ? netdev_run_todo+0x1b7/0x1cc > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c09eace0>] ? xenvif_disconnect+0xd0/0xe4 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c061fe74>] ? xenbus_rm+0x37/0x3e > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c09e9f8c>] ? netback_remove+0x40/0x5d > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c062075d>] ? xenbus_dev_remove+0x2c/0x3d > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06620e6>] ? 
__device_release_driver+0x42/0x79 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06621ac>] ? device_release_driver+0xf/0x17 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0661818>] ? bus_remove_device+0x75/0x84 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0660693>] ? device_del+0xe6/0x125 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06606da>] ? device_unregister+0x8/0x10 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06205f0>] ? xenbus_dev_changed+0x71/0x129 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0405394>] ? check_events+0x8/0xc > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c061f711>] ? xenwatch_thread+0xeb/0x113 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c044792c>] ? wake_up_bit+0x53/0x53 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c061f626>] ? xenbus_thread+0x1cc/0x1cc > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0447616>] ? kthread+0x63/0x68 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c04475b3>] ? kthread_worker_fn+0x122/0x122 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0e0f036>] ? kernel_thread_helper+0x6/0x10 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] Code: c1 00 00 00 01 89 f0 e8 a1 ff ff ff 81 6b 08 00 10 00 00 eb 02 31 db 89 d8 5b 5e c3 53 89 c3 8b 40 04 e8 9b ff ff ff 39 d8 74 04 <0f> 0b eb fe 5b e9 73 95 00 00 57 89 d7 56 31 f6 53 89 c3 eb 09 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] EIP: [<c0494bff>] free_vm_area+0xf/0x19 SS:ESP 0069:d0ce7eb4 > 2011 Jul 29 07:19:23 kernel: [ 66.051034] ---[ end trace b47a8d30fa29735c ]--- > 2011 Jul 29 07:19:23 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/qdisk/1/51714 > 2011 Jul 29 07:19:23 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/qdisk/1/51713_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
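For reference, the debug information being asked for is gathered from dom0 roughly like this, once the timeouts have been reproduced:

    xl dmesg            # dump the hypervisor console ring
    xl debug-keys g     # ask Xen to dump grant table usage
    xl dmesg            # the debug-key output lands on the hypervisor console, so read it back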
Konrad Rzeszutek Wilk
2011-Aug-09 16:35 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Wed, Aug 03, 2011 at 11:28:41AM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote:
> > I've just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with
> > the vga-support patch backported). I can't get my DomU's to work due to
> > the phy disks and vifs timing out in DomU and looking through my logs
> > this morning I'm getting a consistent kernel bug report with xen
> > mentioned at the top of the stack trace and vifdisconnect mentioned on
>
> Yikes! Ian any ideas what to try?

Actually, the patch that Stefano posted might be worth trying. See the attached file.
Anthony Wright
2011-Aug-19 10:22 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote:> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: >> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with >> the vga-support patch backported). I can''t get my DomU''s to work due to >> the phy disks and vifs timing out in DomU and looking through my logs >> this morning I''m getting a consistent kernel bug report with xen >> mentioned at the top of the stack trace and vifdisconnect mentioned on > Yikes! Ian any ideas what to try? > > Anthony, can you compile the kernel with debug=y and when this happens > see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > should dump the grants in use.. that might help a bit.I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other config values appeared at this point, and I took defaults for them). The output from /var/log/messages & ''xl dmesg'' is attached. There was no output from ''xl debug-keys g''. I''m going to try 3.0.3 for monday.>> 2011 Jul 29 07:18:50 kernel: [ 33.213500] vif vif-1-0: vif1.0: failed to map tx ring. err=-12 status=-1 >> 2011 Jul 29 07:18:50 kernel: [ 33.213516] vif vif-1-0: 1 mapping shared-frames 768/769 port 12 >> 2011 Jul 29 07:19:01 /usr/sbin/cron[3719]: (root) CMD (/usr/monitor/monitor) >> 2011 Jul 29 07:19:23 kernel: [ 66.043164] vif vif-1-0: 2 reading script >> 2011 Jul 29 07:19:23 kernel: [ 66.045984] br-internal: port 1(vif1.0) entering disabled state >> 2011 Jul 29 07:19:23 kernel: [ 66.046044] libfcoe_device_notification: NETDEV_UNREGISTER vif1.0 >> 2011 Jul 29 07:19:23 kernel: [ 66.046082] br-internal: port 1(vif1.0) entering disabled state >> 2011 Jul 29 07:19:23 kernel: [ 66.046279] br-internal: mixed no checksumming and other settings. >> 2011 Jul 29 07:19:23 kernel: [ 66.050077] ------------[ cut here ]------------ >> 2011 Jul 29 07:19:23 kernel: [ 66.050858] kernel BUG at mm/vmalloc.c:2164! >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] invalid opcode: 0000 [#1] SMP >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] Modules linked in: >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] Pid: 20, comm: xenwatch Not tainted 3.0.0 #1 MSI MS-7309/MS-7309 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] EIP: 0061:[<c0494bff>] EFLAGS: 00010207 CPU: 1 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] EIP is at free_vm_area+0xf/0x19 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] EAX: 00000000 EBX: d0799700 ECX: 00000018 EDX: 00000000 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] ESI: cf9e5800 EDI: d051a600 EBP: cf9e5c00 ESP: d0ce7eb4 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] Process xenwatch (pid: 20, ti=d0ce6000 task=d0c55140 task.ti=d0ce6000) >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] Stack: >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] cf9e5c00 c09e87aa fffc6e23 c0c4bd65 d0ce7ecc cf9e5844 d0ce7ecc d0ce7ecc >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] cf9e5c00 cf9e5800 d051a600 cf9e5c94 c09eace0 cffdbfe0 00000000 fffffffe >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] d0ce7f9c c061fe74 cffdbe60 d051a620 d051a600 d0ce7f9c c09e9f8c d051a600 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] Call Trace: >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c09e87aa>] ? xen_netbk_unmap_frontend_rings+0xbf/0xd3 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0c4bd65>] ? netdev_run_todo+0x1b7/0x1cc >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c09eace0>] ? 
xenvif_disconnect+0xd0/0xe4 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c061fe74>] ? xenbus_rm+0x37/0x3e >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c09e9f8c>] ? netback_remove+0x40/0x5d >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c062075d>] ? xenbus_dev_remove+0x2c/0x3d >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06620e6>] ? __device_release_driver+0x42/0x79 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06621ac>] ? device_release_driver+0xf/0x17 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0661818>] ? bus_remove_device+0x75/0x84 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0660693>] ? device_del+0xe6/0x125 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06606da>] ? device_unregister+0x8/0x10 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c06205f0>] ? xenbus_dev_changed+0x71/0x129 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0405394>] ? check_events+0x8/0xc >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c061f711>] ? xenwatch_thread+0xeb/0x113 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c044792c>] ? wake_up_bit+0x53/0x53 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c061f626>] ? xenbus_thread+0x1cc/0x1cc >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0447616>] ? kthread+0x63/0x68 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c04475b3>] ? kthread_worker_fn+0x122/0x122 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] [<c0e0f036>] ? kernel_thread_helper+0x6/0x10 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] Code: c1 00 00 00 01 89 f0 e8 a1 ff ff ff 81 6b 08 00 10 00 00 eb 02 31 db 89 d8 5b 5e c3 53 89 c3 8b 40 04 e8 9b ff ff ff 39 d8 74 04 <0f> 0b eb fe 5b e9 73 95 00 00 57 89 d7 56 31 f6 53 89 c3 eb 09 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] EIP: [<c0494bff>] free_vm_area+0xf/0x19 SS:ESP 0069:d0ce7eb4 >> 2011 Jul 29 07:19:23 kernel: [ 66.051034] ---[ end trace b47a8d30fa29735c ]--- >> 2011 Jul 29 07:19:23 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/qdisk/1/51714 >> 2011 Jul 29 07:19:23 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/qdisk/1/51713_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-19 12:56 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote:> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: > > On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: > >> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > >> the vga-support patch backported). I can''t get my DomU''s to work due to > >> the phy disks and vifs timing out in DomU and looking through my logs > >> this morning I''m getting a consistent kernel bug report with xen > >> mentioned at the top of the stack trace and vifdisconnect mentioned on > > Yikes! Ian any ideas what to try? > > > > Anthony, can you compile the kernel with debug=y and when this happens > > see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > > should dump the grants in use.. that might help a bit. > I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other > config values appeared at this point, and I took defaults for them). > > The output from /var/log/messages & ''xl dmesg'' is attached. There was no > output from ''xl debug-keys g''.Ok, so I am hitting this too - I was hoping that the patch from Stefano would have fixed the issue, but sadly it did not. Let me (I am traveling right now) see if I can come up with an internim solution until Ian comes with the right fix. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-22 11:02 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote:> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: >>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with >>>> the vga-support patch backported). I can''t get my DomU''s to work due to >>>> the phy disks and vifs timing out in DomU and looking through my logs >>>> this morning I''m getting a consistent kernel bug report with xen >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on >>> Yikes! Ian any ideas what to try? >>> >>> Anthony, can you compile the kernel with debug=y and when this happens >>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which >>> should dump the grants in use.. that might help a bit. >> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other >> config values appeared at this point, and I took defaults for them). >> >> The output from /var/log/messages & ''xl dmesg'' is attached. There was no >> output from ''xl debug-keys g''. > Ok, so I am hitting this too - I was hoping that the patch from Stefano > would have fixed the issue, but sadly it did not. > > Let me (I am traveling right now) see if I can come up with an internim > solution until Ian comes with the right fix. >I''ve just tested with a vanilla 3.0.3 kernel and I get exactly the same result. Anthony. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-25 20:31 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote:> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: >>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with >>>> the vga-support patch backported). I can''t get my DomU''s to work due to >>>> the phy disks and vifs timing out in DomU and looking through my logs >>>> this morning I''m getting a consistent kernel bug report with xen >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on >>> Yikes! Ian any ideas what to try? >>> >>> Anthony, can you compile the kernel with debug=y and when this happens >>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which >>> should dump the grants in use.. that might help a bit. >> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other >> config values appeared at this point, and I took defaults for them). >> >> The output from /var/log/messages & ''xl dmesg'' is attached. There was no >> output from ''xl debug-keys g''. > Ok, so I am hitting this too - I was hoping that the patch from Stefano > would have fixed the issue, but sadly it did not. > > Let me (I am traveling right now) see if I can come up with an internim > solution until Ian comes with the right fix. >Hi Konrad - any progress on this - it''s a bit of a show stopper for me. One thing to add is that I''ve got a qemu-dm process running for a para-virtualised DomU, which I''m told shouldn''t happen. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-25 21:11 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote:> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: >>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with >>>> the vga-support patch backported). I can''t get my DomU''s to work due to >>>> the phy disks and vifs timing out in DomU and looking through my logs >>>> this morning I''m getting a consistent kernel bug report with xen >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on >>> Yikes! Ian any ideas what to try? >>> >>> Anthony, can you compile the kernel with debug=y and when this happens >>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which >>> should dump the grants in use.. that might help a bit. >> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other >> config values appeared at this point, and I took defaults for them). >> >> The output from /var/log/messages & ''xl dmesg'' is attached. There was no >> output from ''xl debug-keys g''. > Ok, so I am hitting this too - I was hoping that the patch from Stefano > would have fixed the issue, but sadly it did not. > > Let me (I am traveling right now) see if I can come up with an internim > solution until Ian comes with the right fix. >On different hardware with the same software I''m also getting problems starting DomUs, but this time the error is different. I''ve attached a copy of the xl console output, but basically the server hang at "Mount-cache hash table entries: 512". Again the VM is paravirtualised, and again I get a qemu-dm process for it. The references to this message are normally related to memory issues, but the server has only 1000M of ram, so can''t see it causing too much of a problem. Is this related to the other problems I''m seeing or completely separate? thanks, Anthony _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Sander Eikelenboom
2011-Aug-26 07:10 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
Hello Anthony, Perhaps you could try running with xend instead of the xl toolstack ? Since you have also changed the hypervisor version to 4.1.1, i think you were previously using xend instead of xl ? So in theory it could also be a problem in the xl toolstack causing the extra qemu processes when building the domain. -- Sander Thursday, August 25, 2011, 11:11:44 PM, you wrote:> On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: >> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: >>> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: >>>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: >>>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with >>>>> the vga-support patch backported). I can''t get my DomU''s to work due to >>>>> the phy disks and vifs timing out in DomU and looking through my logs >>>>> this morning I''m getting a consistent kernel bug report with xen >>>>> mentioned at the top of the stack trace and vifdisconnect mentioned on >>>> Yikes! Ian any ideas what to try? >>>> >>>> Anthony, can you compile the kernel with debug=y and when this happens >>>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which >>>> should dump the grants in use.. that might help a bit. >>> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other >>> config values appeared at this point, and I took defaults for them). >>> >>> The output from /var/log/messages & ''xl dmesg'' is attached. There was no >>> output from ''xl debug-keys g''. >> Ok, so I am hitting this too - I was hoping that the patch from Stefano >> would have fixed the issue, but sadly it did not. >> >> Let me (I am traveling right now) see if I can come up with an internim >> solution until Ian comes with the right fix. >> > On different hardware with the same software I''m also getting problems > starting DomUs, but this time the error is different. I''ve attached a > copy of the xl console output, but basically the server hang at > "Mount-cache hash table entries: 512". Again the VM is paravirtualised, > and again I get a qemu-dm process for it.> The references to this message are normally related to memory issues, > but the server has only 1000M of ram, so can''t see it causing too much > of a problem.> Is this related to the other problems I''m seeing or completely separate?> thanks,> Anthony-- Best regards, Sander mailto:linux@eikelenboom.it _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Pasi Kärkkäinen
2011-Aug-26 11:23 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Fri, Aug 26, 2011 at 09:10:55AM +0200, Sander Eikelenboom wrote:> Hello Anthony, > > Perhaps you could try running with xend instead of the xl toolstack ? > Since you have also changed the hypervisor version to 4.1.1, i think you were previously using xend instead of xl ? > > So in theory it could also be a problem in the xl toolstack causing the extra qemu processes when building the domain. >Are you using pvfb for the domU? If yes, pvfb needs qemu-dm for the VNC server.. -- Pasi> > Sander > > Thursday, August 25, 2011, 11:11:44 PM, you wrote: > > > On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: > >> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: > >>> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: > >>>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: > >>>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > >>>>> the vga-support patch backported). I can''t get my DomU''s to work due to > >>>>> the phy disks and vifs timing out in DomU and looking through my logs > >>>>> this morning I''m getting a consistent kernel bug report with xen > >>>>> mentioned at the top of the stack trace and vifdisconnect mentioned on > >>>> Yikes! Ian any ideas what to try? > >>>> > >>>> Anthony, can you compile the kernel with debug=y and when this happens > >>>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > >>>> should dump the grants in use.. that might help a bit. > >>> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other > >>> config values appeared at this point, and I took defaults for them). > >>> > >>> The output from /var/log/messages & ''xl dmesg'' is attached. There was no > >>> output from ''xl debug-keys g''. > >> Ok, so I am hitting this too - I was hoping that the patch from Stefano > >> would have fixed the issue, but sadly it did not. > >> > >> Let me (I am traveling right now) see if I can come up with an internim > >> solution until Ian comes with the right fix. > >> > > On different hardware with the same software I''m also getting problems > > starting DomUs, but this time the error is different. I''ve attached a > > copy of the xl console output, but basically the server hang at > > "Mount-cache hash table entries: 512". Again the VM is paravirtualised, > > and again I get a qemu-dm process for it. > > > The references to this message are normally related to memory issues, > > but the server has only 1000M of ram, so can''t see it causing too much > > of a problem. > > > Is this related to the other problems I''m seeing or completely separate? > > > thanks, > > > Anthony > > > > > > -- > Best regards, > Sander mailto:linux@eikelenboom.it > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-26 12:15 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 26/08/2011 13:16, Stefano Stabellini wrote:> On Thu, 25 Aug 2011, Anthony Wright wrote: >> On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: >>> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: >>>> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: >>>>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: >>>>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with >>>>>> the vga-support patch backported). I can''t get my DomU''s to work due to >>>>>> the phy disks and vifs timing out in DomU and looking through my logs >>>>>> this morning I''m getting a consistent kernel bug report with xen >>>>>> mentioned at the top of the stack trace and vifdisconnect mentioned on >>>>> Yikes! Ian any ideas what to try? >>>>> >>>>> Anthony, can you compile the kernel with debug=y and when this happens >>>>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which >>>>> should dump the grants in use.. that might help a bit. >>>> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other >>>> config values appeared at this point, and I took defaults for them). >>>> >>>> The output from /var/log/messages & ''xl dmesg'' is attached. There was no >>>> output from ''xl debug-keys g''. >>> Ok, so I am hitting this too - I was hoping that the patch from Stefano >>> would have fixed the issue, but sadly it did not. >>> >>> Let me (I am traveling right now) see if I can come up with an internim >>> solution until Ian comes with the right fix. >>> >> On different hardware with the same software I''m also getting problems >> starting DomUs, but this time the error is different. I''ve attached a >> copy of the xl console output, but basically the server hang at >> "Mount-cache hash table entries: 512". Again the VM is paravirtualised, >> and again I get a qemu-dm process for it. >> >> The references to this message are normally related to memory issues, >> but the server has only 1000M of ram, so can''t see it causing too much >> of a problem. >> >> Is this related to the other problems I''m seeing or completely separate? > Could you please post your VM config file?Attached are two VM config files. The file xen-config-A is the xen server that fails at the Mount-cache line. The file xen-config-B is the xen server that timeout attaching to some of the xvds and the vif. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Aug-26 12:16 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 25 Aug 2011, Anthony Wright wrote:> On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: > > On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: > >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: > >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: > >>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > >>>> the vga-support patch backported). I can''t get my DomU''s to work due to > >>>> the phy disks and vifs timing out in DomU and looking through my logs > >>>> this morning I''m getting a consistent kernel bug report with xen > >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on > >>> Yikes! Ian any ideas what to try? > >>> > >>> Anthony, can you compile the kernel with debug=y and when this happens > >>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > >>> should dump the grants in use.. that might help a bit. > >> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other > >> config values appeared at this point, and I took defaults for them). > >> > >> The output from /var/log/messages & ''xl dmesg'' is attached. There was no > >> output from ''xl debug-keys g''. > > Ok, so I am hitting this too - I was hoping that the patch from Stefano > > would have fixed the issue, but sadly it did not. > > > > Let me (I am traveling right now) see if I can come up with an internim > > solution until Ian comes with the right fix. > > > On different hardware with the same software I''m also getting problems > starting DomUs, but this time the error is different. I''ve attached a > copy of the xl console output, but basically the server hang at > "Mount-cache hash table entries: 512". Again the VM is paravirtualised, > and again I get a qemu-dm process for it. > > The references to this message are normally related to memory issues, > but the server has only 1000M of ram, so can''t see it causing too much > of a problem. > > Is this related to the other problems I''m seeing or completely separate?Could you please post your VM config file? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Aug-26 12:32 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Fri, 26 Aug 2011, Anthony Wright wrote:> On 26/08/2011 13:16, Stefano Stabellini wrote: > > On Thu, 25 Aug 2011, Anthony Wright wrote: > >> On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: > >>> On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: > >>>> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: > >>>>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: > >>>>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > >>>>>> the vga-support patch backported). I can''t get my DomU''s to work due to > >>>>>> the phy disks and vifs timing out in DomU and looking through my logs > >>>>>> this morning I''m getting a consistent kernel bug report with xen > >>>>>> mentioned at the top of the stack trace and vifdisconnect mentioned on > >>>>> Yikes! Ian any ideas what to try? > >>>>> > >>>>> Anthony, can you compile the kernel with debug=y and when this happens > >>>>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > >>>>> should dump the grants in use.. that might help a bit. > >>>> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other > >>>> config values appeared at this point, and I took defaults for them). > >>>> > >>>> The output from /var/log/messages & ''xl dmesg'' is attached. There was no > >>>> output from ''xl debug-keys g''. > >>> Ok, so I am hitting this too - I was hoping that the patch from Stefano > >>> would have fixed the issue, but sadly it did not. > >>> > >>> Let me (I am traveling right now) see if I can come up with an internim > >>> solution until Ian comes with the right fix. > >>> > >> On different hardware with the same software I''m also getting problems > >> starting DomUs, but this time the error is different. I''ve attached a > >> copy of the xl console output, but basically the server hang at > >> "Mount-cache hash table entries: 512". Again the VM is paravirtualised, > >> and again I get a qemu-dm process for it. > >> > >> The references to this message are normally related to memory issues, > >> but the server has only 1000M of ram, so can''t see it causing too much > >> of a problem. > >> > >> Is this related to the other problems I''m seeing or completely separate? > > Could you please post your VM config file? > Attached are two VM config files. The file xen-config-A is the xen > server that fails at the Mount-cache line. The file xen-config-B is the > xen server that timeout attaching to some of the xvds and the vif. >can you try to use losetup to setup a loop device for each of the tap:aio files you have and then specify phy:/dev/loopN in the config file rather than tap:aio? For example I mean: losetup /dev/loop0 /workspace/agent/appliances/XenFileServer-3.20/rootfs then in the config file: phy:/dev/loop0,xvda1,r _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-26 14:26 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, Aug 25, 2011 at 09:31:46PM +0100, Anthony Wright wrote:> On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: > > On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: > >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: > >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: > >>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > >>>> the vga-support patch backported). I can''t get my DomU''s to work due to > >>>> the phy disks and vifs timing out in DomU and looking through my logs > >>>> this morning I''m getting a consistent kernel bug report with xen > >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on > >>> Yikes! Ian any ideas what to try? > >>> > >>> Anthony, can you compile the kernel with debug=y and when this happens > >>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > >>> should dump the grants in use.. that might help a bit. > >> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other > >> config values appeared at this point, and I took defaults for them). > >> > >> The output from /var/log/messages & ''xl dmesg'' is attached. There was no > >> output from ''xl debug-keys g''. > > Ok, so I am hitting this too - I was hoping that the patch from Stefano > > would have fixed the issue, but sadly it did not. > > > > Let me (I am traveling right now) see if I can come up with an internim > > solution until Ian comes with the right fix. > > > Hi Konrad - any progress on this - it''s a bit of a show stopper for me.What is interesting is that it happens only with 32-bit guests and with not-so fast hardware: Atom D510 for me and in your case MSI MS-7309 motherboard (with what kind of processor?). I''ve a 64-bit hypervisor - not sure if you are using a 32-bit or 64-bit. I hadn''t tried to reproduce this on the Atom D510 with a 64-bit Dom0. But I was wondering if you had this setup before - with a 64-bit dom0? Or is that really not an option with your CPU? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-26 14:44 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Fri, Aug 26, 2011 at 10:26:06AM -0400, Konrad Rzeszutek Wilk wrote:> On Thu, Aug 25, 2011 at 09:31:46PM +0100, Anthony Wright wrote: > > On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote: > > > On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote: > > >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote: > > >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote: > > >>>> I''ve just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with > > >>>> the vga-support patch backported). I can''t get my DomU''s to work due to > > >>>> the phy disks and vifs timing out in DomU and looking through my logs > > >>>> this morning I''m getting a consistent kernel bug report with xen > > >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on > > >>> Yikes! Ian any ideas what to try? > > >>> > > >>> Anthony, can you compile the kernel with debug=y and when this happens > > >>> see what ''xl dmesg'' gives? Also there is also the ''xl debug-keys g'' which > > >>> should dump the grants in use.. that might help a bit. > > >> I''ve compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other > > >> config values appeared at this point, and I took defaults for them). > > >> > > >> The output from /var/log/messages & ''xl dmesg'' is attached. There was no > > >> output from ''xl debug-keys g''. > > > Ok, so I am hitting this too - I was hoping that the patch from Stefano > > > would have fixed the issue, but sadly it did not. > > > > > > Let me (I am traveling right now) see if I can come up with an internim > > > solution until Ian comes with the right fix. > > > > > Hi Konrad - any progress on this - it''s a bit of a show stopper for me. > > What is interesting is that it happens only with 32-bit guests and with > not-so fast hardware: Atom D510 for me and in your case MSI MS-7309 motherboard > (with what kind of processor?). I''ve a 64-bit hypervisor - not sure if you > are using a 32-bit or 64-bit. > > I hadn''t tried to reproduce this on the Atom D510 with a 64-bit Dom0. > But I was wondering if you had this setup before - with a 64-bit dom0? > Or is that really not an option with your CPU?So while I am still looking at the hypervisor code to figure out why it would give me: (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 I''ve cobbled this patch^H^H^Hhack to retry the transaction to see if this is a tempory issue (race) or really - somehow that L1 PTE is gone. If you could, can you try it out and see if the errors that are spit are repeated - mainly the "Could not find L1 PTE". You will need to run the hypervisor with "loglvl=all" to get that information. to compile the hypervisor with debug=y to get that diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index fd00f25..7bee981 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1607,7 +1607,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, struct gnttab_map_grant_ref op; struct xen_netif_tx_sring *txs; struct xen_netif_rx_sring *rxs; - + int retry = 3; int err = -ENOMEM; vif->tx_comms_area = alloc_vm_area(PAGE_SIZE); @@ -1620,7 +1620,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, gnttab_set_map_op(&op, (unsigned long)vif->tx_comms_area->addr, GNTMAP_host_map, tx_ring_ref, vif->domid); - + op.status = 0; +retry_tx: if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1)) BUG(); @@ -1628,6 +1629,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, netdev_warn(vif->dev, "failed to map tx ring. 
err=%d status=%d\n", err, op.status); + if (retry-- > 0) + goto retry_tx; err = op.status; goto err; } @@ -1641,6 +1644,9 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, gnttab_set_map_op(&op, (unsigned long)vif->rx_comms_area->addr, GNTMAP_host_map, rx_ring_ref, vif->domid); + retry = 3; + op.status = 0; +retry_rx: if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1)) BUG(); @@ -1648,6 +1654,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif, netdev_warn(vif->dev, "failed to map rx ring. err=%d status=%d\n", err, op.status); + if (retry-- > 0) + goto retry_rx; err = op.status; goto err; }> > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-29 12:13 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 26/08/2011 15:44, Konrad Rzeszutek Wilk wrote:> On Fri, Aug 26, 2011 at 10:26:06AM -0400, Konrad Rzeszutek Wilk wrote: >> What is interesting is that it happens only with 32-bit guests and with >> not-so fast hardware: Atom D510 for me and in your case MSI MS-7309 motherboard >> (with what kind of processor?). I've a 64-bit hypervisor - not sure if you >> are using a 32-bit or 64-bit. >> >> I hadn't tried to reproduce this on the Atom D510 with a 64-bit Dom0. >> But I was wondering if you had this setup before - with a 64-bit dom0? >> Or is that really not an option with your CPU? > So while I am still looking at the hypervisor code to figure out why > it would give me: > > (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 > > I've cobbled this patch^H^H^Hhack to retry the transaction to see if this is > a temporary issue (race) or really - somehow that L1 PTE is gone. > > If you could, can you try it out and see if the errors that are spit > are repeated - mainly the "Could not find L1 PTE". You will need to > run the hypervisor with "loglvl=all" to get that information. > > to compile the hypervisor with debug=y to get that

I built xen with debug on, I think (make debug=y world ; make debug=y install). I've taken linux 3.0.3 and added the patch, which seems to have compiled correctly. The DomUs continue to fail as before. Attached are a number of logs:

dmesg.0.log - After Dom0 had booted, but before any DomUs had started
dmesg.1.log - After the first DomU had started (subsequent DomUs generated no further messages)
domU-1.log - the console log of the first time the DomU was started. This DomU fails to run, and generates a kernel bug report
domU-2.log - the console log of the second time the DomU was started. This DomU also fails to run, but with different output. It does not generate a kernel bug report
messages - the relevant /var/log/messages output

Anthony.

_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Wright
2011-Aug-29 17:33 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 26/08/2011 15:26, Konrad Rzeszutek Wilk wrote:> On Thu, Aug 25, 2011 at 09:31:46PM +0100, Anthony Wright wrote: >> Hi Konrad - any progress on this - it's a bit of a show stopper for me. > What is interesting is that it happens only with 32-bit guests and with > not-so fast hardware: Atom D510 for me and in your case MSI MS-7309 motherboard > (with what kind of processor?). I've a 64-bit hypervisor - not sure if you > are using a 32-bit or 64-bit. > > I hadn't tried to reproduce this on the Atom D510 with a 64-bit Dom0. > But I was wondering if you had this setup before - with a 64-bit dom0? > Or is that really not an option with your CPU?

The processor for the system I'm having problems with is an "AMD Athlon II X2 250"; I've attached the cpuinfo output. It's a 64 bit processor that supports HVM. I run everything as 32 bit and use paravirtualisation, so that I can work with a wide range of systems, so we have a 32 bit Dom0 running 32 bit DomUs using paravirtualisation on a 64 bit processor that supports HVM.

As an experiment I've spent my day building a 64 bit dom0 kernel, which I then discovered also needs a 64 bit Xen. I built a 64 bit Xen, but the 32 bit xen tools don't seem to want to work with the 64 bit xen, and at that point things started going wrong, since trying to switch the entire system to 64 bit is going to be very painful (all the xen tools, all the libraries and all the other applications that rely on those libraries).

Is there a simple way to get the 32 bit xen tools to control a 64 bit xen and dom0 kernel, or am I stuck?

_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Aug-31 16:58 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote:> > So while I am still looking at the hypervisor code to figure out why > it would give me [when trying to map a grant page]: > > (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000

It is failing in guest_map_l1e() because the page for the vmalloc'd virtual address PTEs is not present. The test that fails is:

(l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT

I think this is because the GNTTABOP_map_grant_ref hypercall is done when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into init_mm, so when Xen looks in the page tables it doesn't find the entries because they're not there yet.

Putting a call to vmalloc_sync_all() after alloc_vm_area() and before the hypercall makes it work for me. Classic Xen kernels used to have such a call.

This presumably works on some systems/configurations and not others depending on what else is using vmalloc(), i.e. if another kernel thread (?) calls vmalloc() etc. then there will be a page for the vmalloc area PTEs and it will work.

I'll try and post a patch tomorrow.

Thanks to Ian Campbell for pointing me in the right direction.

David

_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
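[For readers following the thread: the shape of the workaround David describes is roughly the sketch below - sync the freshly created vmalloc PTEs into every page table before handing the address to the grant-map hypercall. This is only an illustration, not the patch David later posted; the helper name map_frontend_ring() is made up here, and it assumes the 3.0-era single-argument alloc_vm_area().]

#include <linux/vmalloc.h>
#include <xen/grant_table.h>
#include <asm/xen/hypercall.h>

/* Illustrative only: map one frontend ring page into a vmalloc'd area. */
static int map_frontend_ring(domid_t domid, grant_ref_t ring_ref,
			     struct vm_struct **area)
{
	struct gnttab_map_grant_ref op;

	*area = alloc_vm_area(PAGE_SIZE);	/* 3.0-era single-argument form */
	if (*area == NULL)
		return -ENOMEM;

	/*
	 * alloc_vm_area() only instantiates the PTEs in init_mm.  Xen cannot
	 * take a kernel fault on our behalf while it walks the page tables
	 * during the hypercall below, so make the PTEs visible in all page
	 * tables first.  This is the call being discussed in this thread.
	 */
	vmalloc_sync_all();

	gnttab_set_map_op(&op, (unsigned long)(*area)->addr,
			  GNTMAP_host_map, ring_ref, domid);

	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
		BUG();

	return op.status;
}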
Konrad Rzeszutek Wilk
2011-Aug-31 17:07 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote:> On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: > > > > So while I am still looking at the hypervisor code to figure out why > > it would give me [when trying to map a grant page]: > > > > (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 > > It is failing in guest_map_l1e() because the page for the vmalloc''d > virtual address PTEs is not present. > > The test that fails is: > > (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT > > I think this is because the GNTTABOP_map_grant_ref hypercall is done > when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into > init_mm so when Xen looks in the page tables it doesn''t find the entries > because they''re not there yet. > > Putting a call to vmalloc_sync_all() after create_vm_area() and before > the hypercall makes it work for me. Classic Xen kernels used to have > such a call.That sounds quite reasonable.> > This presumably works on some systems/configuration and not others > depending on what else is using vmalloc(). i.e., if another kernel > thread (?) calls vmalloc() etc. then there will be a page for vmalloc > area PTEs and it will work. > > I''ll try and post a patch tomorrow. > > Thanks to Ian Campbell for pointing me in the right direction.Great! Thanks for hunting this one down.> > David_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Sep-01 07:42 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Wed, 2011-08-31 at 18:07 +0100, Konrad Rzeszutek Wilk wrote:> On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote: > > On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: > > > > > > So while I am still looking at the hypervisor code to figure out why > > > it would give me [when trying to map a grant page]: > > > > > > (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 > > > > It is failing in guest_map_l1e() because the page for the vmalloc''d > > virtual address PTEs is not present. > > > > The test that fails is: > > > > (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT > > > > I think this is because the GNTTABOP_map_grant_ref hypercall is done > > when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into > > init_mm so when Xen looks in the page tables it doesn''t find the entries > > because they''re not there yet. > > > > Putting a call to vmalloc_sync_all() after create_vm_area() and before > > the hypercall makes it work for me. Classic Xen kernels used to have > > such a call. > > That sounds quite reasonable.I was wondering why upstream was missing the vmalloc_sync_all() in alloc_vm_area() since the out-of-tree kernels did have it and the function was added by us. I found this: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=ef691947d8a3d479e67652312783aedcf629320a commit ef691947d8a3d479e67652312783aedcf629320a Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Date: Wed Dec 1 15:45:48 2010 -0800 vmalloc: remove vmalloc_sync_all() from alloc_vm_area() There''s no need for it: it will get faulted into the current pagetable as needed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> The flaw in the reasoning here is that you cannot take a kernel fault while processing a hypercall, so hypercall arguments must have been faulted in beforehand and that is what the sync_all was for. It''s probably fair to say that the Xen specific caller should take care of that Xen-specific requirement rather than pushing it into common code. On the other hand Xen is the only user and creating a Xen specific helper/wrapper seems a bit pointless. Ian.> > > > This presumably works on some systems/configuration and not others > > depending on what else is using vmalloc(). i.e., if another kernel > > thread (?) calls vmalloc() etc. then there will be a page for vmalloc > > area PTEs and it will work. > > > > I''ll try and post a patch tomorrow. > > > > Thanks to Ian Campbell for pointing me in the right direction. > > Great! Thanks for hunting this one down. > > > > David_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-01 14:23 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, Sep 01, 2011 at 08:42:52AM +0100, Ian Campbell wrote:> On Wed, 2011-08-31 at 18:07 +0100, Konrad Rzeszutek Wilk wrote: > > On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote: > > > On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: > > > > > > > > So while I am still looking at the hypervisor code to figure out why > > > > it would give me [when trying to map a grant page]: > > > > > > > > (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 > > > > > > It is failing in guest_map_l1e() because the page for the vmalloc''d > > > virtual address PTEs is not present. > > > > > > The test that fails is: > > > > > > (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT > > > > > > I think this is because the GNTTABOP_map_grant_ref hypercall is done > > > when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into > > > init_mm so when Xen looks in the page tables it doesn''t find the entries > > > because they''re not there yet. > > > > > > Putting a call to vmalloc_sync_all() after create_vm_area() and before > > > the hypercall makes it work for me. Classic Xen kernels used to have > > > such a call. > > > > That sounds quite reasonable. > > I was wondering why upstream was missing the vmalloc_sync_all() in > alloc_vm_area() since the out-of-tree kernels did have it and the > function was added by us. I found this: > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=ef691947d8a3d479e67652312783aedcf629320a > > commit ef691947d8a3d479e67652312783aedcf629320a > Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > Date: Wed Dec 1 15:45:48 2010 -0800 > > vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > > There''s no need for it: it will get faulted into the current pagetable > as needed. > > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > The flaw in the reasoning here is that you cannot take a kernel fault > while processing a hypercall, so hypercall arguments must have been > faulted in beforehand and that is what the sync_all was for. > > It''s probably fair to say that the Xen specific caller should take care > of that Xen-specific requirement rather than pushing it into common > code. On the other hand Xen is the only user and creating a Xen specific > helper/wrapper seems a bit pointless.Perhaps then doing the vmalloc_sync_all() (or are more precise one: vmalloc_sync_one) should be employed in the netback code then? And obviously guarded by the CONFIG_HIGHMEM case? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David Vrabel
2011-Sep-01 15:12 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 01/09/11 15:23, Konrad Rzeszutek Wilk wrote:> On Thu, Sep 01, 2011 at 08:42:52AM +0100, Ian Campbell wrote: >> On Wed, 2011-08-31 at 18:07 +0100, Konrad Rzeszutek Wilk wrote: >>> On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote: >>>> On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: >>>>> >>>>> So while I am still looking at the hypervisor code to figure out why >>>>> it would give me [when trying to map a grant page]: >>>>> >>>>> (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 >>>> >>>> It is failing in guest_map_l1e() because the page for the vmalloc''d >>>> virtual address PTEs is not present. >>>> >>>> The test that fails is: >>>> >>>> (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT >>>> >>>> I think this is because the GNTTABOP_map_grant_ref hypercall is done >>>> when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into >>>> init_mm so when Xen looks in the page tables it doesn''t find the entries >>>> because they''re not there yet. >>>> >>>> Putting a call to vmalloc_sync_all() after create_vm_area() and before >>>> the hypercall makes it work for me. Classic Xen kernels used to have >>>> such a call. >>> >>> That sounds quite reasonable. >> >> I was wondering why upstream was missing the vmalloc_sync_all() in >> alloc_vm_area() since the out-of-tree kernels did have it and the >> function was added by us. I found this: >> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=ef691947d8a3d479e67652312783aedcf629320a >> >> commit ef691947d8a3d479e67652312783aedcf629320a >> Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> >> Date: Wed Dec 1 15:45:48 2010 -0800 >> >> vmalloc: remove vmalloc_sync_all() from alloc_vm_area() >> >> There''s no need for it: it will get faulted into the current pagetable >> as needed. >> >> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> >> >> The flaw in the reasoning here is that you cannot take a kernel fault >> while processing a hypercall, so hypercall arguments must have been >> faulted in beforehand and that is what the sync_all was for. >> >> It''s probably fair to say that the Xen specific caller should take care >> of that Xen-specific requirement rather than pushing it into common >> code. On the other hand Xen is the only user and creating a Xen specific >> helper/wrapper seems a bit pointless. > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > vmalloc_sync_one) should be employed in the netback code then? > > And obviously guarded by the CONFIG_HIGHMEM case?Perhaps. But I think the correct thing to do initially is revert the change and then look at possible improvements. Particularly as the fix needs to be a backported to stable. David _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
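[Concretely, the revert David proposes would put the sync back inside alloc_vm_area() itself, so every caller gets pre-faulted PTEs. The sketch below is paraphrased from memory of mm/vmalloc.c of that era, so details such as the callback helper may differ from the actual revert.]

#include <linux/mm.h>
#include <linux/vmalloc.h>

/* apply_to_page_range() does all the hard work of allocating the PTEs. */
static int f(pte_t *pte, pgtable_t table, unsigned long addr, void *data)
{
	return 0;
}

struct vm_struct *alloc_vm_area(size_t size)
{
	struct vm_struct *area;

	area = get_vm_area_caller(size, VM_IOREMAP,
				  __builtin_return_address(0));
	if (area == NULL)
		return NULL;

	/* Construct page tables for this region and map it into init_mm. */
	if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
				size, f, NULL)) {
		free_vm_area(area);
		return NULL;
	}

	/*
	 * Re-added by the revert: make the new PTEs visible in all page
	 * tables, not just init_mm, before the area is used as a hypercall
	 * argument (commit ef691947d8a3 removed exactly this call).
	 */
	vmalloc_sync_all();

	return area;
}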
Ian Campbell
2011-Sep-01 15:12 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 2011-09-01 at 15:23 +0100, Konrad Rzeszutek Wilk wrote:> On Thu, Sep 01, 2011 at 08:42:52AM +0100, Ian Campbell wrote: > > On Wed, 2011-08-31 at 18:07 +0100, Konrad Rzeszutek Wilk wrote: > > > On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote: > > > > On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: > > > > > > > > > > So while I am still looking at the hypervisor code to figure out why > > > > > it would give me [when trying to map a grant page]: > > > > > > > > > > (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 > > > > > > > > It is failing in guest_map_l1e() because the page for the vmalloc''d > > > > virtual address PTEs is not present. > > > > > > > > The test that fails is: > > > > > > > > (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT > > > > > > > > I think this is because the GNTTABOP_map_grant_ref hypercall is done > > > > when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into > > > > init_mm so when Xen looks in the page tables it doesn''t find the entries > > > > because they''re not there yet. > > > > > > > > Putting a call to vmalloc_sync_all() after create_vm_area() and before > > > > the hypercall makes it work for me. Classic Xen kernels used to have > > > > such a call. > > > > > > That sounds quite reasonable. > > > > I was wondering why upstream was missing the vmalloc_sync_all() in > > alloc_vm_area() since the out-of-tree kernels did have it and the > > function was added by us. I found this: > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=ef691947d8a3d479e67652312783aedcf629320a > > > > commit ef691947d8a3d479e67652312783aedcf629320a > > Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > Date: Wed Dec 1 15:45:48 2010 -0800 > > > > vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > > > > There''s no need for it: it will get faulted into the current pagetable > > as needed. > > > > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > > > The flaw in the reasoning here is that you cannot take a kernel fault > > while processing a hypercall, so hypercall arguments must have been > > faulted in beforehand and that is what the sync_all was for. > > > > It''s probably fair to say that the Xen specific caller should take care > > of that Xen-specific requirement rather than pushing it into common > > code. On the other hand Xen is the only user and creating a Xen specific > > helper/wrapper seems a bit pointless. > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > vmalloc_sync_one) should be employed in the netback code then?Not just netback but everywhere which uses this interface.> And obviously guarded by the CONFIG_HIGHMEM case?I don''t think this has anything to do with highmem, does it? It is potentially just as much of a problem on 64 bit for example. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-01 15:37 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
> >> vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > >> > >> There''s no need for it: it will get faulted into the current pagetable > >> as needed. > >> > >> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > >> > >> The flaw in the reasoning here is that you cannot take a kernel fault > >> while processing a hypercall, so hypercall arguments must have been > >> faulted in beforehand and that is what the sync_all was for. > >> > >> It''s probably fair to say that the Xen specific caller should take care > >> of that Xen-specific requirement rather than pushing it into common > >> code. On the other hand Xen is the only user and creating a Xen specific > >> helper/wrapper seems a bit pointless. > > > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > > vmalloc_sync_one) should be employed in the netback code then? > > > > And obviously guarded by the CONFIG_HIGHMEM case? > > Perhaps. But I think the correct thing to do initially is revert the > change and then look at possible improvements. Particularly as the fix > needs to be a backported to stable.I disagree. Ian pointed out properly that this a Xen requirment - and there is no reason for us to slow down non-Xen runs with vmalloc_sync_all plucked in a generic path. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-01 15:38 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
> > > The flaw in the reasoning here is that you cannot take a kernel fault > > > while processing a hypercall, so hypercall arguments must have been > > > faulted in beforehand and that is what the sync_all was for. > > > > > > It's probably fair to say that the Xen specific caller should take care > > > of that Xen-specific requirement rather than pushing it into common > > > code. On the other hand Xen is the only user and creating a Xen specific > > > helper/wrapper seems a bit pointless. > > > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > > vmalloc_sync_one) should be employed in the netback code then? > > Not just netback but everywhere which uses this interface.

Which is for right now netback :-). But yes - wherever we use that we should follow it with some sort of vmalloc sync.

> > > And obviously guarded by the CONFIG_HIGHMEM case? > I don't think this has anything to do with highmem, does it? It is > potentially just as much of a problem on 64 bit for example.

You are right. I somehow had vmalloc == highmem equated, but that is bogus.

> > Ian.

_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Sep-01 15:43 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 2011-09-01 at 16:37 +0100, Konrad Rzeszutek Wilk wrote:> > >> vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > > >> > > >> There''s no need for it: it will get faulted into the current pagetable > > >> as needed. > > >> > > >> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > >> > > >> The flaw in the reasoning here is that you cannot take a kernel fault > > >> while processing a hypercall, so hypercall arguments must have been > > >> faulted in beforehand and that is what the sync_all was for. > > >> > > >> It''s probably fair to say that the Xen specific caller should take care > > >> of that Xen-specific requirement rather than pushing it into common > > >> code. On the other hand Xen is the only user and creating a Xen specific > > >> helper/wrapper seems a bit pointless. > > > > > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > > > vmalloc_sync_one) should be employed in the netback code then? > > > > > > And obviously guarded by the CONFIG_HIGHMEM case? > > > > Perhaps. But I think the correct thing to do initially is revert the > > change and then look at possible improvements. Particularly as the fix > > needs to be a backported to stable. > > I disagree. Ian pointed out properly that this a Xen requirment - and there > is no reason for us to slow down non-Xen runs with vmalloc_sync_all plucked in > a generic path.There is literally no other caller of alloc_vm_area than xen so you won''t be slowing anyone else down. Maybe we should add alloc_vm_area_sync and use that everywhere? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Sep-01 15:44 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 2011-09-01 at 16:38 +0100, Konrad Rzeszutek Wilk wrote:> > > > The flaw in the reasoning here is that you cannot take a kernel fault > > > > while processing a hypercall, so hypercall arguments must have been > > > > faulted in beforehand and that is what the sync_all was for. > > > > > > > > It''s probably fair to say that the Xen specific caller should take care > > > > of that Xen-specific requirement rather than pushing it into common > > > > code. On the other hand Xen is the only user and creating a Xen specific > > > > helper/wrapper seems a bit pointless. > > > > > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > > > vmalloc_sync_one) should be employed in the netback code then? > > > > Not just netback but everywhere which uses this interface. > > Which is for right now netback :-). But yes - wherever we use that > we should do follow with some sort of vmalloc.blkback, xenbus_client and the grant table stuff all use it as well and AFAICT have the same requirement for syncing. $ git grep alloc_vm_area arch/x86/include/asm/xen/grant_table.h:#define xen_alloc_vm_area(size) alloc_vm_area(size) -- this macro is unused... arch/x86/xen/grant-table.c: xen_alloc_vm_area(PAGE_SIZE * max_nr_gframes); drivers/block/xen-blkback/xenbus.c: blkif->blk_ring_area = alloc_vm_area(PAGE_SIZE); drivers/net/xen-netback/netback.c: vif->tx_comms_area = alloc_vm_area(PAGE_SIZE); drivers/net/xen-netback/netback.c: vif->rx_comms_area = alloc_vm_area(PAGE_SIZE); drivers/xen/xenbus/xenbus_client.c: area = xen_alloc_vm_area(PAGE_SIZE); Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-01 16:07 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, Sep 01, 2011 at 04:43:05PM +0100, Ian Campbell wrote:> On Thu, 2011-09-01 at 16:37 +0100, Konrad Rzeszutek Wilk wrote: > > > >> vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > > > >> > > > >> There''s no need for it: it will get faulted into the current pagetable > > > >> as needed. > > > >> > > > >> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > > >> > > > >> The flaw in the reasoning here is that you cannot take a kernel fault > > > >> while processing a hypercall, so hypercall arguments must have been > > > >> faulted in beforehand and that is what the sync_all was for. > > > >> > > > >> It''s probably fair to say that the Xen specific caller should take care > > > >> of that Xen-specific requirement rather than pushing it into common > > > >> code. On the other hand Xen is the only user and creating a Xen specific > > > >> helper/wrapper seems a bit pointless. > > > > > > > > Perhaps then doing the vmalloc_sync_all() (or are more precise one: > > > > vmalloc_sync_one) should be employed in the netback code then? > > > > > > > > And obviously guarded by the CONFIG_HIGHMEM case? > > > > > > Perhaps. But I think the correct thing to do initially is revert the > > > change and then look at possible improvements. Particularly as the fix > > > needs to be a backported to stable. > > > > I disagree. Ian pointed out properly that this a Xen requirment - and there > > is no reason for us to slow down non-Xen runs with vmalloc_sync_all plucked in > > a generic path. > > There is literally no other caller of alloc_vm_area than xen so you > won''t be slowing anyone else down.Duh! I totally missed that. Sounds plausible then - let me ping Andrew Morton on re-adding the vmalloc back.> > Maybe we should add alloc_vm_area_sync and use that everywhere?That is an option too.> > Ian. > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Sep-01 17:32 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 09/01/2011 12:42 AM, Ian Campbell wrote:> On Wed, 2011-08-31 at 18:07 +0100, Konrad Rzeszutek Wilk wrote: >> On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote: >>> On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: >>>> So while I am still looking at the hypervisor code to figure out why >>>> it would give me [when trying to map a grant page]: >>>> >>>> (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 >>> It is failing in guest_map_l1e() because the page for the vmalloc''d >>> virtual address PTEs is not present. >>> >>> The test that fails is: >>> >>> (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT >>> >>> I think this is because the GNTTABOP_map_grant_ref hypercall is done >>> when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into >>> init_mm so when Xen looks in the page tables it doesn''t find the entries >>> because they''re not there yet. >>> >>> Putting a call to vmalloc_sync_all() after create_vm_area() and before >>> the hypercall makes it work for me. Classic Xen kernels used to have >>> such a call. >> That sounds quite reasonable. > I was wondering why upstream was missing the vmalloc_sync_all() in > alloc_vm_area() since the out-of-tree kernels did have it and the > function was added by us. I found this: > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=ef691947d8a3d479e67652312783aedcf629320a > > commit ef691947d8a3d479e67652312783aedcf629320a > Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > Date: Wed Dec 1 15:45:48 2010 -0800 > > vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > > There''s no need for it: it will get faulted into the current pagetable > as needed. > > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > The flaw in the reasoning here is that you cannot take a kernel fault > while processing a hypercall, so hypercall arguments must have been > faulted in beforehand and that is what the sync_all was for.That''s a good point. (Maybe Xen should have generated pagefaults when hypercall arg pointers are bad...)> It''s probably fair to say that the Xen specific caller should take care > of that Xen-specific requirement rather than pushing it into common > code. On the other hand Xen is the only user and creating a Xen specific > helper/wrapper seems a bit pointless.There''s already a wrapper: xen_alloc_vm_area(), which is just a #define. But we could easily add a sync_all to it (and use it in netback, like we do in grant-table and xenbus). J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
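[For what it's worth, Jeremy's suggestion could look something like the sketch below - turning the pass-through #define into a wrapper that does the sync itself, so netback, blkback, xenbus and the grant-table code all get pre-faulted PTEs. Illustrative only; the eventual patch may name or place this differently.]

#include <linux/vmalloc.h>

/*
 * Instead of: #define xen_alloc_vm_area(size) alloc_vm_area(size)
 * wrap it so every Xen user gets the vmalloc sync for free.
 */
static inline struct vm_struct *xen_alloc_vm_area(size_t size)
{
	struct vm_struct *area = alloc_vm_area(size);

	/*
	 * Hypercalls cannot fault vmalloc PTEs in lazily, so make sure they
	 * are present in all page tables before the area is handed to
	 * GNTTABOP_map_grant_ref and friends.
	 */
	if (area)
		vmalloc_sync_all();
	return area;
}

[Callers such as netback's xen_netbk_map_frontend_rings(), quoted earlier in the thread, would then switch from alloc_vm_area() to the wrapper.]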
Jeremy Fitzhardinge
2011-Sep-01 17:34 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 09/01/2011 08:44 AM, Ian Campbell wrote:> > blkback, xenbus_client and the grant table stuff all use it as well and > AFAICT have the same requirement for syncing. > > $ git grep alloc_vm_area > arch/x86/include/asm/xen/grant_table.h:#define xen_alloc_vm_area(size) alloc_vm_area(size) > > -- this macro is unused... > > arch/x86/xen/grant-table.c: xen_alloc_vm_area(PAGE_SIZE * max_nr_gframes); > drivers/block/xen-blkback/xenbus.c: blkif->blk_ring_area = alloc_vm_area(PAGE_SIZE); > drivers/net/xen-netback/netback.c: vif->tx_comms_area = alloc_vm_area(PAGE_SIZE); > drivers/net/xen-netback/netback.c: vif->rx_comms_area = alloc_vm_area(PAGE_SIZE); > drivers/xen/xenbus/xenbus_client.c: area = xen_alloc_vm_area(PAGE_SIZE);Well, 3/5ths unused. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Sep-01 19:19 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 2011-09-01 at 18:34 +0100, Jeremy Fitzhardinge wrote:> On 09/01/2011 08:44 AM, Ian Campbell wrote: > > > > blkback, xenbus_client and the grant table stuff all use it as well and > > AFAICT have the same requirement for syncing. > > > > $ git grep alloc_vm_area > > arch/x86/include/asm/xen/grant_table.h:#define xen_alloc_vm_area(size) alloc_vm_area(size) > > > > -- this macro is unused... > > > > arch/x86/xen/grant-table.c: xen_alloc_vm_area(PAGE_SIZE * max_nr_gframes); > > drivers/block/xen-blkback/xenbus.c: blkif->blk_ring_area = alloc_vm_area(PAGE_SIZE); > > drivers/net/xen-netback/netback.c: vif->tx_comms_area = alloc_vm_area(PAGE_SIZE); > > drivers/net/xen-netback/netback.c: vif->rx_comms_area = alloc_vm_area(PAGE_SIZE); > > drivers/xen/xenbus/xenbus_client.c: area = xen_alloc_vm_area(PAGE_SIZE); > > Well, 3/5ths unused.Hmm, yes, no sure how I missed that. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Sep-01 19:21 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 2011-09-01 at 18:32 +0100, Jeremy Fitzhardinge wrote:> On 09/01/2011 12:42 AM, Ian Campbell wrote: > > On Wed, 2011-08-31 at 18:07 +0100, Konrad Rzeszutek Wilk wrote: > >> On Wed, Aug 31, 2011 at 05:58:43PM +0100, David Vrabel wrote: > >>> On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote: > >>>> So while I am still looking at the hypervisor code to figure out why > >>>> it would give me [when trying to map a grant page]: > >>>> > >>>> (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000 > >>> It is failing in guest_map_l1e() because the page for the vmalloc''d > >>> virtual address PTEs is not present. > >>> > >>> The test that fails is: > >>> > >>> (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT > >>> > >>> I think this is because the GNTTABOP_map_grant_ref hypercall is done > >>> when task->active_mm != &init_mm and alloc_vm_area() only adds PTEs into > >>> init_mm so when Xen looks in the page tables it doesn''t find the entries > >>> because they''re not there yet. > >>> > >>> Putting a call to vmalloc_sync_all() after create_vm_area() and before > >>> the hypercall makes it work for me. Classic Xen kernels used to have > >>> such a call. > >> That sounds quite reasonable. > > I was wondering why upstream was missing the vmalloc_sync_all() in > > alloc_vm_area() since the out-of-tree kernels did have it and the > > function was added by us. I found this: > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=ef691947d8a3d479e67652312783aedcf629320a > > > > commit ef691947d8a3d479e67652312783aedcf629320a > > Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > Date: Wed Dec 1 15:45:48 2010 -0800 > > > > vmalloc: remove vmalloc_sync_all() from alloc_vm_area() > > > > There''s no need for it: it will get faulted into the current pagetable > > as needed. > > > > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> > > > > The flaw in the reasoning here is that you cannot take a kernel fault > > while processing a hypercall, so hypercall arguments must have been > > faulted in beforehand and that is what the sync_all was for. > > That''s a good point. (Maybe Xen should have generated pagefaults when > hypercall arg pointers are bad...)I think it would be a bit tricky to do in practice, you''d either have to support recursive hypercalls in the middle of other hypercalls (because the page fault handler is surely going to want to do some) or proper hypercall restart (so you can fully return to guest context to handle the fault then retry) or something along those and complexifying up the hypervisor one way or another. Probably not impossible if you were building something form the ground up, but not trivial.> > It''s probably fair to say that the Xen specific caller should take care > > of that Xen-specific requirement rather than pushing it into common > > code. On the other hand Xen is the only user and creating a Xen specific > > helper/wrapper seems a bit pointless. > > There''s already a wrapper: xen_alloc_vm_area(), which is just a > #define. But we could easily add a sync_all to it (and use it in > netback, like we do in grant-table and xenbus).OOI what was the wrapper for originally? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Sep-01 20:34 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 09/01/2011 12:21 PM, Ian Campbell wrote:
> On Thu, 2011-09-01 at 18:32 +0100, Jeremy Fitzhardinge wrote:
>> [...]
>> That's a good point. (Maybe Xen should have generated pagefaults when
>> hypercall arg pointers are bad...)
> I think it would be a bit tricky to do in practice. You'd either have to
> support recursive hypercalls in the middle of other hypercalls (because
> the page fault handler is surely going to want to do some), or proper
> hypercall restart (so you can fully return to guest context to handle the
> fault and then retry), or something along those lines, complexifying the
> hypervisor one way or another. Probably not impossible if you were
> building something from the ground up, but not trivial.

Well, Xen already has the continuation machinery for dealing with
hypercall restart, so that could be reused. And accesses to guest
memory are already special events which must be checked so that EFAULT
can be returned. If, rather than failing with EFAULT, Xen set up a
pagefault exception for the guest CPU with the return set up to retry
the hypercall, it should all work...

Of course, if the guest isn't expecting that - or it's buggy - then it
could end up in an infinite loop. But maybe a flag (set a high bit in
the hypercall number?), or a feature, or something? Might be worthwhile
if it saves guests having to do something expensive (like a
vmalloc_sync_all), even if they have to also deal with old hypervisors.

>> There's already a wrapper: xen_alloc_vm_area(), which is just a
>> #define. But we could easily add a sync_all to it (and use it in
>> netback, like we do in grant-table and xenbus).
> OOI what was the wrapper for originally?

Not sure; I brought it over from 2.6.18-xen.

BTW, vmalloc_sync_all() is much hated, and is slated for removal at some
point - there are definitely sights trained on it. So we should think
about not needing it.

    J
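As a sketch of the "add a sync_all to the wrapper" idea: the thread says xen_alloc_vm_area() is currently just a #define, so the inline form below is an assumed shape rather than the existing definition. The point is simply to do the Xen-specific sync once, in the wrapper, instead of at every call site.

#include <linux/vmalloc.h>

/* Sketch only: not the current #define wrapper. */
static inline struct vm_struct *xen_alloc_vm_area(size_t size)
{
        struct vm_struct *area = alloc_vm_area(size);

        if (area != NULL)
                vmalloc_sync_all();     /* new PTEs visible before any hypercall */
        return area;
}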
Ian Campbell
2011-Sep-02 07:17 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Thu, 2011-09-01 at 21:34 +0100, Jeremy Fitzhardinge wrote:
> On 09/01/2011 12:21 PM, Ian Campbell wrote:
> > [...] or proper
> > hypercall restart (so you can fully return to guest context to handle the
> > fault and then retry), or something along those lines, complexifying the
> > hypervisor one way or another. Probably not impossible if you were
> > building something from the ground up, but not trivial.
>
> Well, Xen already has the continuation machinery for dealing with
> hypercall restart, so that could be reused.

That requires special support beyond just calling the continuation in
each hypercall (often extending into the ABI) for pickling progress and
picking it up again; only a small number of (usually long-running)
hypercalls have that support today. It also uses the guest context to
store the state, which perhaps isn't helpful if you want to return to the
guest, although I suppose building a nested frame would work.

The guys doing paging and sharing etc. looked into this and came to the
conclusion that it would be intractably difficult to do this fully --
hence we now have the ability to sleep in hypercalls, which works
because the pager/sharer is in a different domain/vcpu.

> And accesses to guest
> memory are already special events which must be checked so that EFAULT
> can be returned. If, rather than failing with EFAULT, Xen set up a
> pagefault exception for the guest CPU with the return set up to retry
> the hypercall, it should all work...
>
> Of course, if the guest isn't expecting that - or it's buggy - then it
> could end up in an infinite loop. But maybe a flag (set a high bit in
> the hypercall number?), or a feature, or something? Might be worthwhile
> if it saves guests having to do something expensive (like a
> vmalloc_sync_all), even if they have to also deal with old hypervisors.

The vmalloc_sync_all is a pretty event even on Xen though, isn't it?

> >> There's already a wrapper: xen_alloc_vm_area(), which is just a
> >> #define. But we could easily add a sync_all to it (and use it in
> >> netback, like we do in grant-table and xenbus).
> > OOI what was the wrapper for originally?
>
> Not sure; I brought it over from 2.6.18-xen.
>
> BTW, vmalloc_sync_all() is much hated, and is slated for removal at some
> point - there are definitely sights trained on it. So we should think
> about not needing it.
>
> J
Jeremy Fitzhardinge
2011-Sep-02 20:26 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 09/02/2011 12:17 AM, Ian Campbell wrote:
> On Thu, 2011-09-01 at 21:34 +0100, Jeremy Fitzhardinge wrote:
>> [...]
>> Well, Xen already has the continuation machinery for dealing with
>> hypercall restart, so that could be reused.
> That requires special support beyond just calling the continuation in
> each hypercall (often extending into the ABI) for pickling progress and
> picking it up again; only a small number of (usually long-running)
> hypercalls have that support today. It also uses the guest context to
> store the state, which perhaps isn't helpful if you want to return to the
> guest, although I suppose building a nested frame would work.

I guess it depends on how many hypercalls do work before touching guest
memory, but any hypercall should be like that anyway, or at least be
able to wind back work done if a later read EFAULTs.

I was vaguely speculating about a scheme along the lines of:

 1. In copy_to/from_user, if we touch a bad address, save it in a
    per-vcpu "bad_guest_addr"
 2. when returning to the guest, if the errno is EFAULT and
    bad_guest_addr is set, then generate a memory fault frame with
    cr2 = bad_guest_addr, and with the exception return restarting
    the hypercall

Perhaps there should be an EFAULT_RETRY error return to trigger this
behaviour, rather than doing it for all EFAULTs, so the faulting
behaviour can be added incrementally.

Maybe this is a lost cause for x86, but perhaps it's worth considering
for new ports?

> The guys doing paging and sharing etc. looked into this and came to the
> conclusion that it would be intractably difficult to do this fully --
> hence we now have the ability to sleep in hypercalls, which works
> because the pager/sharer is in a different domain/vcpu.

Hmm. Were they looking at injecting faults back into the guest, or
forwarding "missing page" events off to another domain?

>> And accesses to guest
>> memory are already special events which must be checked so that EFAULT
>> can be returned. If, rather than failing with EFAULT, Xen set up a
>> pagefault exception for the guest CPU with the return set up to retry
>> the hypercall, it should all work...
>
> The vmalloc_sync_all is a pretty event even on Xen though, isn't it?

Looks like an important word is missing there. But it's very expensive,
if that's what you're saying.

    J
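To make the control flow of that two-step scheme concrete, here is a toy user-space model of it. It is not hypervisor code: EFAULT_RETRY, page_present, bad_guest_addr and the other names are all invented for the illustration, which only demonstrates the record-fault / inject-fault / restart loop being proposed.

/* Toy model of the speculative EFAULT_RETRY scheme - not Xen code. */
#include <stdbool.h>
#include <stdio.h>

#define NPAGES        16
#define EFAULT_RETRY  1000              /* hypothetical new error code */

static bool page_present[NPAGES];       /* stand-in for the guest's pagetable */
static unsigned long bad_guest_addr;    /* step 1: per-vcpu record of the fault */

/* Guest-memory accessor: record the bad address instead of just failing. */
static int copy_from_guest(unsigned long page)
{
    if (page >= NPAGES || !page_present[page]) {
        bad_guest_addr = page;
        return -EFAULT_RETRY;
    }
    return 0;                           /* the real copy would happen here */
}

/* A hypercall whose argument lives in not-yet-faulted-in guest memory. */
static int do_hypercall(unsigned long arg_page)
{
    return copy_from_guest(arg_page);
}

/* Guest #PF handler: fault the page in, as normal vmalloc faulting would. */
static void guest_page_fault(unsigned long cr2)
{
    printf("guest: #PF at page %lu, mapping it\n", cr2);
    page_present[cr2] = true;
}

int main(void)
{
    unsigned long arg = 5;              /* page 5 is not present yet */
    int rc = do_hypercall(arg);

    /* Step 2: instead of returning EFAULT, reflect a fault to the guest
     * and restart the hypercall once the guest has handled it. */
    while (rc == -EFAULT_RETRY) {
        guest_page_fault(bad_guest_addr);
        rc = do_hypercall(arg);
    }

    printf("hypercall completed, rc=%d\n", rc);
    return 0;
}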
Ian Campbell
2011-Sep-03 10:27 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On Fri, 2011-09-02 at 21:26 +0100, Jeremy Fitzhardinge wrote:
> On 09/02/2011 12:17 AM, Ian Campbell wrote:
> > [...]
> > That requires special support beyond just calling the continuation in
> > each hypercall (often extending into the ABI) for pickling progress and
> > picking it up again; only a small number of (usually long-running)
> > hypercalls have that support today. It also uses the guest context to
> > store the state, which perhaps isn't helpful if you want to return to the
> > guest, although I suppose building a nested frame would work.
>
> I guess it depends on how many hypercalls do work before touching guest
> memory, but any hypercall should be like that anyway, or at least be
> able to wind back work done if a later read EFAULTs.
>
> I was vaguely speculating about a scheme along the lines of:
>
>  1. In copy_to/from_user, if we touch a bad address, save it in a
>     per-vcpu "bad_guest_addr"
>  2. when returning to the guest, if the errno is EFAULT and
>     bad_guest_addr is set, then generate a memory fault frame with
>     cr2 = bad_guest_addr, and with the exception return restarting
>     the hypercall
>
> Perhaps there should be an EFAULT_RETRY error return to trigger this
> behaviour, rather than doing it for all EFAULTs, so the faulting
> behaviour can be added incrementally.

The kernel uses -ERESTARTSYS for something similar, doesn't it?

Does this scheme work if the hypercall causing the exception was itself
running in an exception handler? I guess it depends on the architecture
and OS's handling of nested faults.

> Maybe this is a lost cause for x86, but perhaps it's worth considering
> for new ports?

Certainly worth thinking about.

> > The guys doing paging and sharing etc. looked into this and came to the
> > conclusion that it would be intractably difficult to do this fully --
> > hence we now have the ability to sleep in hypercalls, which works
> > because the pager/sharer is in a different domain/vcpu.
>
> Hmm. Were they looking at injecting faults back into the guest, or
> forwarding "missing page" events off to another domain?

Sharing and swapping are transparent to the domain; another domain runs
the swapper/unshare process (actually, unshare might be in the h/v
itself, not sure).

> > The vmalloc_sync_all is a pretty event even on Xen though, isn't it?
>
> Looks like an important word is missing there. But it's very expensive,
> if that's what you're saying.

Oops. "rare" was the missing word.
Anthony Wright
2011-Sep-07 12:57 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 01/09/2011 16:12, David Vrabel wrote:
> On 01/09/11 15:23, Konrad Rzeszutek Wilk wrote:
>> On Thu, Sep 01, 2011 at 08:42:52AM +0100, Ian Campbell wrote:
>>> [...]
>>> It's probably fair to say that the Xen specific caller should take care
>>> of that Xen-specific requirement rather than pushing it into common
>>> code. On the other hand Xen is the only user and creating a Xen specific
>>> helper/wrapper seems a bit pointless.
>>
>> Perhaps then doing the vmalloc_sync_all() (or a more precise one:
>> vmalloc_sync_one) should be employed in the netback code then?
>>
>> And obviously guarded by the CONFIG_HIGHMEM case?
>
> Perhaps. But I think the correct thing to do initially is revert the
> change and then look at possible improvements. Particularly as the fix
> needs to be backported to stable.
>
> David

I have implemented a patch which does essentially this, i.e. calls
vmalloc_sync_all() after every alloc_vm_area() call (all 5 of them).

Now my VMs start correctly, but I still get error messages in the xen
dmesg output (attached). Is this expected?

Anthony
Konrad Rzeszutek Wilk
2011-Sep-07 18:35 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
> (XEN) mm.c:907:d0 Error getting mfn 3a09c (pfn 55555555) from L1 entry 000000003a09c023 for l1e_owner=0, pg_owner=0
> (XEN) mm.c:907:d0 Error getting mfn 3a09d (pfn 55555555) from L1 entry 000000003a09d023 for l1e_owner=0, pg_owner=0
> (XEN) mm.c:907:d0 Error getting mfn 3a09e (pfn 55555555) from L1 entry 000000003a09e023 for l1e_owner=0, pg_owner=0
> (XEN) mm.c:907:d0 Error getting mfn 3a09f (pfn 55555555) from L1 entry 000000003a09f023 for l1e_owner=0, pg_owner=0
> (XEN) traps.c:2388:d0 Domain attempted WRMSR c0010004 from 0x0000ab23d6d622da to 0x000000000000abcd.

Do they show up during bootup? As in, do you see those _when_ you launch
your guests?

To figure out this particular issue you should try using 'console_to_ring'
(so that dom0 output and Xen output are mingled together), and also post
this under a new subject so as not to confuse this email thread.
Anthony Wright
2011-Sep-23 12:35 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 03/09/2011 11:27, Ian Campbell wrote:
> On Fri, 2011-09-02 at 21:26 +0100, Jeremy Fitzhardinge wrote:
>> [...]
>>> The vmalloc_sync_all is a pretty event even on Xen though, isn't it?
>> Looks like an important word is missing there. But it's very expensive,
>> if that's what you're saying.
> Oops. "rare" was the missing word.

Is there any progress on an official patch for this? I have my own
unofficial patch which places a vmalloc_sync_all() after every
alloc_vm_area() call and it works, but from the thread it sounds like
there should be a more sophisticated solution to the problem.
David Vrabel
2011-Sep-23 12:49 UTC
Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
On 23/09/11 13:35, Anthony Wright wrote:
> Is there any progress on an official patch for this [unsync'd vmalloc
> address space bug]? I have my own unofficial patch which places a
> vmalloc_sync_all() after every alloc_vm_area() call and it works, but
> from the thread it sounds like there should be a more sophisticated
> solution to the problem.

The simple patch (re-adding the vmalloc_sync_all()) has been applied to
3.1-rc7 and should be in the next 3.0-stable release.

I'm still working on a more elegant fix.

David
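For reference, the shape of that simple fix is the sync call restored inside alloc_vm_area() itself, rather than at the call sites. The function below is a reconstruction for illustration of roughly how mm/vmalloc.c looked in that era with the call re-added; it is not quoted from the actual 3.1-rc7 commit and details may differ.

#include <linux/mm.h>
#include <linux/vmalloc.h>

/* Callback for apply_to_page_range(): the PTE allocation it forces is the
 * only side effect we want, so the callback itself does nothing. */
static int f(pte_t *pte, pgtable_t table, unsigned long addr, void *data)
{
        return 0;
}

struct vm_struct *alloc_vm_area(size_t size)
{
        struct vm_struct *area;

        area = get_vm_area_caller(size, VM_IOREMAP,
                                  __builtin_return_address(0));
        if (area == NULL)
                return NULL;

        /* Construct the page-table entries for this region in init_mm. */
        if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
                                size, f, NULL)) {
                free_vm_area(area);
                return NULL;
        }

        /*
         * The re-added call: make sure the new PTEs are present in all
         * pagetables, not just init_mm, since Xen will not fault them in
         * when a hypercall references this range.
         */
        vmalloc_sync_all();

        return area;
}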