Kieran Mansley
2007-Jun-07 15:06 UTC
[Xen-devel] blkif_map error starting fourth guest domain
I'm having problems starting more than three domains. It sometimes works fine, but more often than not the 4th domain's root block device times out and so the domU kernel panics as there's no /dev/root:

XENBUS: Timeout connecting to device: device/vbd/2057 (state 6)
XENBUS: Timeout connecting to device: device/vif/0 (state 6)
XENBUS: Timeout connecting to device: device/vif/1 (state 6)
XENBUS: Device with no driver: device/console/0
Freeing unused kernel memory: 148k freed
Red Hat nash version 5.0.32 starting
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Setting up hotplug.
Creating block device nodes.
Creating root device.
Mounting root filesystem.
mount: could not find filesystem '/dev/root'
Setting up other filesystems.
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init!

My investigations suggest that this is due to the following in dom0's kernel log:

Jun 7 15:37:18 dell2950g kernel: vbd vbd-4-2057: 1 mapping ring-ref 8 port 6

This is printed by connect_ring() in drivers/xen/blkback/xenbus.c, as a result of the call to blkif_map() failing. The reason for blkif_map() failing seems to be the map_frontend_page() call failing.

xm dmesg prints out:
(XEN) mm.c:2610:d0 Could not find L1 PTE for address e1204000
which seems likely to be connected to map_frontend_page() failing.

I suspect it has something to do with going above the 2G memory boundary. Each domain is configured to have 512M, and so together with dom0's 512M that puts the domain that fails just into this region. If I decrease the amount of memory given to each domain so that they'd all fit into 2G, it doesn't fail in this way (yet). I've never seen the first three domains fail in this way (yet).

The source I'm running is based on xen-unstable.hg as of a couple of weeks ago.

Advice for how to resolve this gratefully received. I'm happy to help debug it further.

Many thanks

Kieran

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
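The 2G arithmetic in the message above can be checked with a short C sketch. It assumes, purely for illustration, that dom0 and the guests are laid out back to back in start order; Xen does not promise contiguous placement and the hypervisor uses some memory itself, so real addresses may differ. The function names and the two constants are hypothetical and appear in no Xen source.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define MIB_PER_DOMAIN 512u   /* from the post: dom0 and each guest get 512M */
#define BOUNDARY_MIB   2048u  /* the 2G boundary the post refers to */

/* Start offset in MiB of domain `index`, where 0 is dom0 and 1 is the
 * first guest. Back-to-back placement is an assumption of this sketch. */
unsigned domain_start_mib(unsigned index)
{
    /* Guard against overflow when computing the end of the range. */
    assert(index < UINT_MAX / MIB_PER_DOMAIN);
    return index * MIB_PER_DOMAIN;
}

/* End offset in MiB (exclusive) of domain `index`. */
unsigned domain_end_mib(unsigned index)
{
    return domain_start_mib(index) + MIB_PER_DOMAIN;
}

/* Nonzero if any part of the domain's range lies above the boundary. */
int domain_exceeds_boundary(unsigned index)
{
    return domain_end_mib(index) > BOUNDARY_MIB;
}

/* Print one line per domain showing its assumed range. */
void print_layout(unsigned ndomains)
{
    for (unsigned i = 0; i < ndomains; i++) {
        printf("%s%u: %u-%u MiB%s\n",
               i == 0 ? "dom" : "guest ",
               i,
               domain_start_mib(i), domain_end_mib(i),
               domain_exceeds_boundary(i) ? " (above 2G)" : "");
    }
}
```

Calling print_layout(5) from a small main() lists dom0 and four guests. Under the stated assumption, dom0 plus three guests fill 2048 MiB exactly, so the fourth guest is the first whose range lies above the boundary, which matches the observation that only the fourth domain fails.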
Keir Fraser
2007-Jun-07 15:12 UTC
Re: [Xen-devel] blkif_map error starting fourth guest domain
On 7/6/07 16:06, "Kieran Mansley" <kmansley@solarflare.com> wrote:

> The reason for blkif_map() failing seems to be the map_frontend_page()
> call failing.
>
> xm dmesg prints out:
> (XEN) mm.c:2610:d0 Could not find L1 PTE for address e1204000
> which seems likely to be connected to map_frontend_page() failing.

Yes, this is the problem. You'll need to add some more tracing, find out exactly which grant_map operation issued by dom0 is failing, and find out why Xen thinks there is no pte mapping the specified virtual address.

-- Keir
Kieran Mansley
2007-Jun-08 10:19 UTC
Re: [Xen-devel] blkif_map error starting fourth guest domain
On Thu, 2007-06-07 at 16:12 +0100, Keir Fraser wrote:
> On 7/6/07 16:06, "Kieran Mansley" <kmansley@solarflare.com> wrote:
>
> > The reason for blkif_map() failing seems to be the map_frontend_page()
> > call failing.
> >
> > xm dmesg prints out:
> > (XEN) mm.c:2610:d0 Could not find L1 PTE for address e1204000
> > which seems likely to be connected to map_frontend_page() failing.
>
> Yes, this is the problem. You'll need to add some more tracing, find out
> exactly which grant_map operation issued by dom0 is failing, and find out
> why Xen thinks there is no pte mapping the specified virtual address.

The grant map that's failing is (linux-2.6-xen-sparse/drivers/xen/blkback/interface.c:62):

gnttab_set_map_op(&op, (unsigned long)blkif->blk_ring_area->addr,
                  GNTMAP_host_map, shared_page, blkif->domid);

if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
        BUG();

I'll get some more tracing and info about why this page supplied by the frontend doesn't have a PTE when I get some spare time.

Kieran
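As a rough illustration of the extra tracing discussed above, the following userspace C sketch formats one trace line for a grant-map result. It is a mock, not kernel code: struct map_op_trace, gntst_name() and format_map_trace() are hypothetical names, the struct only mirrors the host_addr, ref, dom and status fields of Xen's struct gnttab_map_grant_ref, and the GNTST_* values are a subset recalled from Xen's public grant_table.h that should be checked against the tree in use. In the kernel, the equivalent would be a printk of op.status after the hypercall returns.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the fields of a grant-map operation that are
 * useful in a trace. This mock is hypothetical, not the kernel type. */
struct map_op_trace {
    uint64_t host_addr;  /* virtual address the grant is mapped at */
    uint32_t ref;        /* grant reference supplied by the frontend */
    uint16_t dom;        /* frontend domain id */
    int16_t  status;     /* GNTST_* result written by the hypervisor */
};

/* Map a status code to its name. Subset only; values are recalled from
 * Xen's public header and should be verified before relying on them. */
const char *gntst_name(int16_t status)
{
    switch (status) {
    case 0:  return "GNTST_okay";
    case -1: return "GNTST_general_error";
    case -2: return "GNTST_bad_domain";
    case -3: return "GNTST_bad_gntref";
    case -4: return "GNTST_bad_handle";
    case -5: return "GNTST_bad_virt_addr";
    default: return "unknown";
    }
}

/* Write one trace line into buf and return snprintf's result. */
int format_map_trace(char *buf, size_t len, const struct map_op_trace *op)
{
    assert(buf != NULL && op != NULL);
    return snprintf(buf, len,
                    "map_grant_ref: dom=%u ref=%u addr=%#llx status=%d (%s)",
                    (unsigned)op->dom, (unsigned)op->ref,
                    (unsigned long long)op->host_addr,
                    (int)op->status, gntst_name(op->status));
}
```

Filling the struct with dom 4, ref 8 and address 0xe1204000 from the logs above gives a line that ties the dom0 error to the hypervisor message. The leading "1" in the dom0 log line may be the negated error code, which would correspond to GNTST_general_error, but that reading is an assumption and is not confirmed anywhere in this thread.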
Keir Fraser
2007-Jun-08 10:23 UTC
Re: [Xen-devel] blkif_map error starting fourth guest domain
On 8/6/07 11:19, "Kieran Mansley" <kmansley@solarflare.com> wrote:

> The grant map that's failing is (linux-2.6-xen-sparse/drivers/xen/blkback/interface.c:62):
>
> gnttab_set_map_op(&op, (unsigned long)blkif->blk_ring_area->addr,
>                   GNTMAP_host_map, shared_page, blkif->domid);
>
> if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
>         BUG();
>
> I'll get some more tracing and info about why this page supplied by the
> frontend doesn't have a PTE when I get some spare time.

Thanks. The call to alloc_vm_area() should be ensuring that the pte does exist (as opposed to possibly needing to be allocated by the caller).

-- Keir
Kieran Mansley
2007-Aug-01 15:22 UTC
Re: [Xen-devel] blkif_map error starting fourth guest domain
On Fri, 2007-06-08 at 11:19 +0100, Kieran Mansley wrote:
> On Thu, 2007-06-07 at 16:12 +0100, Keir Fraser wrote:
> > On 7/6/07 16:06, "Kieran Mansley" <kmansley@solarflare.com> wrote:
> >
> > > The reason for blkif_map() failing seems to be the map_frontend_page()
> > > call failing.
> > >
> > > xm dmesg prints out:
> > > (XEN) mm.c:2610:d0 Could not find L1 PTE for address e1204000
> > > which seems likely to be connected to map_frontend_page() failing.
> >
> > Yes, this is the problem. You'll need to add some more tracing, find out
> > exactly which grant_map operation issued by dom0 is failing, and find out
> > why Xen thinks there is no pte mapping the specified virtual address.
>
> The grant map that's failing is (linux-2.6-xen-sparse/drivers/xen/blkback/interface.c:62):
>
> gnttab_set_map_op(&op, (unsigned long)blkif->blk_ring_area->addr,
>                   GNTMAP_host_map, shared_page, blkif->domid);
>
> if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
>         BUG();
>
> I'll get some more tracing and info about why this page supplied by the
> frontend doesn't have a PTE when I get some spare time.
>
> Kieran

By way of a follow-up, I'm no longer able to reproduce this. It now works fine, and has done for some time. I suspect it may have been due to the bug/fix alluded to in this:

http://lists.xensource.com/archives/html/xen-devel/2007-07/msg00402.html

It doesn't fit with that perfectly, so it may be the problem is different (and still there) but if it reappears I'll let you know.

Kieran