Hello,

I'm currently trying to understand some problems I had in the past with mixing loopback with blktap(2) for HV and PV domains. I'm stuck reading the source code, so I'd like to get some help from the list.

Interrupt me if I got something fundamentally wrong in my understanding so far:

1. With pure-HV the domU gets an emulated IDE (or whatever) disk. The emulation is done by qemu-dm, which opens whatever is given to it: directly a file, a block device in dom0, etc.

2. With pure-PV the domU does not get an emulated IDE disk, but must use blkfront to access the disk. For this to work blkback needs a block device, which is either taken directly from a dom0 block device or provided via loopback or blktap.

3. Because of PVonHVM, Xen provides the domU with both an IDE view and an XVD view of every block device. The domU boots using the IDE view and, when the PV drivers are loaded, disables the emulated view and switches over to the XVD view.

Now some questions:

4. I read somewhere that "mixing loopback with tapdisk is a bad idea", but I can't find that again, so I'm wondering if that (wrong) claim somehow got stuck in my memory. I can only imagine two scenarios:
a) one domU accessing two disk images, one with loopback, the other one with blktap. That looks okay.
b) two domUs accessing the same disk image, one with loopback, the other one with blktap. That looks broken, but I would not expect that to work anyway (shared disk semantics aside).

5. While experimenting with a Linux domU I noticed that the boot process sometimes gets stuck when I declare one disk as "hda" and a second one as "xvda". The 2.6.32 kernel detects a clash in /sys/block naming and I'm stuck in the "XENBUS: Waiting for devices to initialise: 295s..." countdown.
Is this because hda and xvda overlap because of the PVonHVM case (ide-block-major has 64 minors per device for partitions, while scsi-block-major and xen-block-major only have 16 minors per device, that is hda=xvd[abcd], hdb=xvd[efgh], ...)?

6. How should I make an .iso image file accessible to a domU?
If I use tap:/var/lib/libvirt/images/some.iso, tapdisk2 claims the image and passes phy:/dev/xen/blktapX to qemu-dm, which I can access fine, but eject does not work, since qemu only sees the phy: device and can't open another file.
xen-blockfront in PVonHVM and the Windows GPLPV driver both reject CDROM devices, so the CDROM remains IDE emulated. With GPLPV /local/domain/0/backend/vbd/18/768/state changes from "1" → "4", while with Linux it changes to "6" as soon as the Xen modules probe the XenBus. After that Xend refused to change the medium with the following error:
> error: POST operation failed: xend_post: error from xen daemon:(xend.err 'Device 5632 not connected')
Today I did another test, where it worked fine, so my problem might not even be related to the state change.

Since Xend needs to allocate a loopback device for each ISO, which is then never used by the PVonHVM drivers, can I tell Xend not to allocate a loopback device and only let qemu-dm open the file?

So this looks like I should use file:ioemu:/var/lib/libvirt/images/some.iso instead for HV domUs, because QEMU would be able to open the file itself (and change it). Any loopback or blktap would be pointless, because the PVonHVM drivers refuse to work for CDROMs anyway.

But for PV domUs there's no qemu-dm doing IDE emulation, so using blktap or loopback there is a must.

Correct?

Thanks for your time and feedback in advance.

Sincerely
Philipp
-- 
Philipp Hahn Open Source Software Engineer hahn@univention.de
Univention GmbH be open. fon: +49 421 22 232- 0
Mary-Somerville-Str.1 D-28359 Bremen fax: +49 421 22 232-99
http://www.univention.de/

_______________________________________________
Xen-users mailing list
Xen-users@lists.xen.org
http://lists.xen.org/xen-users
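
The "state" values mentioned in the message above ("1", "4", "6") are XenbusState numbers. A minimal Python sketch for translating them, assuming the numbering from Xen's public io/xenbus.h header; the helper name describe_state is made up for illustration and is not part of any Xen tool:

```python
from enum import IntEnum


class XenbusState(IntEnum):
    """State numbers as stored in a xenstore 'state' node (assumed from xenbus.h)."""
    UNKNOWN = 0
    INITIALISING = 1
    INIT_WAIT = 2
    INITIALISED = 3
    CONNECTED = 4
    CLOSING = 5
    CLOSED = 6
    RECONFIGURING = 7
    RECONFIGURED = 8


def describe_state(raw: str) -> str:
    """Translate the string read from a 'state' node into a readable name."""
    try:
        return XenbusState(int(raw)).name
    except ValueError:
        # Covers both non-numeric strings and numbers outside the enum.
        return f"unrecognised ({raw!r})"


if __name__ == "__main__":
    for value in ("1", "4", "6"):
        print(value, describe_state(value))
```

Read this way, the GPLPV case goes from Initialising to Connected, while the Linux case ends up Closed.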
On Sat, 2012-10-27 at 09:54 +0100, Philipp Hahn wrote:
> 1. With pure-HV the domU gets an emulated IDE (or whatever) disk. The
> emulation is done by qemu-dm, which opens whatever is given to it: directly a
> file, a block-device in dom0, etc.

Correct.

In this case I think the PV device is technically created but never
opened since the frontend never connects.

> 2. With pure-PV the domU does not get an emulated IDE, but must use blkfront
> to access the disk. For this to work blkback needs a block device, which is
> either directly taken from a dom0 block device or provided via loopback or
> blktap.

Specifically blktap2, yes. blktap1 worked differently (as will blktap3!)

> 3. Because of PVonHVM Xen provides the domU with both an IDE view and an XVD
> view of every block device. The domU boots using the IDE view and when the PV
> drivers are loaded, disables the emulated view and switches over to the XVD
> view.

Correct.

> Now some questions:
>
> 4. I read somewhere that "mixing loopback with tapdisk is a bad idea", but I
> can't find that again, so I'm wondering if that (wrong) claim got somehow
> stuck in my memory. I can only imagine two scenarios
> a) one domU accessing two disk images, one with loopback, the other one with
> blktap. That looks okay.
> b) two domUs accessing the same disk image, one with loopback, the other one
> with blktap. That looks broken but neither would I expect that to work
> (shared disk semantics aside)

Loopback is dangerous because it reports success before the data has
really hit the underlying device. This is a problem because e.g. your
filesystem journalling relies on these guarantees. This affects anything
using it, including blkback (this is one of the reasons xl doesn't set up
this configuration automatically).

There have been patches to fix this (by implementing O_DIRECT in the
loop driver IIRC) ages ago which looked like getting resurrected
recently (by upstream for non-Xen related reasons which I can't recall),
but I don't know what the status is there.

One reason I can think of for your scenario b to be dangerous is if qemu
doesn't use O_DIRECT while blktap2 does. So qemu gets caching while
blktap2 doesn't and all hell breaks loose. Probably ok for r/o devices
though, and r/w shared disks need care for plenty of other reasons!

> 5. While experimenting with a Linux domU I noticed that the boot process
> sometimes gets stuck when I declare one disk as "hda" and a second one
> as "xvda". The 2.6.32 kernel detects a clash in /sys/block naming and I'm
> stuck in the "XENBUS: Waiting for devices to initialise: 295s..." count down.
> Is this because hda and xvda overlap because of the PVonHVM case
> (ide-block-major has 64 minors per device for partitions, while
> scsi-block-major and xen-block-major only have 16 minors per device

When you ask for hda you actually get hda+xvda in order to allow for the
switchover described above. So you've actually asked for 2 xvda's --
don't do that ;-)

> , that is
> hda=xvd[abcd], hdb=xvd[efgh], ...)?

Almost. hda=xvda, hda[1234...]=xvda[1234...], hdb=xvdb ... hdd=xvdd.

There is no hde so xvde is a good starting point for pure PV disks if
you are mixing ide and pure PV.

> 6. How should I make an .iso image file accessible to a domU?
> If I use tap:/var/lib/libvirt/images/some.iso tapdisk2 claims the image and
> passes phy:/dev/xen/blktapX to qemu-dm, which I can access fine, but eject
> does not work, since qemu only sees the phy: device and can't open another
> file.
> xen-blockfront in PVonHVM and Windows-GPLPV driver both reject CDROM-devices,
> so the CDROM remains IDE emulated.

That's right -- PV drivers aren't used for HVM CD-ROM devices so that
media change etc can be supported.
I think in this case you want to use
file:// and let qemu open the device directly. There will be no PV path in
this case, so no need for tap etc.

> So this looks like I should use file:ioemu:/var/lib/libvirt/images/some.iso
> instead for HV domUs, because QEMU would be able to open the file itself (and
> change it). Any loopback or blktap would be pointless, because the PVonHVM
> drivers refuse to work for CDROMs any way.

Correct.

> But for PV-domUs there's no qemu-dm doing IDE emulation, so using blktap or
> loopback there is a must.

Correct. You don't get any media change etc capabilities here. Anything
which looks like you do for a PV guest is actually doing hotplug of the
vbd device.

Loopback isn't so dangerous for cdroms since they are readonly.

> Correct?

I think you've mostly got it right, yes.

You are using xend whereas my most up to date knowledge is libxl. There
are a few subtle differences regarding which backend is selected for
various disk configurations, but I think they are broadly speaking the
same.

Ian.
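
The naming rule from the reply above (asking for hda also yields xvda, and so on up to hdd/xvdd) can be turned into a quick check of a planned disk list. This is only a sketch of that rule as stated in the thread: guest_pv_name and find_clashes are hypothetical helpers, sd* names are passed through unchanged, and older kernels that rename devices may behave differently:

```python
import re
from collections import defaultdict


def guest_pv_name(vdev: str) -> str:
    """Return the xvd name a PVonHVM guest would use for a configured device.

    hda..hdd (with optional partition number) map to xvda..xvdd; any other
    name is returned unchanged.
    """
    match = re.fullmatch(r"hd([a-d])(\d*)", vdev)
    if match:
        return f"xvd{match.group(1)}{match.group(2)}"
    return vdev


def find_clashes(vdevs):
    """Group configured device names by PV name and keep only the collisions."""
    seen = defaultdict(list)
    for vdev in vdevs:
        seen[guest_pv_name(vdev)].append(vdev)
    return {pv: names for pv, names in seen.items() if len(names) > 1}


if __name__ == "__main__":
    print(find_clashes(["hda", "xvda"]))         # the combination from question 5
    print(find_clashes(["hda", "hdc", "xvde"]))  # pure PV disks starting at xvde
```

Starting the pure PV disks at xvde, as suggested, keeps the second list free of collisions.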
I wanted to open a new thread with respect to Blktap2 and Blktap, but I'll rather post here.

I have been using a loop device for all my VMs because I can't seem to find Blktap2 on any of the newer kernels. Has Blktap2 been dropped, even though it's better than Blktap?

Also, for some odd reason I have never been able to bring up my VMs with Blktap (tap:tapdisk:aio), since all my VMs get stuck at 'XENBUS: Waiting for devices to initialise: 295s...'. Am I doing anything wrong? I have seen this happen on 2-3 different VM servers with different Dom0 kernels, which is quite odd.

Thanks!

> From: Ian.Campbell@citrix.com
> To: hahn@univention.de
> Date: Fri, 9 Nov 2012 15:49:32 +0000
> CC: Xen-users@lists.xen.org
> Subject: Re: [Xen-users] RFH: loopback & blktap(2) and CDROM
Hello Riyan,

> I wanted to open a new thread with respect to Blktap2 and Blktap, but I'll
> rather post here.

Then you should at least trim the quoted part of the mail you're hijacking.

On Saturday 10 November 2012 19:40:12 Riyan S wrote:
> I have been using a loop device for all my VMs because I
> can't seem to find Blktap2 on any of the newer kernels. Has Blktap2 been
> dropped, even though it's better than Blktap?

blktap2 was (AFAIK) never included in the mainline Linux kernel. For Debian you can install blktap-dkms, which compiles the module for your current kernel.

> Also, for some odd reason I have
> never been able to bring up my VMs with Blktap (tap:tapdisk:aio) since all
> my VMs get stuck at 'XENBUS: Waiting for devices to initialise: 295s...'

If there's a Linux kernel traceback before that, you have the same problem as me, with multiple devices overlapping each other. Otherwise take a look at the debug output of Xend and all the other Xen tools (in Debian: /var/log/xen/*).

At least provide your exact configuration, so people spending their time to help you have some useful facts.

Sincerely
Philipp
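
When gathering the facts asked for above, the vbd backend entries in xenstore are often the most telling part. The following Python sketch condenses them per device. It assumes input lines of the form `path = "value"` with full paths; the function name summarise_backends and the sample data are invented for illustration and do not come from a real system:

```python
import re
from collections import defaultdict

# One xenstore entry per line: /some/path = "value"
LINE = re.compile(r'^(\S+)\s*=\s*"(.*)"$')
# Backend vbd nodes: /local/domain/<backend>/backend/vbd/<guest>/<devid>/<key>
BACKEND = re.compile(r"/local/domain/\d+/backend/vbd/(\d+)/(\d+)/([^/]+)")

# Made-up sample input, only to show the expected shape.
SAMPLE = '''
/local/domain/0/backend/vbd/7/768/dev = "hda"
/local/domain/0/backend/vbd/7/768/type = "phy"
/local/domain/0/backend/vbd/7/768/state = "4"
/local/domain/0/backend/vbd/7/5632/dev = "hdc"
/local/domain/0/backend/vbd/7/5632/state = "6"
/local/domain/7/device/vbd/768/state = "4"
'''


def summarise_backends(text):
    """Collect backend vbd keys, grouped by (guest domain id, device id)."""
    devices = defaultdict(dict)
    for line in text.splitlines():
        entry = LINE.match(line.strip())
        if not entry:
            continue  # blank lines, "..." separators, anything unparsable
        node = BACKEND.fullmatch(entry.group(1))
        if not node:
            continue  # frontend nodes and other subtrees are ignored here
        guest, devid, key = node.groups()
        devices[(int(guest), int(devid))][key] = entry.group(2)
    return dict(devices)


if __name__ == "__main__":
    for (guest, devid), keys in sorted(summarise_backends(SAMPLE).items()):
        print(guest, devid, keys.get("dev"), keys.get("type"), keys.get("state"))
```

A device whose backend state is not "4" while the guest is waiting on XenBus is a reasonable place to start looking.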
Hello Ian,

thank you for your excellent answer.

On Friday 09 November 2012 16:49:32 Ian Campbell wrote:
> > 4. I read somewhere that "mixing loopback with tapdisk is a bad idea",
> > but I can't find that again, so I'm wondering if that (wrong) claim got
> > somehow stuck in my memory. I can only imagine two scenarios
> > a) one domU accessing two disk images, one with loopback, the other one
> > with blktap. That looks okay.
> > b) two domUs accessing the same disk image, one with loopback, the other
> > one with blktap. That looks broken but neither would I expect that to
> > work (shared disk semantics aside)
>
> loopback is dangerous because it reports success before the data has
> really hit the underlying device.
...
> Probably ok for r/o devices
> though and r/w shared disks need care for plenty of other reasons!

Okay, my current problem is getting the CDROM case right.

> > 5. While experimenting with a Linux domU I noticed that the boot process
> > sometimes gets stuck when I declare one disk as "hda" and a second one
> > as "xvda". The 2.6.32 kernel detects a clash in /sys/block naming and I'm
> > stuck in the "XENBUS: Waiting for devices to initialise: 295s..." count
> > down. Is this because hda and xvda overlap because of the PVonHVM case
> > (ide-block-major has 64 minors per device for partitions, while
> > scsi-block-major and xen-block-major only have 16 minors per device
>
> When you ask for hda you actually get hda+xvda in order to allow for the
> switchover described above. So you've actually asked for 2 xvda's --
> don't do that ;-)
>
> > , that is
> > hda=xvd[abcd], hdb=xvd[efgh], ...)?
>
> Almost. hda=xvda, hda[1234...]=xvda[1234...], hdb=xvdb ... hdd=xvdd.
>
> There is no hde so xvde is a good starting point for pure PV disks if
> you are mixing ide and pure PV.

With Linux 3.2.30 as domU I get the same as you, but with 2.6.32 I get a different result: the domU is configured with hda=hdb=disk, hdc=cdrom, but inside the domU I get /dev/xvda and /dev/xvde for the disks, and /dev/scd0 for the cdrom. With "xen_emul_unplug=never" I get hda* and hdb*.

For testing I created 60 partitions (3 primary + 57 extended) on hdb, but as most SCSI, SATA and XEN majors only support 16 minors per device, I only see the first 15 partitions on /dev/xvde{,1..15}. With "..unplug=never" I see them all, but 16..60 are provided by block-major 259 (blkext).

There also seems to be a problem when I swap hdb and hdc: if hdb=cdrom is sandwiched between hda=hdc=disk, the Linux 2.6.32 kernel waits in the "XENBUS: Waiting for devices to initialise: 295s..." countdown after an Oops, because /dev/block/202:0 couldn't be created twice. It's fine with 3.2.30.

# dmesg | grep -i xen
XENBUS: Device with no driver: device/vbd/768
XENBUS: Device with no driver: device/vbd/75632
XENBUS: Device with no driver: device/vbd/832
...
blkfront: xvda: barriers enabled
xvda:
vbd vbd-832: 19 xenbus_dev_probe on device/vbd/832
blkfront: xvda: barriers enabled
WARNING: at ... fs/sysfs/dir.c:491 sysfs_add_one+0xcc/0xe4()
sysfs: cannot create duplicate filename '/dev/block/202:0'
Pid 12, comm: xenwatch Not Tainted 2.6.32-ucs57amd64 #1
...
? sysfs_add_one+0xcc/0xe4
...
? backend_changed+0x44e/0x468 [xen_blkfront]
? xenwatch_thread+0x117/0x14a
...
# xenstore-ls | egrep 'tap2|vbd'
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/frontend = "/local/domain/5/device/vbd/768"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/frontend-id = "5"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/backend-id = "0"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/backend = "/local/domain/0/backend/vbd/5/768"
...
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/frontend = "/local/domain/5/device/vbd/5632"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/frontend-id = "5"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/backend-id = "0"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/backend = "/local/domain/0/backend/vbd/5/5632"
...
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/frontend = "/local/domain/5/device/vbd/832"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/frontend-id = "5"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/backend-id = "0"
/vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/backend = "/local/domain/0/backend/vbd/5/832"
...
/local/domain/0/backend/vbd/5/768/domain = "ucs-hv-hd"
/local/domain/0/backend/vbd/5/768/frontend = "/local/domain/5/device/vbd/768"
/local/domain/0/backend/vbd/5/768/uuid = "22c313da-e8bd-5f63-6246-69adbf18408c"
/local/domain/0/backend/vbd/5/768/bootable = "1"
/local/domain/0/backend/vbd/5/768/dev = "hda"
/local/domain/0/backend/vbd/5/768/state = "4"
/local/domain/0/backend/vbd/5/768/params = "/dev/xen/blktap-2/tapdev0"
/local/domain/0/backend/vbd/5/768/mode = "w"
/local/domain/0/backend/vbd/5/768/online = "1"
/local/domain/0/backend/vbd/5/768/frontend-id = "5"
/local/domain/0/backend/vbd/5/768/type = "phy"
/local/domain/0/backend/vbd/5/768/tapdisk-params = "aio:/var/lib/libvirt/images/ucs-hv-hd.raw"
/local/domain/0/backend/vbd/5/768/physical-device = "fd:0"
/local/domain/0/backend/vbd/5/768/hotplug-status = "connected"
/local/domain/0/backend/vbd/5/768/feature-barrier = "1"
/local/domain/0/backend/vbd/5/768/sectors = "41943040"
/local/domain/0/backend/vbd/5/768/info = "0"
/local/domain/0/backend/vbd/5/768/sector-size = "512"
...
/local/domain/0/backend/vbd/5/5632/domain = "ucs-hv-hd"
/local/domain/0/backend/vbd/5/5632/frontend = "/local/domain/5/device/vbd/5632"
/local/domain/0/backend/vbd/5/5632/uuid = "62456f87-d3c0-1cec-a15f-dfe2cc5ea22d"
/local/domain/0/backend/vbd/5/5632/bootable = "0"
/local/domain/0/backend/vbd/5/5632/dev = "hdc"
/local/domain/0/backend/vbd/5/5632/state = "4"
/local/domain/0/backend/vbd/5/5632/params = "/dev/xen/blktap-2/tapdev1"
/local/domain/0/backend/vbd/5/5632/mode = "w"
/local/domain/0/backend/vbd/5/5632/online = "1"
/local/domain/0/backend/vbd/5/5632/frontend-id = "5"
/local/domain/0/backend/vbd/5/5632/type = "phy"
/local/domain/0/backend/vbd/5/5632/tapdisk-params = "aio:/var/lib/libvirt/images/ucs-hv-hd2.raw"
/local/domain/0/backend/vbd/5/5632/physical-device = "fd:1"
/local/domain/0/backend/vbd/5/5632/hotplug-status = "connected"
/local/domain/0/backend/vbd/5/5632/feature-barrier = "1"
/local/domain/0/backend/vbd/5/5632/sectors = "41943040"
/local/domain/0/backend/vbd/5/5632/info = "0"
/local/domain/0/backend/vbd/5/5632/sector-size = "512"
...
/local/domain/0/backend/vbd/5/832/domain = "ucs-hv-hd"
/local/domain/0/backend/vbd/5/832/frontend = "/local/domain/5/device/vbd/832"
/local/domain/0/backend/vbd/5/832/uuid = "eccf7fb5-7182-fb1b-62af-bd28f7620573"
/local/domain/0/backend/vbd/5/832/bootable = "0"
/local/domain/0/backend/vbd/5/832/dev = "hdb"
/local/domain/0/backend/vbd/5/832/state = "6"
/local/domain/0/backend/vbd/5/832/params = "/var/lib/libvirt/images/ucs_2.4-0-sec7-20120131133624-dvd-amd64.iso"
/local/domain/0/backend/vbd/5/832/mode = "r"
/local/domain/0/backend/vbd/5/832/online = "1"
/local/domain/0/backend/vbd/5/832/frontend-id = "5"
/local/domain/0/backend/vbd/5/832/type = "file"
/local/domain/0/backend/vbd/5/832/node = "/dev/loop0"
/local/domain/0/backend/vbd/5/832/physical-device = "7:0"
/local/domain/0/backend/vbd/5/832/hotplug-status = "connected"
...
/local/domain/5/device/vbd/768/backend-id = "0"
/local/domain/5/device/vbd/768/virtual-device = "768"
/local/domain/5/device/vbd/768/device-type = "disk"
/local/domain/5/device/vbd/768/state = "4"
/local/domain/5/device/vbd/768/backend = "/local/domain/0/backend/vbd/5/768"
/local/domain/5/device/vbd/768/ring-ref = "8"
/local/domain/5/device/vbd/768/event-channel = "4"
/local/domain/5/device/vbd/768/protocol = "x86_64-abi"
...
/local/domain/5/device/vbd/5632/backend-id = "0"
/local/domain/5/device/vbd/5632/virtual-device = "5632"
/local/domain/5/device/vbd/5632/device-type = "disk"
/local/domain/5/device/vbd/5632/state = "4"
/local/domain/5/device/vbd/5632/backend = "/local/domain/0/backend/vbd/5/5632"
/local/domain/5/device/vbd/5632/ring-ref = "9"
/local/domain/5/device/vbd/5632/event-channel = "5"
/local/domain/5/device/vbd/5632/protocol = "x86_64-abi"
...
/local/domain/5/device/vbd/832/backend-id = "0"
/local/domain/5/device/vbd/832/virtual-device = "832"
/local/domain/5/device/vbd/832/device-type = "cdrom"
/local/domain/5/device/vbd/832/state = "6"
/local/domain/5/device/vbd/832/backend = "/local/domain/0/backend/vbd/5/832"
...
/local/domain/5/error/device/vbd/832/error = "19 xenbus_dev_probe on device/vbd/832"

> > 6. How should I make an .iso image file accessible to a domU?
> > If I use tap:/var/lib/libvirt/images/some.iso tapdisk2 claims the image
> > and passes phy:/dev/xen/blktapX to qemu-dm, which I can access fine, but
> > eject does not work, since qemu only sees the phy: device and can't open
> > another file.
> > xen-blockfront in PVonHVM and Windows-GPLPV driver both reject
> > CDROM-devices, so the CDROM remains IDE emulated.
>
> That's right -- PV drivers aren't used for HVM CD-ROM devices so that
> media change etc can be supported. I think in this case you want to use
> file:// and let qemu open the device direct. There will be no PV path in
> this case, so no need for tap etc.

Can I tell Xend not to create a loopback device for each such ISO image using file://? I had hoped for some "ioemu:" flag to get only the ioemu emulated device, as it is (or was?) the case with network interfaces, but for disk devices the "ioemu:" seems to be stripped everywhere without effect.

Currently all my domUs use the same ISO image, but each domU gets its own loopback device, so I have to set "max_loop=256" when loading the loop module to have enough loopback devices available. (Not doing the loopback stuff would be the easiest solution, since xen-blockfront doesn't use it anyway.)

Sincerely
Philipp

PS: Mostly tested on Xen-3.4.3 with libvirt-0.8.7 and linux-2.6.32, but mostly the same with Xen-4.1.3 with libvirt-0.9.12 and linux-3.2.30.
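
The numeric device ids in the xenstore paths above (768, 832, 5632) and the 202:0 in the sysfs warning are Linux major/minor pairs. A rough Python sketch for decoding them, assuming the encoding described in Xen's vbd-interface document (major << 8 | minor, with 64 minors per IDE disk and 16 per sd/xvd disk). decode_vbd is a hypothetical helper, and the extended xvd encoding used beyond the 16th disk is deliberately not handled:

```python
# major -> (name prefix, index of first disk, minors per disk, disks per major)
LAYOUTS = {
    3: ("hd", 0, 64, 2),     # hda, hdb
    22: ("hd", 2, 64, 2),    # hdc, hdd
    8: ("sd", 0, 16, 16),    # sda..sdp
    202: ("xvd", 0, 16, 16), # xvda..xvdp
}


def decode_vbd(number: int) -> str:
    """Turn a xenstore vbd device number into a device name such as 'hdb' or 'xvda1'."""
    major, minor = number >> 8, number & 0xFF
    if major not in LAYOUTS:
        raise ValueError(f"unsupported major {major} in vbd number {number}")
    prefix, first_disk, minors_per_disk, disks = LAYOUTS[major]
    disk, partition = divmod(minor, minors_per_disk)
    if disk >= disks:
        raise ValueError(f"minor {minor} out of range for major {major}")
    name = prefix + chr(ord("a") + first_disk + disk)
    return name + (str(partition) if partition else "")


if __name__ == "__main__":
    for number in (768, 832, 5632, 202 << 8):
        print(number, decode_vbd(number))
```

With 16 minors per xvd disk, one minor is the whole disk and 15 remain for partitions, which matches the 15 visible partitions reported above.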
On Wed, 2012-11-14 at 11:13 +0000, Philipp Hahn wrote:
> Hello Ian,
>
> thank you for your excellent answer.
>
> On Friday 09 November 2012 16:49:32 Ian Campbell wrote:
> > > 4. I read somewhere that "mixing loopback with tapdisk is a bad idea",
> > > but I can't find that again, so I'm wondering if that (wrong) claim got
> > > somehow stuck in my memory. I can only imagine two scenarios
> > > a) one domU accessing two disk images, one with loopback, the other one
> > > with blktap. That looks okay.
> > > b) two domUs accessing the same disk image, one with loopback, the other
> > > one with blktap. That looks broken but neither would I expect that to
> > > work (shared disk semantics aside)
> >
> > loopback is dangerous because it reports success before the data has
> > really hit the underlying device.
> ...
> > Probably ok for r/o devices
> > though and r/w shared disks need care for plenty of other reasons!
>
> Okay, my current problem is getting the CDROM case right.
>
> > > 5. While experimenting with a Linux domU I noticed that the boot process
> > > sometimes gets stuck when I declare one disk as "hda" and a second one
> > > as "xvda". The 2.6.32 kernel detects a clash in /sys/block naming and I'm
> > > stuck in the "XENBUS: Waiting for devices to initialise: 295s..." count
> > > down. Is this because hda and xvda overlap because of the PVonHVM case
> > > (ide-block-major has 64 minors per device for partitions, while
> > > scsi-block-major and xen-block-major only have 16 minors per device
> >
> > When you ask for hda you actually get hda+xvda in order to allow for the
> > switchover described above. So you've actually asked for 2 xvda's --
> > don't do that ;-)
> >
> > > , that is
> > > hda=xvd[abcd], hdb=xvd[efgh], ...)?
> >
> > Almost. hda=xvda, hda[1234...]=xvda[1234...], hdb=xvdb ... hdd=xvdd.
> >
> > There is no hde so xvde is a good starting point for pure PV disks if
> > you are mixing ide and pure PV.
>
> With Linux 3.2.30 as domU I get the same as you, but with 2.6.32 I get a
> different:

Ah, that's right. Some early versions of PVHVM disk support tried to rename things to avoid clashes. I suspect that either 2.6.32 or the backport to Debian of the PVHVM stuff might have included that. 196cfe2ae8fcdc03b3c7d627e7dfe8c0ce7229f9 is the upstream commit which removed this behaviour.

> domU is configured with hda=hdb=disk, hdc=cdrom, but inside the
> domU I get /dev/xvda and /dev/xvde for the disk, and /dev/scd0 for the cdrom.

xvda and xvde is odd, I'd have expected either xvda+b or xvde+f. Perhaps Stefano can remember what the behaviour was supposed to be here.

> With "xen_emul_unplug=never" I get hda* and hdb*.

As expected, good.

> For testing I created 60 partitions (3 primary+57 extended) on hdb, but as
> most SCSI, SATA and XEN majors only support 16 minors per device, I only see
> the first 15 partitions on /dev/xvde{,1..15}. With "..unplug=never" I see
> them all, but 16..60 are provided by block-major 259 (blkext).

Hrm. I wonder if blkfront needs to do some magic to enable this blkext thing then?

> There also seems to be a problem when I swap hdb and hdc: If hdb=cdrom is
> sandwiched between hda=hdc=disk, the Linux-2.6.32 kernel waits in
> the "XENBUS: Waiting for devices to initialise: 295s..." count down after an
> OOPs, because /dev/block/202:0 couldn't be double-created.
> Its fine with 3.2.30.
>
> # dmesg | grep -i xen
> XENBUS: Device with no driver: device/vbd/768
> XENBUS: Device with no driver: device/vbd/75632
> XENBUS: Device with no driver: device/vbd/832
> ...
> blkfront: xvda: barriers enabled
> xvda:
> vbd vbd-832: 19 xenbus_dev_probe on device/vbd/832
> blkfront: xvda: barriers enabled
> WARNING: at ... fs/sysfs/dir.c:491 sysfs_add_one+0xcc/0xe4()
> sysfs: cannot create duplicate filename '/dev/block/202:0'
> Pid 12, comm: xenwatch Not Tainted 2.6.32-ucs57amd64 #1
> ...
> ? sysfs_add_one+0xcc/0xe4
> ...
> ? backend_changed+0x44e/0x468 [xen_blkfront]
> ? xenwatch_thread+0x117/0x14a
> ...
> # xenstore-ls | egrep 'tap2|vbd'
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/frontend
> = "/local/domain/5/device/vbd/768"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/frontend-id = "5"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/backend-id = "0"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/768/backend
> = "/local/domain/0/backend/vbd/5/768"
> ...
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/frontend
> = "/local/domain/5/device/vbd/5632"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/frontend-id = "5"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/backend-id = "0"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/tap2/5632/backend
> = "/local/domain/0/backend/vbd/5/5632"
> ...
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/frontend
> = "/local/domain/5/device/vbd/832"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/frontend-id = "5"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/backend-id = "0"
> /vm/77021a09-cbcb-35d6-d480-76ff7c2cf288/device/vbd/832/backend
> = "/local/domain/0/backend/vbd/5/832"
> ...
> /local/domain/0/backend/vbd/5/768/domain = "ucs-hv-hd"
> /local/domain/0/backend/vbd/5/768/frontend = "/local/domain/5/device/vbd/768"
> /local/domain/0/backend/vbd/5/768/uuid
> = "22c313da-e8bd-5f63-6246-69adbf18408c"
> /local/domain/0/backend/vbd/5/768/bootable = "1"
> /local/domain/0/backend/vbd/5/768/dev = "hda"
> /local/domain/0/backend/vbd/5/768/state = "4"
> /local/domain/0/backend/vbd/5/768/params = "/dev/xen/blktap-2/tapdev0"
> /local/domain/0/backend/vbd/5/768/mode = "w"
> /local/domain/0/backend/vbd/5/768/online = "1"
> /local/domain/0/backend/vbd/5/768/frontend-id = "5"
> /local/domain/0/backend/vbd/5/768/type = "phy"
> /local/domain/0/backend/vbd/5/768/tapdisk-params
> = "aio:/var/lib/libvirt/images/ucs-hv-hd.raw"
> /local/domain/0/backend/vbd/5/768/physical-device = "fd:0"
> /local/domain/0/backend/vbd/5/768/hotplug-status = "connected"
> /local/domain/0/backend/vbd/5/768/feature-barrier = "1"
> /local/domain/0/backend/vbd/5/768/sectors = "41943040"
> /local/domain/0/backend/vbd/5/768/info = "0"
> /local/domain/0/backend/vbd/5/768/sector-size = "512"
> ...
> /local/domain/0/backend/vbd/5/5632/domain = "ucs-hv-hd"
> /local/domain/0/backend/vbd/5/5632/frontend = "/local/domain/5/device/vbd/5632"
> /local/domain/0/backend/vbd/5/5632/uuid = "62456f87-d3c0-1cec-a15f-dfe2cc5ea22d"
> /local/domain/0/backend/vbd/5/5632/bootable = "0"
> /local/domain/0/backend/vbd/5/5632/dev = "hdc"
> /local/domain/0/backend/vbd/5/5632/state = "4"
> /local/domain/0/backend/vbd/5/5632/params = "/dev/xen/blktap-2/tapdev1"
> /local/domain/0/backend/vbd/5/5632/mode = "w"
> /local/domain/0/backend/vbd/5/5632/online = "1"
> /local/domain/0/backend/vbd/5/5632/frontend-id = "5"
> /local/domain/0/backend/vbd/5/5632/type = "phy"
> /local/domain/0/backend/vbd/5/5632/tapdisk-params = "aio:/var/lib/libvirt/images/ucs-hv-hd2.raw"
> /local/domain/0/backend/vbd/5/5632/physical-device = "fd:1"
> /local/domain/0/backend/vbd/5/5632/hotplug-status = "connected"
> /local/domain/0/backend/vbd/5/5632/feature-barrier = "1"
> /local/domain/0/backend/vbd/5/5632/sectors = "41943040"
> /local/domain/0/backend/vbd/5/5632/info = "0"
> /local/domain/0/backend/vbd/5/5632/sector-size = "512"
> ...
> /local/domain/0/backend/vbd/5/832/domain = "ucs-hv-hd"
> /local/domain/0/backend/vbd/5/832/frontend = "/local/domain/5/device/vbd/832"
> /local/domain/0/backend/vbd/5/832/uuid = "eccf7fb5-7182-fb1b-62af-bd28f7620573"
> /local/domain/0/backend/vbd/5/832/bootable = "0"
> /local/domain/0/backend/vbd/5/832/dev = "hdb"
> /local/domain/0/backend/vbd/5/832/state = "6"
> /local/domain/0/backend/vbd/5/832/params = "/var/lib/libvirt/images/ucs_2.4-0-sec7-20120131133624-dvd-amd64.iso"
> /local/domain/0/backend/vbd/5/832/mode = "r"
> /local/domain/0/backend/vbd/5/832/online = "1"
> /local/domain/0/backend/vbd/5/832/frontend-id = "5"
> /local/domain/0/backend/vbd/5/832/type = "file"
> /local/domain/0/backend/vbd/5/832/node = "/dev/loop0"
> /local/domain/0/backend/vbd/5/832/physical-device = "7:0"
> /local/domain/0/backend/vbd/5/832/hotplug-status = "connected"
> ...
> /local/domain/5/device/vbd/768/backend-id = "0"
> /local/domain/5/device/vbd/768/virtual-device = "768"
> /local/domain/5/device/vbd/768/device-type = "disk"
> /local/domain/5/device/vbd/768/state = "4"
> /local/domain/5/device/vbd/768/backend = "/local/domain/0/backend/vbd/5/768"
> /local/domain/5/device/vbd/768/ring-ref = "8"
> /local/domain/5/device/vbd/768/event-channel = "4"
> /local/domain/5/device/vbd/768/protocol = "x86_64-abi"
> ...
> /local/domain/5/device/vbd/5632/backend-id = "0"
> /local/domain/5/device/vbd/5632/virtual-device = "5632"
> /local/domain/5/device/vbd/5632/device-type = "disk"
> /local/domain/5/device/vbd/5632/state = "4"
> /local/domain/5/device/vbd/5632/backend = "/local/domain/0/backend/vbd/5/5632"
> /local/domain/5/device/vbd/5632/ring-ref = "9"
> /local/domain/5/device/vbd/5632/event-channel = "5"
> /local/domain/5/device/vbd/5632/protocol = "x86_64-abi"
> ...
> /local/domain/5/device/vbd/832/backend-id = "0"
> /local/domain/5/device/vbd/832/virtual-device = "832"
> /local/domain/5/device/vbd/832/device-type = "cdrom"
> /local/domain/5/device/vbd/832/state = "6"
> /local/domain/5/device/vbd/832/backend = "/local/domain/0/backend/vbd/5/832"
> ...
> /local/domain/5/error/device/vbd/832/error = "19 xenbus_dev_probe on device/vbd/832"
>
>
> > > 6. How should I make an .iso image file accessible to a domU?
> > > If I use tap:/var/lib/libvirt/images/some.iso, tapdisk2 claims the image
> > > and passes phy:/dev/xen/blktapX to qemu-dm, which I can access fine, but
> > > eject does not work, since qemu only sees the phy: device and can't open
> > > another file.
> > > xen-blockfront in PVonHVM and the Windows GPLPV driver both reject
> > > CDROM devices, so the CDROM remains IDE emulated.
> >
> > That's right -- PV drivers aren't used for HVM CD-ROM devices so that
> > media change etc can be supported. I think in this case you want to use
> > file:// and let qemu open the device directly. There will be no PV path in
> > this case, so no need for tap etc.
>
> Can I tell Xend not to create a loopback device for each such ISO image using
> file://? I had hoped for some "ioemu:" flag to get only the ioemu emulated
> device, as it is (or was?) the case with network interfaces, but for disk
> devices the "ioemu:" seems to be stripped everywhere without effect.

I've no idea about this WRT xend I'm afraid.

> Currently all my domUs use the same ISO image, but each domU gets its own
> loopback device, so I have to set "max_loop=256" when loading the loop module
> to have enough loopback devices available. (Not doing the loopback stuff
> would be the easiest solution, since xen-blockfront doesn't use it anyway.)
>
> Sincerely
> Philipp
>
> PS: Mostly tested on Xen-3.4.3 with libvirt-0.8.7 and linux-2.6.32, but mostly
> the same with Xen-4.1.3 with libvirt-0.9.12 and linux-3.2.30.
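
For readers following the xenstore listing quoted above: the vbd numbers 768, 832 and 5632 sit next to dev = "hda", "hdb" and "hdc", which is consistent with the legacy (major << 8 | minor) encoding of the Linux IDE block majors 3 and 22. A minimal Python sketch of that decoding follows. The names decode_vbd, ide_name and IDE_MAJORS are made up for this illustration and are not part of any Xen tool; only the IDE case seen in the listing is handled.

```python
# Sketch only: decode the legacy (major << 8 | minor) vbd numbering that the
# xenstore entries in this thread appear to follow. Helper names are
# hypothetical; this is not code from Xen, xend or libvirt.

IDE_MAJORS = {3: "ab", 22: "cd"}  # Linux block majors of the first two IDE channels
IDE_MINORS_PER_DISK = 64          # each IDE disk owns 64 minors (disk + 63 partitions)


def decode_vbd(number):
    """Split a legacy vbd number into (major, minor)."""
    return number >> 8, number & 0xFF


def ide_name(number):
    """Return a name such as 'hda' or 'hdb1' for an IDE-encoded vbd number."""
    major, minor = decode_vbd(number)
    if major not in IDE_MAJORS:
        raise ValueError(f"major {major} is not an IDE major handled here")
    disk, partition = divmod(minor, IDE_MINORS_PER_DISK)
    letters = IDE_MAJORS[major]
    if disk >= len(letters):
        raise ValueError(f"minor {minor} is out of range for major {major}")
    name = "hd" + letters[disk]
    return name + (str(partition) if partition else "")


if __name__ == "__main__":
    for number in (768, 832, 5632):
        print(number, decode_vbd(number), ide_name(number))
```

Run against the three numbers from the listing, this yields hda, hdb and hdc, matching the dev entries under /local/domain/0/backend/vbd/5/.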
On Wed, 14 Nov 2012, Ian Campbell wrote:
> On Wed, 2012-11-14 at 11:13 +0000, Philipp Hahn wrote:
> > Hello Ian,
> >
> > thank you for your excellent answer.
> >
> > On Friday 09 November 2012 16:49:32 Ian Campbell wrote:
> > > > 4. I read somewhere that "mixing loopback with tapdisk is a bad idea",
> > > > but I can't find that again, so I'm wondering if that (wrong) claim got
> > > > somehow stuck in my memory. I can only imagine two scenarios
> > > > a) one domU accessing two disk images, one with loopback, the other one
> > > > with blktap. That looks okay.
> > > > b) two domUs accessing the same disk image, one with loopback, the other
> > > > one with blktap. That looks broken but neither would I expect that to
> > > > work (shared disk semantics aside)
> > >
> > > loopback is dangerous because it reports success before the data has
> > > really hit the underlying device.
> > ...
> > > Probably ok for r/o devices
> > > though and r/w shared disks need care for plenty of other reasons!
> >
> > Okay, my current problem is getting the CDROM case right.
> >
> > > > 5. While experimenting with a Linux domU I noticed that the boot process
> > > > sometimes gets stuck when I declare one disk as "hda" and a second one
> > > > as "xvda". The 2.6.32 kernel detects a clash in /sys/block naming and I'm
> > > > stuck in the "XENBUS: Waiting for devices to initialise: 295s..." count
> > > > down. Is this because hda and xvda overlap because of the PVonHVM case
> > > > (ide-block-major has 64 minors per device for partitions, while
> > > > scsi-block-major and xen-block-major only have 16 minors per device
> > >
> > > When you ask for hda you actually get hda+xvda in order to allow for the
> > > switchover described above. So you've actually asked for 2 xvda's --
> > > don't do that ;-)
> > >
> > > > , that is
> > > > hda=xvd[abcd], hdb=xvd[efgh], ...)?
> > >
> > > Almost. hda=xvda, hda[1234...]=xvda[1234...], hdb=xvdb ... hdd=xvdd.
> > >
> > > There is no hde so xvde is a good starting point for pure PV disks if
> > > you are mixing ide and pure PV.
> >
> > With Linux 3.2.30 as domU I get the same as you, but with 2.6.32 I get a
> > different result:
>
> Ah, that's right. Some early versions of PVHVM disk support tried to
> rename things to avoid clashes. I suspect that either 2.6.32 or the
> backport to Debian of the PVHVM stuff might have included that.
> 196cfe2ae8fcdc03b3c7d627e7dfe8c0ce7229f9 is the upstream commit which
> removed this behaviour.
>
> > domU is configured with hda=hdb=disk, hdc=cdrom, but inside the
> > domU I get /dev/xvda and /dev/xvde for the disk, and /dev/scd0 for the cdrom.
>
> xvda and xvde is odd, I'd have expected either xvda+b or xvde+f. Perhaps
> Stefano can remember what the behaviour was supposed to be here.

There might have been a version of the blkfront patch that, if you had
hda and xvda in your config file, would get you:

xvda - the PV disk corresponding to hda
xvde - the PV disk that is called xvda in your config file but that has
       been renamed to avoid clashes

Upstream, that behavior is long gone.

> > With "xen_emul_unplug=never" I get hda* and hdb*.
>
> As expected, good.
>
> > For testing I created 60 partitions (3 primary+57 extended) on hdb, but as
> > most SCSI, SATA and XEN majors only support 16 minors per device, I only see
> > the first 15 partitions on /dev/xvde{,1..15}. With "..unplug=never" I see
> > them all, but 16..60 are provided by block-major 259 (blkext).
>
> Hrm. I wonder if blkfront needs to do some magic to enable this blkext
> thing then?

I thought that it is not possible to have more than 16 partitions on an
IDE disk; that would be the reason why you also can't have more than 16
partitions on a PV disk corresponding to an emulated IDE disk (xvda
corresponding to hda).
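
The rule Ian describes in the quoted text (asking for hda also claims xvda, and so on up to hdd=xvdd, so xvde is the first safe pure-PV name) can be turned into a small config check. The following Python sketch assumes whole-disk names only; pv_name and find_clashes are hypothetical helpers written for this note, not functions of xend, xl or libvirt.

```python
# Sketch only: flag disk names in a domU config that end up claiming the
# same PV (xvd*) name, following the hda=xvda ... hdd=xvdd rule from the
# thread. Only whole-disk names such as "hda" or "xvde" are considered.


def pv_name(dev):
    """Return the xvd* name a configured whole-disk name also claims."""
    if len(dev) == 3 and dev.startswith("hd") and dev[2] in "abcd":
        return "xvd" + dev[2]
    return dev


def find_clashes(devs):
    """Return (first, second) pairs of config names sharing one PV name."""
    claimed = {}
    clashes = []
    for dev in devs:
        name = pv_name(dev)
        if name in claimed:
            clashes.append((claimed[name], dev))
        else:
            claimed[name] = dev
    return clashes


if __name__ == "__main__":
    print(find_clashes(["hda", "xvda"]))  # [('hda', 'xvda')]
    print(find_clashes(["hda", "xvde"]))  # []
```

With the hda plus xvda config from question 5 the check reports one clash; moving the PV disk to xvde reports none.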
Hello,

On Friday 16 November 2012 17:54:57 Stefano Stabellini wrote:
> On Wed, 14 Nov 2012, Ian Campbell wrote:
> > On Wed, 2012-11-14 at 11:13 +0000, Philipp Hahn wrote:
...
> > > For testing I created 60 partitions (3 primary+57 extended) on hdb, but
> > > as most SCSI, SATA and XEN majors only support 16 minors per device, I
> > > only see the first 15 partitions on /dev/xvde{,1..15}. With
> > > "..unplug=never" I see them all, but 16..60 are provided by block-major
> > > 259 (blkext).
> >
> > Hrm. I wonder if blkfront needs to do some magic to enable this blkext
> > thing then?
>
> I thought that it is not possible to have more than 16 partitions on an
> IDE disk, that would be the reason why you also can't have more than 16
> partitions on a PV disk corresponding to an emulated IDE disk (xvda
> corresponding to hda).

That's the difference between the old PATA 'hdX' and the newer
SATA/SCSI/SAS 'sdX' block devices: the former support up to 63 partitions.

linux/Documentation/devices.txt:
>   3 block  First MFM, RLL and IDE hard disk/CD-ROM interface
>              0 = /dev/hda    Master: whole disk (or CD-ROM)
>             64 = /dev/hdb    Slave: whole disk (or CD-ROM)
>
>            For partitions, add to the whole disk device number:
>              0 = /dev/hd?    Whole disk
>              1 = /dev/hd?1   First partition
>              2 = /dev/hd?2   Second partition
>                ...
>             63 = /dev/hd?63  63rd partition
>
>            For Linux/i386, partitions 1-4 are the primary
>            partitions, and 5 and above are logical partitions.
>            Other versions of Linux use partitioning schemes
>            appropriate to their respective architectures.
...
>   8 block  SCSI disk devices (0-15)
>              0 = /dev/sda    First SCSI disk whole disk
>             16 = /dev/sdb    Second SCSI disk whole disk
>             32 = /dev/sdc    Third SCSI disk whole disk
>                ...
>            240 = /dev/sdp    Sixteenth SCSI disk whole disk
>
>            Partitions are handled in the same way as for IDE
>            disks (see major number 3) except that the limit on
>            partitions is 15.

But with LVM that's mostly irrelevant (I hope).
BYtE
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        be open.                       fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/
_______________________________________________
Xen-users mailing list
Xen-users@lists.xen.org
http://lists.xen.org/xen-users
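
The devices.txt excerpt quoted in the message above boils down to simple arithmetic: an IDE disk on major 3 owns 64 minors (whole disk plus 63 partitions), a SCSI disk on major 8 owns 16 (whole disk plus 15 partitions). A rough Python sketch of that calculation is given below. LAYOUT and minor_number are names invented for this illustration, and only the two majors from the excerpt are covered (hda/hdb and sda..sdp).

```python
# Sketch only: compute the minor number of a device node from the two
# devices.txt entries quoted in the thread (major 3 for hda/hdb, major 8
# for sda..sdp). Other majors (e.g. 22 for hdc/hdd) are out of scope.

# prefix: (minors per disk, highest partition number, disks on this major)
LAYOUT = {
    "hd": (64, 63, 2),
    "sd": (16, 15, 16),
}


def minor_number(dev):
    """Return the minor number for names like 'hdb5' or 'sdc'."""
    prefix, letter, part = dev[:2], dev[2:3], dev[3:]
    if prefix not in LAYOUT or not letter:
        raise ValueError(f"unsupported device name: {dev!r}")
    per_disk, max_partition, disks = LAYOUT[prefix]
    index = ord(letter) - ord("a")
    if not 0 <= index < disks:
        raise ValueError(f"{dev!r} is not on the major covered here")
    partition = int(part) if part else 0
    if partition > max_partition:
        raise ValueError(f"{dev!r}: only {max_partition} partitions fit")
    return index * per_disk + partition


if __name__ == "__main__":
    for dev in ("hda", "hdb", "hdb63", "sdb", "sdb15", "sdp"):
        print(dev, minor_number(dev))
```

Asking for sdb16 raises ValueError, which mirrors the observation in the thread that only the first 15 partitions show up on a device with 16 minors.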
Hello,

On Friday 09 November 2012 16:49:32 Ian Campbell wrote:
> > 4. I read somewhere that "mixing loopback with tapdisk is a bad idea", but I
> > can't find that again, so I'm wondering if that (wrong) claim got somehow
> > stuck in my memory. I can only imagine two scenarios

I found it again in <http://wiki.xenproject.org/wiki/Blktap2#Usage>
> Notice: In Xen 4.0.0/4.0.1 don't mix file and tap simultaneously, in this case domU unable to see any disk device. May be bug or feature!

I hope that bug has been fixed since 4.0.1, because the description sounds a
little bit like my other finding:

> > 5. While experimenting with a Linux domU I noticed that the boot process
> > sometimes gets stuck when I declare one disk as "hda" and a second one
> > as "xvda". The 2.6.32 kernel detects a clash in /sys/block naming and I'm
> > stuck in the "XENBUS: Waiting for devices to initialise: 295s..." count
> > down. Is this because hda and xvda overlap because of the PVonHVM case
> > (ide-block-major has 64 minors per device for partitions, while
> > scsi-block-major and xen-block-major only have 16 minors per device
>
> When you ask for hda you actually get hda+xvda in order to allow for the
> switchover described above. So you've actually asked for 2 xvda's --
> don't do that ;-)

Sincerely
Philipp

PS: The link to the blktap2 README on the Wiki page is broken.