So it seems that "xl block-attach" allows a variety of ways to specify the
device ID (xvdN, dNpN, decimal or hex number) - why is it that the
inverse operation ("xl block-detach") can't deal with anything but a decimal
number?

Further, why is it that with no blktap module loaded I'm getting an
incomplete attach when using the (deprecated) file:/ format for
specifying the backing file? It reports that it would be using qdisk,
and blkfront also sees the device appearing, but all I'm seeing in the
kernel log is the single message from blkfront's probe function. (With
no blktap in pv-ops, I wonder how file backed disks work there.)

When trying to detach such a broken device I'm getting
"unrecognized disk backend type: 0", and the remove fails.

Jan
On Thu, 2012-03-01 at 16:33 +0000, Jan Beulich wrote:
> So it seems that "xl block-attach" allows a variety of ways to specify the
> device ID (xvdN, dNpN, decimal or hex number) - why is it that the
> inverse operation ("xl block-detach") can't deal with anything but a decimal
> number?

Just an omission?

> Further, why is it that with no blktap module loaded I'm getting an
> incomplete attach when using the (deprecated) file:/ format for
> specifying the backing file? It reports that it would be using qdisk,
> and blkfront also sees the device appearing, but all I'm seeing in the
> kernel log is the single message from blkfront's probe function.

What do you mean by "incomplete"? What else would you expect to see?

What do the xenstore entries for the device look like?

> (With no blktap in pv-ops, I wonder how file backed disks work there.)

File backed disks without blktap use the qdisk backend supplied by qemu.

> When trying to detach such a broken device I'm getting
> "unrecognized disk backend type: 0", and the remove fails.

What version of xen is this?

libxl__device_disk_from_xs_be tries to read the backend type from
xenstore, the backend's "type" node. Is that present for you?

libxl_string_to_backend looks a bit suspect to me...

Ian.
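A message like "unrecognized disk backend type: 0" is what one would expect if a string-to-backend lookup falls through to its "unknown" value. The Python sketch below is purely hypothetical: the table contents, the names, and the numeric values other than 0 are assumptions made for illustration, not a copy of libxl_string_to_backend, and the thread does not establish that this is the actual cause.

```python
# Hypothetical backend codes; only "0 means unknown" is taken from the
# error message quoted in the thread.
BACKEND_UNKNOWN, BACKEND_PHY, BACKEND_TAP, BACKEND_QDISK = 0, 1, 2, 3

# Assumed, deliberately incomplete lookup table: "qdisk" is missing.
INCOMPLETE_TABLE = {"phy": BACKEND_PHY, "tap": BACKEND_TAP}

# The same table with the missing entry added.
COMPLETE_TABLE = {**INCOMPLETE_TABLE, "qdisk": BACKEND_QDISK}


def string_to_backend(type_node, table):
    """Map the text of a backend "type" node to a backend code.

    Unrecognised strings fall through to BACKEND_UNKNOWN (0).
    """
    return table.get(type_node, BACKEND_UNKNOWN)


print(string_to_backend("qdisk", INCOMPLETE_TABLE))
print(string_to_backend("qdisk", COMPLETE_TABLE))
```

With a table of this shape, a present and well-formed "type" node can still yield 0 if the lookup does not know the string, so checking both the node and the lookup is worthwhile.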
>>> On 01.03.12 at 17:45, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2012-03-01 at 16:33 +0000, Jan Beulich wrote:
>> So it seems that "xl block-attach" allows a variety of ways to specify the
>> device ID (xvdN, dNpN, decimal or hex number) - why is it that the
>> inverse operation ("xl block-detach") can't deal with anything but a decimal
>> number?
>
> Just an omission?

Maybe. How many others are there throughout the xl stack then? (I'm asking
because any time I do something more advanced than "xl dmesg" or
"xl debug-key ..." with xl, I'm running into endless problems.)

>> Further, why is it that with no blktap module loaded I'm getting an
>> incomplete attach when using the (deprecated) file:/ format for
>> specifying the backing file? It reports that it would be using qdisk,
>> and blkfront also sees the device appearing, but all I'm seeing in the
>> kernel log is the single message from blkfront's probe function.
>
> What do you mean by "incomplete"? What else would you expect to see?

Either a fully failed operation (including an error message indicating
so even without wading through the -vvv output), or a fully working one
(after all, the -vvv messages suggest that a usable backend was selected).

> What do the xenstore entries for the device look like?

local = ""
 domain = ""
  0 = ""
   name = "Domain-0"
   device = ""
    vbd = ""
     51712 = ""
      backend = "/local/domain/0/backend/qdisk/0/51712"
      backend-id = "0"
      state = "1"
      virtual-device = "51712"
      device-type = "disk"
     51728 = ""
      backend = "/local/domain/0/backend/qdisk/0/51728"
      backend-id = "0"
      state = "3"
      virtual-device = "51728"
      device-type = "disk"
      ring-ref = "9"
      event-channel = "60"
      protocol = "x86_64-abi"
     51760 = ""
      backend = "/local/domain/0/backend/qdisk/0/51760"
      backend-id = "0"
      state = "3"
      virtual-device = "51760"
      device-type = "disk"
      ring-ref = "10"
      event-channel = "61"
      protocol = "x86_64-abi"
   backend = ""
    qdisk = ""
     0 = ""
      51712 = ""
       frontend = "/local/domain/0/device/vbd/51712"
       params = "aio:/srv/SuSE/SLES-11-SP1-MINI-ISO-x86_64-GMC3-CD.iso"
       frontend-id = "0"
       online = "1"
       removable = "0"
       bootable = "1"
       state = "1"
       dev = "xvda"
       type = "qdisk"
       mode = "r"
       device-type = "disk"
      51728 = ""
       frontend = "/local/domain/0/device/vbd/51728"
       params = "aio:/srv/SuSE/SLES-11-SP1-MINI-ISO-x86_64-GMC3-CD.iso"
       frontend-id = "0"
       online = "1"
       removable = "0"
       bootable = "1"
       state = "1"
       dev = "xvdb"
       type = "qdisk"
       mode = "r"
       device-type = "disk"
      51760 = ""
       frontend = "/local/domain/0/device/vbd/51760"
       params = "aio:/srv/SuSE/SLES-11-SP1-MINI-ISO-x86_64-GMC3-CD.iso"
       frontend-id = "0"
       online = "1"
       removable = "0"
       bootable = "1"
       state = "1"
       dev = "d3p0"
       type = "qdisk"
       mode = "r"
       device-type = "disk"

>> (With no blktap in pv-ops, I wonder how file backed disks work there.)
>
> file backed disks without blktap use the qdisk backend supplied by qemu.

Which apparently doesn't work (for me).

>> When trying to detach such a broken device I'm getting
>> "unrecognized disk backend type: 0", and the remove fails.
>
> What version of xen is this?

-unstable c/s 24691:3432abcf9380. I should probably add that I'm running
the tools from the build tree, but with xend this never caused any
problems after I had added the necessary paths to various environment
variables in a wrapper script. I'd expect xl to be similarly tolerant
of such an environment, otherwise it's not a drop-in replacement.

> libxl__device_disk_from_xs_be tries to read the backend type from
> xenstore, the be "type" node. Is that present for you?

See above.

Jan
Jan Beulich writes ("[Xen-devel] xl block-attach vs block-detach"):
> So it seems that "xl block-attach" allows a variety of ways to specify the
> device ID (xvdN, dNpN, decimal or hex number) - why is it that the
> inverse operation ("xl block-detach") can't deal with anything but a decimal
> number?

Probably because I didn't notice that xl block-detach was using a
different parser when I replaced the block-attach one with a call to
the same parser as is used for disks specified in the config file.

Would you like to fix it? :-)

> Further, why is it that with no blktap module loaded I'm getting an
> incomplete attach when using the (deprecated) file:/ format for
> specifying the backing file? It reports that it would be using qdisk,
> and blkfront also sees the device appearing, but all I'm seeing in the
> kernel log is the single message from blkfront's probe function. (With
> no blktap in pv-ops, I wonder how file backed disks work there.)
> When trying to detach such a broken device I'm getting
> "unrecognized disk backend type: 0", and the remove fails.

That might well be a bug. In addition to Ian's questions, what do you
get if you turn on the debug by passing xl lots of -v flags (before
the block-attach)?

Can you attach the disk by naming it in the config file?

thanks,
Ian.
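For reference, the device names discussed here map onto numeric IDs in a regular way. The following is a minimal Python sketch, not libxl's parser: the function names `parse_vdev` and `_encode` are hypothetical, and only the xvd, dNpM, decimal and hex forms for the first 16 disks are handled. It reproduces the IDs that appear in this thread (xvda = 51712 = 0xca00, xvdb = 51728, d3p0 = 51760).

```python
import re

XVD_MAJOR = 202  # major number used for xvd* devices


def _encode(disk: int, part: int) -> int:
    """Pack disk and partition into the classic xvd device number."""
    if not (0 <= disk < 16 and 0 <= part < 16):
        raise ValueError("disk/partition out of range for this sketch")
    return (XVD_MAJOR << 8) | (disk << 4) | part


def parse_vdev(spec: str) -> int:
    """Hypothetical helper: turn a virtual device spec into a numeric ID.

    Accepts xvd<letters>[<partition>], d<disk>p<partition>, and plain
    decimal or 0x-prefixed hex numbers.
    """
    m = re.fullmatch(r"xvd([a-z]+)(\d*)", spec)
    if m:
        disk = 0
        for ch in m.group(1):  # a=0, b=1, ..., z=25, aa=26, ...
            disk = disk * 26 + (ord(ch) - ord("a") + 1)
        disk -= 1
        part = int(m.group(2)) if m.group(2) else 0
        return _encode(disk, part)

    m = re.fullmatch(r"d(\d+)p(\d+)", spec)
    if m:
        return _encode(int(m.group(1)), int(m.group(2)))

    # int(..., 0) accepts both "51712" and "0xca00".
    return int(spec, 0)
```

Routing both block-attach and block-detach through one function of this kind is what would make the two commands accept the same set of spellings.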
>>> On 01.03.12 at 17:45, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2012-03-01 at 16:33 +0000, Jan Beulich wrote:
>> Further, why is it that with no blktap module loaded I'm getting an
>> incomplete attach when using the (deprecated) file:/ format for
>> specifying the backing file? It reports that it would be using qdisk,
>> and blkfront also sees the device appearing, but all I'm seeing in the
>> kernel log is the single message from blkfront's probe function.
>> (With no blktap in pv-ops, I wonder how file backed disks work there.)
>
> file backed disks without blktap use the qdisk backend supplied by qemu.

What is the rationale for using blkback over blktap for file backed
disks anyway? If using blktap, why not directly do so? And if using
blkback, why not (as in xend) via loop devices?

Jan
>>> On 01.03.12 at 18:30, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>> Further, why is it that with no blktap module loaded I'm getting an
>> incomplete attach when using the (deprecated) file:/ format for
>> specifying the backing file? It reports that it would be using qdisk,
>> and blkfront also sees the device appearing, but all I'm seeing in the
>> kernel log is the single message from blkfront's probe function. (With
>> no blktap in pv-ops, I wonder how file backed disks work there.)
>> When trying to detach such a broken device I'm getting
>> "unrecognized disk backend type: 0", and the remove fails.
>
> That might well be a bug. In addition to Ian's questions, what do you
> get if you turn on the debug by passing xl lots of -v flags (before
> the block-attach) ?

+ xl -vvvvv block-attach 0 file:/srv/SuSE/SLES-11-SP1-MINI-ISO-x86_64-GMC3-CD.iso 0xca00 r
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk vdev=0xca00 spec.backend=unknown
libxl: debug: libxl_device.c:137:disk_try_backend: Disk vdev=0xca00, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:144:disk_try_backend: Disk vdev=0xca00, backend tap unsuitable because blktap not available
libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk vdev=0xca00, using backend qdisk
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk vdev=0xca00 spec.backend=qdisk
xc: debug: hypercall buffer: total allocations:2 total releases:2
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:0 misses:2 toobig:0
+ xl -vvvvv block-attach 0 file:/srv/SuSE/SLES-11-SP1-MINI-ISO-x86_64-GMC3-CD.iso 0xca00 r
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk vdev=0xca00 spec.backend=unknown
libxl: debug: libxl_device.c:137:disk_try_backend: Disk vdev=0xca00, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:144:disk_try_backend: Disk vdev=0xca00, backend tap unsuitable because blktap not available
libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk vdev=0xca00, using backend qdisk
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk vdev=0xca00 spec.backend=qdisk
xc: debug: hypercall buffer: total allocations:2 total releases:2
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:0 misses:2 toobig:0
+ xl -vvvvv block-detach 0 51712
libxl: error: libxl.c:1223:libxl__device_from_disk: unrecognized disk backend type: 0

libxl_device_disk_remove failed.
xc: debug: hypercall buffer: total allocations:2 total releases:2
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:0 misses:2 toobig:0
+ xl -vvvvv block-detach 0 51712
libxl: error: libxl.c:1223:libxl__device_from_disk: unrecognized disk backend type: 0

libxl_device_disk_remove failed.
xc: debug: hypercall buffer: total allocations:2 total releases:2
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:0 misses:2 toobig:0

> Can you attach the disk by naming it in the config file ?

Didn't try, for the purpose at hand I want the disk attached to Dom0.

Jan
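The selection order visible in the debug output can be sketched as follows. This is an illustration in Python of the fallback the log shows (phy, then tap, then qdisk); the function name and its boolean inputs are hypothetical, and it is not libxl's code.

```python
def choose_backend(path_is_block_device: bool, blktap_available: bool) -> str:
    """Pick a disk backend when none was given (spec.backend=unknown)."""
    if path_is_block_device:
        return "phy"    # backing path is a block device
    if blktap_available:
        return "tap"    # blktap can expose the file
    return "qdisk"      # last resort in the log above


# Jan's case: an ISO file as backing path, no blktap module loaded.
print(choose_backend(path_is_block_device=False, blktap_available=False))
```

Nothing in this sketch, or in the log, verifies that something will actually serve the backend that ends up being chosen.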
On Fri, 2012-03-02 at 07:53 +0000, Jan Beulich wrote:
> >>> On 01.03.12 at 18:30, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> [...]
> > Can you attach the disk by naming it in the config file ?
>
> Didn't try, for the purpose at hand I want the disk attached to Dom0.

Ah, I bet that is it -- it is very unlikely that dom0 has a qemu which
would process the qdisk backend stuff.

Hrm, in fact I wonder if block-attach handles starting a qemu at all if
one isn't already running. I also wonder how well qemu handles hotplug
of disks if it is running.

I think you may have opened a can of worms here. Hopefully someone will
correct me but I expect there is work to be done here...

Ian.
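The xenstore dump earlier in the thread is consistent with this diagnosis. The `state` nodes hold XenbusState values; the Python sketch below decodes the ones Jan posted. The helper names are hypothetical, while the numeric values are the standard XenbusState constants. The frontends for 51728 and 51760 have reached Initialised (3) while every qdisk backend is still at Initialising (1), which is what one would see if nothing were servicing the backend side.

```python
XENBUS_STATES = {
    0: "Unknown", 1: "Initialising", 2: "InitWait", 3: "Initialised",
    4: "Connected", 5: "Closing", 6: "Closed",
    7: "Reconfiguring", 8: "Reconfigured",
}

# (frontend state, backend state) per device, copied from Jan's dump.
devices = {51712: (1, 1), 51728: (3, 1), 51760: (3, 1)}


def describe(front: int, back: int) -> str:
    return f"frontend={XENBUS_STATES[front]}, backend={XENBUS_STATES[back]}"


def stalled(front: int, back: int) -> bool:
    # Frontend has published its ring, backend has not moved at all.
    return front == 3 and back == 1


for devid, (front, back) in devices.items():
    flag = " (backend not responding?)" if stalled(front, back) else ""
    print(f"{devid}: {describe(front, back)}{flag}")
```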
On Fri, 2012-03-02 at 07:40 +0000, Jan Beulich wrote:
> >>> On 01.03.12 at 17:45, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Thu, 2012-03-01 at 16:33 +0000, Jan Beulich wrote:
> >> Further, why is it that with no blktap module loaded I'm getting an
> >> incomplete attach when using the (deprecated) file:/ format for
> >> specifying the backing file? It reports that it would be using qdisk,
> >> and blkfront also sees the device appearing, but all I'm seeing in the
> >> kernel log is the single message from blkfront's probe function.
> >> (With no blktap in pv-ops, I wonder how file backed disks work there.)
> >
> > file backed disks without blktap use the qdisk backend supplied by qemu.
>
> What is the rationale for using blkback over blktap for file backed
> disks anyway? If using blktap, why not directly do so?

"directly" how? blktap always requires a blkback to export it to the
guest.

The old blktap1 incorporated a disk backend but this led to duplicated
code, complex set up and teardown interlocking and interesting
interactions between the "backdev" (which exposed the blktap device to
dom0 for toolstack and qemu use) and the pv disk backend. blktap2 was
intended to fix those by layering blkback on blktap2, which now just
provides a block device in dom0 (blktap2 is almost independent of Xen).

(I originally misread the above and wrote the following, so it's not
actually relevant to your questions, but they are perhaps interesting
facts so I've left them in:
...
There is no blktap in upstream kernels and the kernel module is
considered non-upstreamable. If blktap is available then libxl will use
it in preference to qdisk.

There are some folks working on a purely userspace blktap ("blktap3" =
blktap2 with a userspace disk backend bolted on). I hope this will be
ready in time for 4.2 but I'm not sure.
...)

> And if using blkback, why not (as in xend) via loop devices?

Support for block device script= in xl/libxl is on the 4.2 blocker list,
this feature would re-enable the loop+blkback case -- I think this would
be a better option than blktap* or qdisk once it becomes available.

Ian.
>>> On 02.03.12 at 09:05, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Fri, 2012-03-02 at 07:40 +0000, Jan Beulich wrote:
>> And if using blkback, why not (as in xend) via loop devices?
>
> Support for block device script= in xl/libxl is on the 4.2 blocker list,

Ah, that's good to know. (However, you mentioning script= makes me
assume that one would have to specify this among the block-attach
options, which again wouldn't be a drop-in replacement of how xm
worked in this regard.)

> this feature would re-enable the loop+blkback case -- I think this would
> be a better option than blktap* or qdisk once it becomes available.

Jan
On Fri, 2012-03-02 at 08:14 +0000, Jan Beulich wrote:
> >>> On 02.03.12 at 09:05, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Fri, 2012-03-02 at 07:40 +0000, Jan Beulich wrote:
> >> And if using blkback, why not (as in xend) via loop devices?
> >
> > Support for block device script= in xl/libxl is on the 4.2 blocker list,
>
> Ah, that's good to know. (However, you mentioning script= makes me
> assume that one would have to specify this among the block-attach
> options, which again wouldn't be a drop-in replacement of how xm
> worked in this regard.)

We could make a decision about the default script to run based on the
type of the device and the selected backend.

> > this feature would re-enable the loop+blkback case -- I think this would
> > be a better option than blktap* or qdisk once it becomes available.
>
> Jan
On Fri, 2 Mar 2012, Ian Campbell wrote:
> > And if using blkback, why not (as in xend) via loop devices?
>
> Support for block device script= in xl/libxl is on the 4.2 blocker list,
> this feature would re-enable the loop+blkback case -- I think this would
> be a better option than blktap* or qdisk once it becomes available.

Of course we need to make sure that blkback with loop devices performs
well before doing that, and it was certainly not the case in my last tests.
>>> On 02.03.12 at 14:49, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 2 Mar 2012, Ian Campbell wrote:
>> > And if using blkback, why not (as in xend) via loop devices?
>>
>> Support for block device script= in xl/libxl is on the 4.2 blocker list,
>> this feature would re-enable the loop+blkback case -- I think this would
>> be a better option than blktap* or qdisk once it becomes available.
>
> Of course we need to make sure that blkback with loop devices performs
> well before doing that, and it was certainly not the case in my last tests.

No - no policy should be involved here: If file:/ is specified, one should
get blkback alone (no tap, qdisk, or what not). If another protocol was
specified, that one should be used. Only if guessing is needed, some
sort of heuristic (into which performance considerations may play) will
(naturally) be required.

Jan
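The rule Jan proposes can be written down compactly. The Python sketch below is only an illustration of that rule: the function name `backend_for`, the `guess` callback, and the "blkback+loop" label are hypothetical, and none of this is how xl or libxl is implemented.

```python
import re


def backend_for(spec, guess):
    """Return (backend, path) for a disk spec under the proposed rule.

    - "file:<path>" always means blkback via a loop device.
    - Any other explicit "<proto>:<rest>" is honoured as given.
    - Only a bare path is handed to the heuristic `guess`.
    """
    m = re.match(r"([a-z0-9]+):(.*)$", spec)
    if m is None:
        return guess(spec), spec
    prefix, rest = m.groups()
    if prefix == "file":
        return "blkback+loop", rest
    return prefix, rest


# A stand-in heuristic, used only when no protocol was specified.
print(backend_for("file:/srv/disk.img", lambda path: "qdisk"))
print(backend_for("/srv/disk.img", lambda path: "qdisk"))
```

The point of the design is that performance considerations live entirely inside `guess`, so an explicit prefix never changes meaning between releases.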
>>> On 02.03.12 at 15:35, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 2 Mar 2012, Jan Beulich wrote:
>> >>> On 02.03.12 at 14:49, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>> wrote:
>> > On Fri, 2 Mar 2012, Ian Campbell wrote:
>> >> > And if using blkback, why not (as in xend) via loop devices?
>> >>
>> >> Support for block device script= in xl/libxl is on the 4.2 blocker list,
>> >> this feature would re-enable the loop+blkback case -- I think this would
>> >> be a better option than blktap* or qdisk once it becomes available.
>> >
>> > Of course we need to make sure that blkback with loop devices performs
>> > well before doing that, and it was certainly not the case in my last tests.
>>
>> No - no policy should be involved here: If file:/ is specified, one should
>> get blkback alone (no tap, qdisk, or what not). If another protocol was
>> specified, that one should be used. Only if guessing is needed, some
>> sort of heuristic (into which performance considerations may play) will
>> (naturally) be required.
>
> Let's suppose we find out that blkback+loop is slower than qdisk and
> that block-attach/detach work well with qdisk by the time of the next
> release.
>
> What would be the rationale behind using blkback+loop for "file:"?
> Backward compatibility?

Yes.

> Do you think it might break something for users if we change the backend
> from xend to xl?

This cannot be excluded, particularly because (just like me here) users
tend to do things you didn't expect them to when you write the code.

> On the other hand do you think that using qdisk with the new disk syntax
> introduced with xl is reasonable because users are not supposed to make
> any assumptions there?

Perhaps yes.

Jan
On Fri, 2 Mar 2012, Jan Beulich wrote:
> >>> On 02.03.12 at 14:49, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> wrote:
> > On Fri, 2 Mar 2012, Ian Campbell wrote:
> >> > And if using blkback, why not (as in xend) via loop devices?
> >>
> >> Support for block device script= in xl/libxl is on the 4.2 blocker list,
> >> this feature would re-enable the loop+blkback case -- I think this would
> >> be a better option than blktap* or qdisk once it becomes available.
> >
> > Of course we need to make sure that blkback with loop devices performs
> > well before doing that, and it was certainly not the case in my last tests.
>
> No - no policy should be involved here: If file:/ is specified, one should
> get blkback alone (no tap, qdisk, or what not). If another protocol was
> specified, that one should be used. Only if guessing is needed, some
> sort of heuristic (into which performance considerations may play) will
> (naturally) be required.

Let's suppose we find out that blkback+loop is slower than qdisk and
that block-attach/detach work well with qdisk by the time of the next
release.

What would be the rationale behind using blkback+loop for "file:"?
Backward compatibility?
Do you think it might break something for users if we change the backend
from xend to xl?

On the other hand do you think that using qdisk with the new disk syntax
introduced with xl is reasonable because users are not supposed to make
any assumptions there?
On Fri, 2 Mar 2012, Jan Beulich wrote:
> > What would be the rationale behind using blkback+loop for "file:"?
> > Backward compatibility?
>
> Yes.
>
> > Do you think it might break something for users if we change the backend
> > from xend to xl?
>
> This cannot be excluded, particularly because (just like me here)
> users tend to do things you didn't expect them to when you write
> the code.

I see your point but actually that is quite an obvious bug, not a very
subtle one that only happens in strange user configs.
On Fri, 2 Mar 2012, Stefano Stabellini wrote:
> On Fri, 2 Mar 2012, Jan Beulich wrote:
> > > What would be the rationale behind using blkback+loop for "file:"?
> > > Backward compatibility?
> >
> > Yes.
> >
> > > Do you think it might break something for users if we change the backend
> > > from xend to xl?
> >
> > This cannot be excluded, particularly because (just like me here)
> > users tend to do things you didn't expect them to when you write
> > the code.
>
> I see your point but actually that is quite an obvious bug, not a very
> subtle one that only happens in strange user configs.

Scratch that: I have just tried on Linux 3.3 and the performance of
blkback with loopback is very good. We should use it whenever we can.
Hi,

As I understand it, preferring tapdisk over loop+blkback has never been
for performance reasons historically. (tapdisk2:aio does however
exhibit very good performance.)
The primary reason that tapdisk was always recommended over file: is
that the Linux file cache does very interesting things to your data
and sync is returned to the blkback backend much sooner than the data
actually resides safely on disk (which can sit in the Linux disk cache
for a sizeable amount of time if the machine has a lot of RAM).

Unfortunately changing the default behavior to tapdisk probably isn't
viable at this time for a number of reasons - not least of which is
the fact it is yet to be included in mainline.
However it would definitely be preferable in the long term - at least
from the perspective of data integrity and principle of least
surprises.

Just my 2c.

Joseph.

On 3 March 2012 04:37, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> Scratch that: I have just tried on Linux 3.3 and the performance of
> blkback with loopback is very good. We should use it whenever we can.

--
Founder | Director | VP Research
Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846
Please don't top post, it destroys the flow of the conversation.

On Fri, 2012-03-02 at 22:54 +0000, Joseph Glanville wrote:
> Hi,
>
> As I understand it, preferring tapdisk over loop+blkback has never been
> for performance reasons historically. (tapdisk2:aio does however
> exhibit very good performance.)
> The primary reason that tapdisk was always recommended over file: is
> that the Linux file cache does very interesting things to your data
> and sync is returned to the blkback backend much sooner than the data
> actually resides safely on disk (which can sit in the Linux disk cache
> for a sizeable amount of time if the machine has a lot of RAM).

Are you suggesting that the loop device doesn't support O_DIRECT and
will leave stuff dirty in the page cache even when direct access is
used? That is worth knowing!

> Unfortunately changing the default behavior to tapdisk

What exactly needs changing?

> probably isn't
> viable at this time for a number of reasons - not least of which is
> the fact it is yet to be included in mainline.

tapdisk is not going to be included in mainline. The kernel side is
deemed to be non-upstreamable.

Someone is working on a fully userspace version of blktap which we hope
will be ready soon.

Ian.

> However it would definitely be preferable in the long term - at least
> from the perspective of data integrity and principle of least
> surprises.
>
> Just my 2c.
>
> Joseph.
On 3 March 2012 16:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> Please don't top post, it destroys the flow of the conversation.

My apologies, GMail is a very irritating client.

> On Fri, 2012-03-02 at 22:54 +0000, Joseph Glanville wrote:
>> The primary reason that tapdisk was always recommended over file: is
>> that the Linux file cache does very interesting things to your data
>> and sync is returned to the blkback backend much sooner than the data
>> actually resides safely on disk (which can sit in the Linux disk cache
>> for a sizeable amount of time if the machine has a lot of RAM).
>
> Are you suggesting that the loop device doesn't support O_DIRECT and
> will leave stuff dirty in the page cache even when direct access is
> used? That is worth knowing!

As far as I am aware this is the case. It doesn't support O_DIRECT in
any capacity.
There were some patches submitted to the list a very long time ago (I
could probably find them if I tried) but they were knocked back by
upstream.
It is possible this could have changed so I will take a glance at the
source of loop.c tonight to see if that is actually the case.

>> Unfortunately changing the default behavior to tapdisk
>
> What exactly needs changing?

I was suggesting that tapdisk would be a much safer default in terms
of data integrity, for the above reason that the loop driver doesn't
support O_DIRECT.

>> probably isn't
>> viable at this time for a number of reasons - not least of which is
>> the fact it is yet to be included in mainline.
>
> tapdisk is not going to be included in mainline. The kernel side is
> deemed to be non-upstreamable.
>
> Someone is working on a fully userspace version of blktap which we hope
> will be ready soon.

That is unfortunate, would you mind pointing me in the direction of
the pure userspace version?
Would the performance be considerably worse than the current implementation?

All of that being said it is still relatively OK to leave blkback+loop
as the default, as the majority of users I would assume use actual
block devices (though I could be horribly wrong here).
All documentation and best-practices on the wiki outline this in detail.

Joseph.
On 3 March 2012 16:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Fri, 2012-03-02 at 22:54 +0000, Joseph Glanville wrote:
>
> Are you suggesting that the loop device doesn't support O_DIRECT and
> will leave stuff dirty in the page cache even when direct access is
> used? That is worth knowing!
>
>> Unfortunately changing the default behavior to tapdisk
>
> What exactly needs changing?

Doh! I finally get what you meant by this comment. Tapdisk -is- the current default for xl. xm uses loop, hence my mistake. Sigh, I must learn to think before typing.

[...]

I did have a trawl through the Linux kernel and util-linux repos last night, however, and it appears that none of the direct-io/O_DIRECT patches were ever merged.

Joseph.
On 4 March 2012 14:01, Joseph Glanville <joseph.glanville@orionvm.com.au> wrote:
> I did have a trawl through the Linux kernel and util-linux repos last
> night, however, and it appears that none of the direct-io/O_DIRECT
> patches were ever merged.

For interested parties, this is the original patch series I was referring to:
http://www.spinics.net/lists/linux-fsdevel/msg27514.html

There also seems to be a revival of this effort as of about a week ago:
https://lkml.org/lkml/2012/2/28/251

Kind regards,
Joseph.
On Sat, 2012-03-03 at 22:46 -0500, Joseph Glanville wrote:
> On 4 March 2012 14:01, Joseph Glanville <joseph.glanville@orionvm.com.au> wrote:
> > On 3 March 2012 16:25, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > I did have a trawl through the Linux kernel and util-linux repos last
> > night, however, and it appears that none of the direct-io/O_DIRECT
> > patches were ever merged.
>
> For interested parties, this is the original patch series I was referring to:
> http://www.spinics.net/lists/linux-fsdevel/msg27514.html
>
> There also seems to be a revival of this effort as of about a week ago:
> https://lkml.org/lkml/2012/2/28/251

Thanks for all the useful info and pointers. We should keep an eye on this series.

Ian.
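For readers following the O_DIRECT discussion in this thread, here is a minimal userspace sketch of the distinction being argued about: a buffered write lands in the page cache and is only durable once it has been flushed, whereas a direct write bypasses the cache. This is only an illustration, not code from blkback, the loop driver, or tapdisk. The function names, the file names, and the 4096-byte alignment are assumptions made for the example. O_DIRECT is Linux-specific and some filesystems reject it, in which case the sketch simply returns None.

```python
import mmap
import os
import tempfile
from typing import Optional, Tuple

# Assumed alignment for O_DIRECT buffers and lengths; a real device may
# require a different logical block size.
BLOCK = 4096


def write_buffered_then_sync(path: str, payload: bytes) -> int:
    """Buffered write followed by an explicit flush to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = os.write(fd, payload)  # returns once the data is in the page cache
        os.fsync(fd)                     # blocks until the kernel has flushed it
        return written
    finally:
        os.close(fd)


def write_direct(path: str, payload: bytes) -> Optional[int]:
    """Try an O_DIRECT write; return None where O_DIRECT is unavailable."""
    if not hasattr(os, "O_DIRECT"):
        return None  # not a Linux-like platform
    # O_DIRECT needs an aligned buffer and an aligned length. An anonymous
    # mmap is page-aligned, and the payload is zero-padded to a whole block.
    size = max(BLOCK, -(-len(payload) // BLOCK) * BLOCK)
    buf = mmap.mmap(-1, size)
    buf[:len(payload)] = payload
    try:
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT
        fd = os.open(path, flags, 0o600)
    except OSError:
        buf.close()
        return None  # the filesystem refused O_DIRECT
    try:
        return os.write(fd, buf)  # bypasses the page cache when supported
    except OSError:
        return None  # alignment or filesystem restriction
    finally:
        os.close(fd)
        buf.close()


def demo() -> Tuple[int, Optional[int]]:
    """Run both paths against temporary files and return the byte counts."""
    payload = b"x" * 1000
    with tempfile.TemporaryDirectory() as tmp:
        buffered = write_buffered_then_sync(os.path.join(tmp, "buffered.img"), payload)
        direct = write_direct(os.path.join(tmp, "direct.img"), payload)
    return buffered, direct
```

Without the fsync call, the first function could return while the data still exists only in memory, which is the behaviour Joseph attributes to loop-backed file: disks above. Whether a given backend passes flushes through correctly is a separate question that this sketch does not answer.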