Ian Pratt
2005-Apr-14 20:43 UTC
RE: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> On Thu, 2005-04-14 at 14:22 -0400, Philip R Auld wrote:
> > As a slight but related digression, has any thought been given to using something other than /dev/[hs]d* names for specifying block devices? The name /dev/sdb only has meaning for the current state of the running dom0 OS. It may not mean the same device on a different dom0. This can lead to problems in say migration.
> > In fact, it may not be the same when the domain is restored locally.
>
> To be even slightly more radical, I don't think that for the guests, the vbds should be being exposed as SCSI or IDE since they're not. By being presented as such, there's a set of assumptions about ioctls and behaviors which aren't the case for disks on blkfront. Even with all of the pain using your own major/minor for a disk device is, I think it probably is the right thing to do. And I've added the code to parted/lvm2/.../etc enough to be able to do it in my sleep this point if this route is taken :)

I don't think there's a huge problem with exposing vbds as sdX/hdX within a guest. We used to have our own xdX device, but it just broke too much stuff, hence we hijacked hda/sda. I haven't seen too many problems with ioctls.

I think the key issue is that in domain configs, you want to specify the source of the vbd in some high-level name and have the control tools do the necessary to map it to a local device and then export it. This already happens with file: disk paths. We just need something similar for iscsi.

Ian

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Katz
2005-Apr-14 20:52 UTC
RE: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Thu, 2005-04-14 at 21:43 +0100, Ian Pratt wrote:
> > On Thu, 2005-04-14 at 14:22 -0400, Philip R Auld wrote:
> > > As a slight but related digression, has any thought been given to using something other than /dev/[hs]d* names for specifying block devices? The name /dev/sdb only has meaning for the current state of the running dom0 OS. It may not mean the same device on a different dom0. This can lead to problems in say migration.
> > > In fact, it may not be the same when the domain is restored locally.
> >
> > To be even slightly more radical, I don't think that for the guests, the vbds should be being exposed as SCSI or IDE since they're not. By being presented as such, there's a set of assumptions about ioctls and behaviors which aren't the case for disks on blkfront. Even with all of the pain using your own major/minor for a disk device is, I think it probably is the right thing to do. And I've added the code to parted/lvm2/.../etc enough to be able to do it in my sleep this point if this route is taken :)
>
> I don't think there's a huge problem with exposing vbds as sdX/hdX within a guest. We used to have our own xdX device, but it just broke too much stuff, hence we hijacked hda/sda. I haven't seen too many problems with ioctls.

Things start getting odd if you start mixing anything that natively shows up as scsi with a vbd scsi, though, don't they? And upstream has been fairly resistant to other !scsi or !ide devices sitting on those devices. I've had this argument with Jeff Garzik before about a few of the esoteric SATA drivers that don't even pretend to be scsi. And things like partitioning tools and hardware probing will start showing more problems with ioctls not working as that starts to show up.

> I think the key issue is that in domain configs, you want to specify the source of the vbd in some high-level name and have the control tools do the necessary to map it to a local device and then export it. This already happens with file: disk paths. We just need something similar for iscsi.

Yeah, I can see this being useful.

Jeremy
Gerd Knorr
2005-Apr-14 21:59 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
"Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk> writes:
> I don't think there's a huge problem with exposing vbds as sdX/hdX within a guest. We used to have our own xdX device, but it just broke too much stuff, hence we hijacked hda/sda.

Huh? Really? What exactly broke?

I can share a bit of experience with user mode linux again: UML uses its own major range for virtual block devices. Making the suse installer work within a UML machine was surprisingly easy. The major stumbling block was that parted had a hard-coded white-list for block device major numbers, which didn't include #98 (the uml virtual block device major), thus the installer didn't recognise the virtual disk or refused to partition it (don't remember exactly). The other issue was that the device files in /dev didn't exist (should be easier these days with udev ;). Once these two points were fixed it worked just fine. Well, there still were some minor issues, like the installer being a bit confused and printing a warning due to the fact that it hasn't found any PCI IDE or SCSI storage controller.

The important bit is that you'll have to take care to use the usual linux interfaces, naming schemes and so on. If the virtual disks and partitions show up in /proc/partitions and /sys/block correctly most software will be happy.

I think we should have our own xd virtual block devices for xen and use them by default. It's cleaner, and probably also has less problems when using virtual disks and iSCSI at the same time ;)

Using hd/sd instead probably is a useful option in some cases, I wouldn't drop that altogether, but only use that if the user explicitly asks for it.

Gerd

-- 
#define printk(args...) fprintf(stderr, ## args)
Adam Heath
2005-Apr-14 22:10 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Thu, 14 Apr 2005, Gerd Knorr wrote:
> I think we should have our own xd virtual block devices for xen and use them by default. It's cleaner, and probably also has less problems when using virtual disks and iSCSI at the same time ;)
>
> Using hd/sd instead probably is a useful option in some cases, I wouldn't drop that altogether, but only use that if the user explicitly asks for it.

Or, implement a xen scsi host driver that acts like a real scsi host controller, but instead imports virtual blocks. This seems like the best of both worlds.

Note, this is not an overlay system like is done currently, nor is it a block driver.
Mark Williamson
2005-Apr-14 23:24 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> I think the key issue is that in domain configs, you want to specify the source of the vbd in some high-level name and have the control tools do the necessary to map it to a local device and then export it. This already happens with file: disk paths. We just need something similar for iscsi.

To clarify for anyone who hasn't looked into this code, funky (anything that is not "phy:") block devices are handled by Xend calling external programs (e.g. scripts) with a particular parameter format. If you specify a VBD source device as "file:/my/file" then the following happens (roughly):

Xend looks in its config for an item called "block-file". In the default config (tools/examples/xend-config.sxp) the value of this item is also "block-file", which is the name of the script in /etc/xen/scripts. It then calls this script like this:

    /etc/xen/scripts/block-file bind /my/file

The script is required to bind /my/file to a loop device, the name of which it outputs (e.g. "/dev/loop0") to stdout. When the domain is destroyed, Xend will call:

    /etc/xen/scripts/block-file unbind /dev/loop0

Likewise specifying a block device "enbd:servername:ctlport" causes a call to "/etc/xen/scripts/block-enbd bind servername ctlport" and a subsequent "/etc/xen/scripts/block-enbd unbind /dev/enbd_node".

For iSCSI you'd define a syntax like "iscsi:target:lun", write a script to run iscsiadm (if you're using Open-iSCSI) and stick it in the config. You could probably do a similar thing to deal with your SAN devices.

Thoughts anyone? It'd be very desirable to have more block scripts in the distribution. If anyone comes up with some, I don't think there'd be any problem with them going into the -testing tree.

Cheers,
Mark
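As a rough illustration of the calling convention described in the message above, the following Python sketch turns a disk spec into the argument list for a block script. It is not Xend's code: the function name `block_script_command`, the `DEFAULT_CONFIG` dictionary, and the rule that only "file:" specs keep their remainder unsplit are assumptions made for this example. Only the script directory, the "block-<type>" config item, and the bind/unbind argument order come from the thread.

```python
# Hypothetical stand-in for the "block-<type>" items in xend-config.sxp.
DEFAULT_CONFIG = {"block-file": "block-file", "block-enbd": "block-enbd"}


def block_script_command(spec, action="bind", bound_node=None,
                         config=DEFAULT_CONFIG,
                         script_dir="/etc/xen/scripts"):
    """Return the argv for a block script call, or None for "phy:" specs.

    spec        e.g. "file:/my/file" or "enbd:servername:ctlport"
    action      "bind" or "unbind"
    bound_node  the device node that "bind" printed (needed for "unbind")
    """
    kind, sep, rest = spec.partition(":")
    if not sep or not rest:
        raise ValueError(f"malformed disk spec: {spec!r}")
    if kind == "phy":
        # Physical devices are already block devices; no script is called.
        return None
    # Look up the script name under the "block-<type>" config item.
    script = f"{script_dir}/{config['block-' + kind]}"
    if action == "bind":
        # Assumption: a file path is passed whole, other types are
        # split on ":" into separate arguments.
        args = [rest] if kind == "file" else rest.split(":")
        return [script, "bind", *args]
    if action == "unbind":
        if bound_node is None:
            raise ValueError("unbind needs the node that bind printed")
        return [script, "unbind", bound_node]
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    print(block_script_command("file:/my/file"))
    print(block_script_command("file:/my/file", "unbind", "/dev/loop0"))
```

An "iscsi:target:lun" type would then only need a new "block-iscsi" entry in the config and a matching script; the dispatch itself would not change.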
Adam Heath
2005-Apr-14 23:42 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Mark Williamson wrote:
> > I think the key issue is that in domain configs, you want to specify the source of the vbd in some high-level name and have the control tools do the necessary to map it to a local device and then export it. This already happens with file: disk paths. We just need something similar for iscsi.
>
> To clarify for anyone who hasn't looked into this code, funky (anything that is not "phy:") block devices are handled by Xend calling external programs (e.g. scripts) with a particular parameter format. If you specify a VBD source device as "file:/my/file" then the following happens (roughly):
>
> Xend looks in its config for an item called "block-file". In the default config (tools/examples/xend-config.sxp) the value of this item is also "block-file", which is the name of the script in /etc/xen/scripts. It then calls this script like this:
>
>     /etc/xen/scripts/block-file bind /my/file
>
> The script is required to bind /my/file to a loop device, the name of which it outputs (e.g. "/dev/loop0") to stdout. When the domain is destroyed, Xend will call:

Would it not be better to just echo /my/file?

>     /etc/xen/scripts/block-file unbind /dev/loop0
>
> Likewise specifying a block device "enbd:servername:ctlport" causes a call to "/etc/xen/scripts/block-enbd bind servername ctlport" and a subsequent "/etc/xen/scripts/block-enbd unbind /dev/enbd_node".
>
> For iSCSI you'd define a syntax like "iscsi:target:lun", write a script to run iscsiadm (if you're using Open-iSCSI) and stick it in the config. You could probably do a similar thing to deal with your SAN devices.
>
> Thoughts anyone? It'd be very desirable to have more block scripts in the distribution. If anyone comes up with some, I don't think there'd be any problem with them going into the -testing tree.
Mark Williamson
2005-Apr-14 23:43 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> Would it not be better to just echo /my/file?

When you write the script you echo whatever you want Xend to pass you back when it asks you to unbind the device.

Echoing the device makes the "unbind" case of the block-file script nice and simple, since it can just call "losetup -d" with the argument passed by Xend.

Cheers,
Mark
Andrew Theurer
2005-Apr-15 01:51 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Adam Heath wrote:
> On Thu, 14 Apr 2005, Gerd Knorr wrote:
> > I think we should have our own xd virtual block devices for xen and use them by default. It's cleaner, and probably also has less problems when using virtual disks and iSCSI at the same time ;)
> >
> > Using hd/sd instead probably is a useful option in some cases, I wouldn't drop that altogether, but only use that if the user explicitly asks for it.
>
> Or, implement a xen scsi host driver that acts like a real scsi host controller, but instead imports virtual blocks. This seems like the best of both worlds.
>
> Note, this is not an overlay system like is done currently, nor is it a block driver.

There may already be something like this, the IBM virtual scsi server/client drivers in mainline kernel (used for para-virtualized POWER5 systems). Might be worth looking at and modifying a bit to fit Xen's inter domain communication model.

-Andrew
Philip R Auld
2005-Apr-15 13:32 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Rumor has it that on Thu, Apr 14, 2005 at 08:51:22PM -0500 Andrew Theurer said:
> Adam Heath wrote:
> > Or, implement a xen scsi host driver that acts like a real scsi host controller, but instead imports virtual blocks. This seems like the best of both worlds.
> >
> > Note, this is not an overlay system like is done currently, nor is it a block driver.
>
> There may already be something like this, the IBM virtual scsi server/client drivers in mainline kernel (used for para-virtualized POWER5 systems). Might be worth looking at and modifying a bit to fit Xen's inter domain communication model.

Does this emulate the full target logic? Once you make the guests see a scsi host you'll need that. If the back end device is a scsi LUN then you can pass commands through, but what does the inquiry command return for a file backed vbd?

I'm not sure the complexity is worth it going in this direction. One thing nice about using just block devices is that the backend's kernel and drivers have already sorted out most of this. Blocks are blocks :)

Cheers,
Phil

-- 
Philip R. Auld, Ph.D.    Egenera, Inc.
Software Architect       165 Forest St.
(508) 858-2628           Marlboro, MA 01752
Philip R Auld
2005-Apr-15 13:58 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Hi,

Rumor has it that on Fri, Apr 15, 2005 at 12:24:17AM +0100 Mark Williamson said:
> > I think the key issue is that in domain configs, you want to specify the source of the vbd in some high-level name and have the control tools do the necessary to map it to a local device and then export it. This already happens with file: disk paths. We just need something similar for iscsi.
>
> To clarify for anyone who hasn't looked into this code, funky (anything that is not "phy:") block devices are handled by Xend calling external programs (e.g. scripts) with a particular parameter format. If you specify a VBD source device as "file:/my/file" then the following happens (roughly):
>
> Xend looks in its config for an item called "block-file". In the default config (tools/examples/xend-config.sxp) the value of this item is also "block-file", which is the name of the script in /etc/xen/scripts. It then calls this script like this:
>
>     /etc/xen/scripts/block-file bind /my/file
>
> The script is required to bind /my/file to a loop device, the name of which it outputs (e.g. "/dev/loop0") to stdout. When the domain is destroyed, Xend will call:
>
>     /etc/xen/scripts/block-file unbind /dev/loop0
>
> Likewise specifying a block device "enbd:servername:ctlport" causes a call to "/etc/xen/scripts/block-enbd bind servername ctlport" and a subsequent "/etc/xen/scripts/block-enbd unbind /dev/enbd_node".

This is actually making the connection to the nbd server though, right?

> For iSCSI you'd define a syntax like "iscsi:target:lun", write a script to run iscsiadm (if you're using Open-iSCSI) and stick it in the config. You could probably do a similar thing to deal with your SAN devices.

Is iscsiadm used here to instantiate this target and lun or would it be already scanned in? I was under the impression iscsi gets scanned like any other scsi host.

> Thoughts anyone? It'd be very desirable to have more block scripts in the distribution. If anyone comes up with some, I don't think there'd be any problem with them going into the -testing tree.

Thanks for the write-up. I've not used nor looked at how the file backed vbds work.

I think a script callout something like this would work. Why not just make it a scsi call out? I don't think there is a need to make it different for iscsi. scsi will cover all of those types of LUs: iscsi, SAN, local scsi, sata, etc.

Assuming the device is already present, some sort of uuid may be a better choice as a parameter. The target and lun on one machine may also not be the same as on another. Maybe something like "scsi_id:WWN"? This could call a generic scsi script that used the scsi_id tool to find where the matching device is currently located. This is 2.6 dom0 specific of course.

Specifying (host, channel, target and lun) could be sufficient if the scripting had some sort of remapping ability. Then something with a more global view could make sure the mapping is set up right. But then it might be simpler to use specially named file links to point to the right device...

Phil

-- 
Philip R. Auld, Ph.D.    Egenera, Inc.
Software Architect       165 Forest St.
(508) 858-2628           Marlboro, MA 01752
Adam Heath
2005-Apr-15 16:18 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Mark Williamson wrote:
> > Would it not be better to just echo /my/file?
>
> When you write the script you echo whatever you want Xend to pass you back when it asks you to unbind the device.
>
> Echoing the device makes the "unbind" case of the block-file script nice and simple, since it can just call "losetup -d" with the argument passed by Xend.

Yes, I understand that part.

But what does xend read (or pass to the kernel) to enable the domU to access the data? If xend (or the kernel) doesn't care, wouldn't it be more efficient to pass the raw file (or device, in the case of iscsi/etc)?
Mark Williamson
2005-Apr-15 16:43 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> But what does xend read (or pass to the kernel) to enable the domU to access the data? If xend (or the kernel) doesn't care, wouldn't it be more efficient to pass the raw file (or device, in the case of iscsi/etc)?

I think we're still talking at cross purposes here. Also, when I wrote the previous e-mail I wasn't remembering everything about the code ;-)

Xend enables domU to access the data by telling the blkback driver in dom0's kernel to export a given dom0 block device to the guest. It has to be a block device. The purpose of the block script is a) to make sure such a block device exists (by binding something to it if necessary) and b) to tell Xend what it is.

For file backed VBDs, the script has to find a free loop device and bind the file to it so that there is a block device for the backend to export in the first place. For the SAN / iSCSI setups discussed the script may just serve the purpose of finding the correct block device.

In either case, it has to echo the device node to Xend so that it knows what the blkback should be told to export. For things like NBD and loop files, this device node is also used to unbind the device after the domain is destroyed.

Does that answer your question or am I still off target?

Cheers,
Mark
Adam Heath
2005-Apr-15 17:33 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Mark Williamson wrote:
> > But what does xend read (or pass to the kernel) to enable the domU to access the data? If xend (or the kernel) doesn't care, wouldn't it be more efficient to pass the raw file (or device, in the case of iscsi/etc)?
>
> I think we're still talking at cross purposes here. Also, when I wrote the previous e-mail I wasn't remembering everything about the code ;-)
>
> Xend enables domU to access the data by telling the blkback driver in dom0's kernel to export a given dom0 block device to the guest. It has to be a block device. The purpose of the block script is a) to make sure such a block device exists (by binding something to it if necessary) and b) to tell Xend what it is.
>
> For file backed VBDs, the script has to find a free loop device and bind the file to it so that there is a block device for the backend to export in the first place. For the SAN / iSCSI setups discussed the script may just serve the purpose of finding the correct block device.
>
> In either case, it has to echo the device node to Xend so that it knows what the blkback should be told to export. For things like NBD and loop files, this device node is also used to unbind the device after the domain is destroyed.
>
> Does that answer your question or am I still off target?

It does. So, it's the blkback in dom0 that requires a device node. How about if the blkback were extended to support files in filesystems?
Mark Williamson
2005-Apr-15 17:33 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> > Does that answer your question or am I still off target?
>
> It does. So, it's the blkback in dom0 that requires a device node.

Yup, that's why we have to go through the gyrations with the block scripts in order to make things look nice at the user level.

> How about if the blkback were extended to support files in filesystems?

Right now, all the backend needs to know about is what block device to plumb requests through to. I don't really think there's a nice way to make it aware of files within filesystems without effectively reimplementing the loopback driver.

From what I've heard the loop driver itself could use a bit of work to make it really useful: it comes at a performance / memory usage hit :-(. We recommend using LVM for any really serious environments for this reason.

An alternative architecture would be to have a userspace daemon for file backed VBDs, using the blktap framework (unstable tree only). I'm not sure if this would work any better than the current way of doing files, tho...

Cheers,
Mark
Adam Heath
2005-Apr-15 17:55 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Mark Williamson wrote:
> > > Does that answer your question or am I still off target?
> >
> > It does. So, it's the blkback in dom0 that requires a device node.
>
> Yup, that's why we have to go through the gyrations with the block scripts in order to make things look nice at the user level.
>
> > How about if the blkback were extended to support files in filesystems?
>
> Right now, all the backend needs to know about is what block device to plumb requests through to. I don't really think there's a nice way to make it aware of files within filesystems without effectively reimplementing the loopback driver.

The blkfront driver should request id+offset+length from the blkback driver. The blkback driver then converts the id into either a device, or a file. Seems rather straightforward to me.

> From what I've heard the loop driver itself could use a bit of work to make it really useful: it comes at a performance / memory usage hit :-(. We recommend using LVM for any really serious environments for this reason.
>
> An alternative architecture would be to have a userspace daemon for file backed VBDs, using the blktap framework (unstable tree only). I'm not sure if this would work any better than the current way of doing files, tho...

blktap? Is that similar to what I described above?
Mark Williamson
2005-Apr-15 17:56 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> The blkfront driver should request id+offset+length from the blkback driver. The blkback driver then converts the id into either a device, or a file. Seems rather straightforward to me.

OK, it's not a very cunning hack to do but it makes the blkback driver more complex whilst duplicating functionality that's already in the kernel. You'd have to deal with a load of VFS APIs as well as the existing block APIs, which would be unfortunate.

Does anybody know why the existing loopback device performs badly anyway? As Adam says, it shouldn't be rocket science to make it work well...

> > From what I've heard the loop driver itself could use a bit of work to make it really useful: it comes at a performance / memory usage hit :-(. We recommend using LVM for any really serious environments for this reason.
> >
> > An alternative architecture would be to have a userspace daemon for file backed VBDs, using the blktap framework (unstable tree only). I'm not sure if this would work any better than the current way of doing files, tho...
>
> blktap? Is that similar to what I described above?

Blktap allows you to write a userspace program to provide a block device to another domain. It makes what you described a bit more straightforward than implementing directly in the blkback driver.

Cheers,
Mark
Anthony Liguori
2005-Apr-15 17:57 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Adam Heath wrote:
> It does. So, it's the blkback in dom0 that requires a device node. How about if the blkback were extended to support files in filesystems?

I think the hardest thing to deal with is the fact that the control messages are limited to 60 bytes, meaning that if you changed the be_vbd_create message to pass file names instead of device types you'd have to support continuations to support filenames that are > ~40 bytes long.

The registry will make this a lot easier since the blkif_be device could just read a filename out of the registry. I've thought about changing the blkif_be driver to be able to access the file ops directly before and this has been the limiting factor in my mind (unless someone else has a more creative solution :-)).

Regards,
Anthony Liguori
Andrew Warfield
2005-Apr-15 18:02 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> An alternative architecture would be to have a userspace daemon for file backed VBDs, using the blktap framework (unstable tree only). I'm not sure if this would work any better than the current way of doing files, tho...

The current block tap code allows arbitrary numbers of image files to be exported directly to domains without loopback. I haven't done any performance comparison to using loopback, but it might be useful if people are mounting loads of images.

If it's something that people would use, I could certainly clean it up over the next few weeks.

a.
Adam Heath
2005-Apr-15 20:07 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Anthony Liguori wrote:
> Adam Heath wrote:
> > It does. So, it's the blkback in dom0 that requires a device node. How about if the blkback were extended to support files in filesystems?
>
> I think the hardest thing to deal with is the fact that the control messages are limited to 60 bytes, meaning that if you changed the be_vbd_create message to pass file names instead of device types you'd have to support continuations to support filenames that are > ~40 bytes long.

Er, no. The blkback allocates its own id, which is passed around between them.

The blkback then maps the id into a handle structure, which then has a void *data (or a union, if you want) that maintains a pointer to a filename, or reference to a block device, then a function dispatch table that knows how to handle the requests.
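The id-to-handle scheme proposed in the message above can be sketched in user-space Python. This is only an illustration of the idea, not blkback code: `HandleTable`, `FileBackend` and `MemoryBackend` are hypothetical names, Python method dispatch stands in for the C function table, and `MemoryBackend` is a stand-in for a real block device so the example runs without one.

```python
import os
import tempfile


class FileBackend:
    """Serves reads from a regular file in a filesystem."""

    def __init__(self, path):
        self.path = path

    def read(self, offset, length):
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)


class MemoryBackend:
    """Stand-in for a block device: serves reads from a bytes buffer."""

    def __init__(self, data):
        self.data = data

    def read(self, offset, length):
        return self.data[offset:offset + length]


class HandleTable:
    """Maps backend-allocated ids to handles.

    Each handle carries its own data and its own read() implementation,
    which plays the role of the per-handle dispatch table.
    """

    def __init__(self):
        self._next_id = 0
        self._handles = {}

    def register(self, backend):
        handle_id = self._next_id
        self._next_id += 1
        self._handles[handle_id] = backend
        return handle_id

    def read(self, handle_id, offset, length):
        # The caller only supplies id + offset + length; the handle
        # decides whether that means a file or a device.
        return self._handles[handle_id].read(offset, length)


if __name__ == "__main__":
    table = HandleTable()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789")
    file_id = table.register(FileBackend(tmp.name))
    dev_id = table.register(MemoryBackend(b"abcdefghij"))
    print(table.read(file_id, 2, 3), table.read(dev_id, 2, 3))
    os.unlink(tmp.name)
```

The sketch deliberately leaves out the step the rest of the thread turns on: how the filename reaches the backend in the first place, before an id exists.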
Adam Heath
2005-Apr-15 20:08 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Mark Williamson wrote:
> > The blkfront driver should request id+offset+length from the blkback driver. The blkback driver then converts the id into either a device, or a file. Seems rather straightforward to me.
>
> OK, it's not a very cunning hack to do but it makes the blkback driver more complex whilst duplicating functionality that's already in the kernel. You'd have to deal with a load of VFS APIs as well as the existing block APIs, which would be unfortunate.

Maybe reuse the loopback code? Maybe it has an abstract device->file conversion group of functions/structures.

> Does anybody know why the existing loopback device performs badly anyway? As Adam says, it shouldn't be rocket science to make it work well...

Default kernel limit of 8 loops. Requires a recompile, and then probably only supports 256 devices max.

> > blktap? Is that similar to what I described above?
>
> Blktap allows you to write a userspace program to provide a block device to another domain. It makes what you described a bit more straightforward than implementing directly in the blkback driver.

That would be more complex, I would think, than doing file access in the kernel.
Mark Williamson
2005-Apr-15 20:14 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> > OK, it's not a very cunning hack to do but it makes the blkback driver more complex whilst duplicating functionality that's already in the kernel. You'd have to deal with a load of VFS APIs as well as the existing block APIs, which would be unfortunate.
>
> Maybe reuse the loopback code? Maybe it has an abstract device->file conversion group of functions/structures.

Arguably the ideal in-kernel solution is to fix the loopback driver itself.

> > Does anybody know why the existing loopback device performs badly anyway? As Adam says, it shouldn't be rocket science to make it work well...
>
> Default kernel limit of 8 loops. Requires a recompile, and then probably only supports 256 devices max.

Yup, the default limit could (arguably) be raised in our default config. I was especially referring to the performance, which is reported to be rather bad.

> > > blktap? Is that similar to what I described above?
> >
> > Blktap allows you to write a userspace program to provide a block device to another domain. It makes what you described a bit more straightforward than implementing directly in the blkback driver.
>
> That would be more complex, I would think, than doing file access in the kernel.

I'm not sure it would, the blocktap library is designed for this sort of thing. From what Andy says there is actually already a tool to do it, which might be worth investigating further.

Cheers,
Mark
Mark Williamson
2005-Apr-15 20:16 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
> Er, no. The blkback allocates its own id, which is passed around between
> them.

But how would you tell the blkback what file it was meant to be accessing before it allocates that ID?

> The blkback then maps the id into a handle structure, which then has a void
> *data (or a union, if you want) that maintains a pointer to a filename, or
> reference to a block device, then a function dispatch table that knows how
> to handle the requests.

Yup.

Cheers,
Mark
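For readers following the thread, the id -> handle -> dispatch-table idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Handle`, `Backend`, `file_read`, and `register` are hypothetical names, not anything in blkback, and an in-memory buffer stands in for a disk image file.

```python
import io
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Handle:
    """Hypothetical per-id handle: opaque data plus a dispatch entry."""
    data: object                                # file object or block device reference
    read: Callable[[object, int, int], bytes]   # (data, offset, length) -> bytes


def file_read(data, offset, length):
    """Dispatch entry for file-backed handles."""
    data.seek(offset)
    return data.read(length)


class Backend:
    """Hypothetical backend that allocates ids and maps them to handles."""

    def __init__(self):
        self._handles: Dict[int, Handle] = {}
        self._next_id = 0

    def register(self, handle):
        # The backend allocates the id; dom0 tooling (not the guest) sets up the mapping.
        vbd_id = self._next_id
        self._next_id += 1
        self._handles[vbd_id] = handle
        return vbd_id

    def request(self, vbd_id, offset, length):
        # The frontend only ever sends id + offset + length.
        handle = self._handles[vbd_id]
        return handle.read(handle.data, offset, length)


backend = Backend()
image = io.BytesIO(b"0123456789abcdef")  # in-memory stand-in for a disk image file
vbd = backend.register(Handle(data=image, read=file_read))
```

A block-device-backed handle would supply a different `read` function behind the same id, which is the point of the dispatch table.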
Mark Williamson
2005-Apr-15 20:29 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Friday 15 April 2005 21:46, Adam Heath wrote:
> On Fri, 15 Apr 2005, Mark Williamson wrote:
> > > Er, no. The blkback allocates its own id, which is passed around
> > > between them.
> >
> > But how would you tell the blkback what file it was meant to be accessing
> > before it allocates that ID.
>
> xend would tell the blkback driver (in dom0) what it should be accessing, before
> constructing the domU instance.

But Xend configures the blkback driver using the control interface messages, so the sizing of the ctrlif messages is still a problem (that was what Anthony was saying, I think).

That said, there would, of course, be ways this could be fixed. One nice side effect of supporting passing filenames to the blkback would be the ability to configure file-based block devices when the backend is not dom0.

Cheers,
Mark
Anthony Liguori
2005-Apr-15 20:40 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Adam Heath wrote:
> Er, no. The blkback allocates its own id, which is passed around between
> them.
>
> The blkback then maps the id into a handle structure, which then has a void
> *data (or a union, if you want) that maintains a pointer to a filename, or
> reference to a block device, then a function dispatch table that knows how to
> handle the requests.

Are you suggesting this is how it should work?

In xen-unstable, the control tools communicate through a ring-queue (a fixed length queue with a maximum message size of 60 bytes) to the device backends and frontends. They do not map any memory (besides the ring queue).

The virtual block device creation process looks something like this:

1) control tools send a create message through the ring queue to the backend
2) for each virtual block device, control tools send a BLKIF_BE_VBD_CREATE message to the backend

This message is fixed length (see /usr/include/xen/io/domain_controller.h--blkif_be_vbd_create_t) and has two fields for the backend device number (pdevice) and the frontend device number (vdevice).

When the frontend boots up, it sends the control tools a series of messages. One of those messages contains a shared memory handle which the control tools then pass to the backend. The backend maps this memory address and the frontend and backend use this memory area to do the actual operations of the block device.

The control tools can only communicate with the backend (right now) via the ring queue. You can't assume that the tools are in the same domain as the backend (so you can't just do an ioctl or something to the kernel).

If you wanted to support passing files, you would have to extend the blkif_be_vbd_create_t structure to communicate a filename. This is the problem.

Regards,
Anthony Liguori
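To make the size constraint above concrete, here is a rough sketch of a fixed-length create message and what happens when a filename is appended to it. Only the 60-byte limit and the `pdevice`/`vdevice` field names come from the thread; the 16-bit field widths, byte order, and both function names are assumptions for illustration and do not reproduce the real blkif_be_vbd_create_t layout.

```python
import struct

MAX_MSG = 60  # maximum control message size mentioned in the thread

# Assumed layout: two 16-bit device numbers (pdevice, vdevice), little-endian.
VBD_CREATE = struct.Struct("<HH")


def pack_vbd_create(pdevice, vdevice):
    """Pack the fixed-length create message carrying only device numbers."""
    return VBD_CREATE.pack(pdevice, vdevice)


def pack_vbd_create_with_path(pdevice, vdevice, path):
    """Hypothetical extension: append a NUL-terminated filename to the message."""
    payload = VBD_CREATE.pack(pdevice, vdevice) + path.encode() + b"\0"
    if len(payload) > MAX_MSG:
        raise ValueError(
            "message is %d bytes, limit is %d" % (len(payload), MAX_MSG)
        )
    return payload


# Device numbers fit easily; 0x0801 is major 8, minor 1 in the legacy encoding.
fixed = pack_vbd_create(0x0801, 0x0301)
short = pack_vbd_create_with_path(0x0801, 0x0301, "/images/guest.img")
```

Under these assumed field sizes, a path longer than 55 bytes no longer fits in a single message, which is why the thread goes on to talk about continuations.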
Anthony Liguori
2005-Apr-15 20:45 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Mark Williamson wrote:
> > Maybe reuse the loopback code? Maybe it has an abstract device->file
> > conversion group of functions/structures.
>
> Arguably the ideal in-kernel solution is to fix the loopback driver itself.

I agree. There's no reason to special-case files over other block devices.

> I'm not sure it would, the blocktap library is designed for this sort of
> thing. From what Andy says there is actually already a tool to do it, which
> might be worth investigating further.

All the reads/writes for a block device go to userspace when using blocktap, right? Has any performance testing been done?

> Cheers,
> Mark
Adam Heath
2005-Apr-15 20:46 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Mark Williamson wrote:
> > Er, no. The blkback allocates its own id, which is passed around between
> > them.
>
> But how would you tell the blkback what file it was meant to be accessing
> before it allocates that ID.

xend would tell the blkback driver (in dom0) what it should be accessing, before constructing the domU instance.
Anthony Liguori
2005-Apr-15 20:47 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Adam Heath wrote:
> On Fri, 15 Apr 2005, Anthony Liguori wrote:
>
> I hope you aren't thinking I'm suggesting having domU be able to request
> access to any arbitrary file; I'm suggesting that dom0 configure the mapping.
> Read my other mail.

The problem is how Xend communicates the file->id mapping. It has to do this over the control channel, which limits the length of the filename to 60 characters unless you support continuations. That's not extraordinarily difficult, but it's annoying enough that it makes just fixing the loopback device appealing.

Regards,
Anthony Liguori
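As a rough idea of what "continuations" could look like, the sketch below splits a long filename across several messages of at most 60 bytes and reassembles it on the other side. The two-byte header (an id byte plus a "more follows" flag) and the function names are invented for illustration; the thread does not specify any continuation format.

```python
MAX_MSG = 60                 # message size limit from the thread
HEADER = 2                   # assumed header: 1 byte id, 1 byte "more follows" flag
CHUNK = MAX_MSG - HEADER     # filename bytes carried per message


def split_filename(vbd_id, path):
    """Split a path into messages of at most MAX_MSG bytes (vbd_id must be 0-255)."""
    raw = path.encode()
    chunks = [raw[i:i + CHUNK] for i in range(0, len(raw), CHUNK)] or [b""]
    messages = []
    for n, chunk in enumerate(chunks):
        more = 1 if n < len(chunks) - 1 else 0
        messages.append(bytes([vbd_id, more]) + chunk)
    return messages


def reassemble(messages):
    """Rebuild the path, stopping at the first message whose flag is 0."""
    parts = []
    for msg in messages:
        parts.append(msg[HEADER:])
        if msg[1] == 0:
            break
    return b"".join(parts).decode()


path = "/var/lib/xen/images/" + "x" * 100 + ".img"
msgs = split_filename(7, path)
```

Even in this toy form, both ends need extra state to track partially received names, which is roughly the annoyance described above.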
Adam Heath
2005-Apr-15 20:48 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
On Fri, 15 Apr 2005, Anthony Liguori wrote:
> Adam Heath wrote:
>
> > Er, no. The blkback allocates its own id, which is passed around between
> > them.
> >
> > The blkback then maps the id into a handle structure, which then has a void
> > *data (or a union, if you want) that maintains a pointer to a filename, or
> > reference to a block device, then a function dispatch table that knows how to
> > handle the requests.
>
> Are you suggesting this is how it should work?
>
> In xen-unstable, the control tools communicate through a ring-queue (a
> fixed length queue with a maximum message size of 60 bytes) to the device
> backends and frontends. They do not map any memory (besides the ring
> queue).
>
> The virtual block device creation process looks something like this:
>
> 1) control tools send a create message through the ring queue to the backend
> 2) for each virtual block device, control tools send a
> BLKIF_BE_VBD_CREATE message to the backend
>
> This message is fixed length (see
> /usr/include/xen/io/domain_controller.h--blkif_be_vbd_create_t) and has
> two fields for the backend device number (pdevice) and the frontend
> device number (vdevice).
>
> When the frontend boots up, it sends the control tools a series of
> messages. One of those messages contains a shared memory handle which the
> control tools then pass to the backend. The backend maps this memory address
> and the frontend and backend use this memory area to do the actual
> operations of the block device.
>
> The control tools can only communicate with the backend (right now) via
> the ring queue. You can't assume that the tools are in the same domain
> as the backend (so you can't just do an ioctl or something to the kernel).
>
> If you wanted to support passing files, you would have to extend the
> blkif_be_vbd_create_t structure to communicate a filename. This is the
> problem.

I hope you aren't thinking I'm suggesting having domU be able to request access to any arbitrary file; I'm suggesting that dom0 configure the mapping. Read my other mail.
Anthony Liguori
2005-Apr-15 21:05 UTC
Re: Disk naming (Was Re: [Xen-devel] [PATCH] Guest boot loadersupport [1/2])
Mark Williamson wrote:
> But Xend configures the blkback driver using the control interface messages,
> so the sizing of the ctrlif messages is still a problem (that was what
> Anthony was saying, I think).

I was trying at least :-)

> That said, there would, of course, be ways this could be fixed. One nice side
> effect of supporting passing filenames to the blkback would be the ability to
> configure file-based block devices when the backend is not dom0.

This brings up a really interesting point. Is there a good story yet for how more complex devices can be created on driver domains? For instance, how would you create an iSCSI device that existed on a driver domain (or is this something that wouldn't be all that useful)?

Can we assume an rexec capability between dom0 and a driver domain?

Regards,
Anthony Liguori

> Cheers,
> Mark