The following series adds support for NetBSD gntdev to libxc, and makes libxl use Qemu as a disk backend if an image file stored on a remote filesystem is used. Patch 2/3 is just a fix for the comment in function libxl__try_phy_backend.
On Thu, Nov 29, 2012 at 06:31:45PM +0100, Roger Pau Monne wrote:
> The following series adds support for NetBSD gntdev to libxc, and
> makes libxl use Qemu as a disk backend if an image file stored on a
> remote filesystem is used.

Can't you let the administrator decide this in the domU's config file?
Right now vnd on nfs has problems, but nothing unfixable.
So one day you'll want to use the kernel driver for nfs too.
But maybe other local filesystems will have the problem.
So it's not for the tool to decide behind the admin's back.

--
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
On 29/11/12 19:56, Manuel Bouyer wrote:
> On Thu, Nov 29, 2012 at 06:31:45PM +0100, Roger Pau Monne wrote:
>> The following series adds support for NetBSD gntdev to libxc, and
>> makes libxl use Qemu as a disk backend if an image file stored on a
>> remote filesystem is used.
>
> Can't you let the administrator decide this in the domU's config file?
> Right now vnd on nfs has problems, but nothing unfixable.
> So one day you'll want to use the kernel driver for nfs too.
> But maybe other local filesystems will have the problem.
> So it's not for the tool to decide behind the admin's back.

We can always add a configure check later when this is fixed to decide
if it is needed or not. I don't like the idea of having to set a
configuration option if you want to use disk images on NFS, because if
you don't know about this specific issue your whole Dom0 will crash,
which is really not desired or user friendly.
On 30/11/12 09:52, Manuel Bouyer wrote:
> On Fri, Nov 30, 2012 at 09:42:52AM +0100, Roger Pau Monné wrote:
>> On 29/11/12 19:56, Manuel Bouyer wrote:
>>> On Thu, Nov 29, 2012 at 06:31:45PM +0100, Roger Pau Monne wrote:
>>>> The following series adds support for NetBSD gntdev to libxc, and
>>>> makes libxl use Qemu as a disk backend if an image file stored on a
>>>> remote filesystem is used.
>>>
>>> Can't you let the administrator decide this in the domU's config file?
>>> Right now vnd on nfs has problems, but nothing unfixable.
>>> So one day you'll want to use the kernel driver for nfs too.
>>> But maybe other local filesystems will have the problem.
>>> So it's not for the tool to decide behind the admin's back.
>>
>> We can always add a configure check later when this is fixed to decide
>> if it is needed or not. I don't like the idea of having to set a
>> configuration option if you want to use disk images on NFS, because if
>> you don't know about this specific issue your whole Dom0 will crash,
>> which is really not desired or user friendly.
>
> And I don't like the idea of software doing things behind my back.
> And, beside this, I don't think local vs remote is the right criterion.
> There are remote filesystems which may play nice with vnd. There
> are local filesystems that may not play nice with vnd.

I would agree with you if this was a DomU crash, but in this case the
crash happens in the Dom0, and every DomU that the system might be
running crashes completely. This is not acceptable in any way from my
point of view.

I think we should not expect the user to be aware of this kind of
problem. If we cannot guarantee that the vnd driver is functional for
all filesystems, we should not use it. From my point of view reliability
should always come before performance.
On Fri, Nov 30, 2012 at 10:57:57AM +0100, Roger Pau Monné wrote:
> There are also other tools that build on top of libxl, like libvirt, are
> we going to modify those high level tools to add a new option to the
> config file if the disk of a DomU is in NFS and the Dom0 is NetBSD? I

No "if": the admin should be able to choose what backend to use,
regardless of the storage used, and probably of the dom0 OS (you have
several backends available in Linux as well, don't you?).

> don't think we should take that road, I think libxl should take care of
> all those quirks, and provide an uniform layer that can be trusted
> independently of the environment, so the same configuration file can be
> used in all supported Dom0 OSes.

Oh, that won't work anyway. The options to set up the network are
different, for example.

> Not fixing this in libxl just moves the problem a layer upper, where
> there's a lot more of options, and of course a lot more of work to track
> and fix them all.

Then, as a compromise, could the default be specified in xend-config.sxp
and be overridable in the domU's config file? This way, the admin can
set a sane default for his setup, and eventually decide what to use on a
per-domU basis.

--
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
On Fri, 2012-11-30 at 09:57 +0000, Roger Pau Monné wrote:
> On 30/11/12 10:41, Manuel Bouyer wrote:
> > On Fri, Nov 30, 2012 at 10:21:02AM +0100, Roger Pau Monné wrote:
> >>> And I don't like the idea of software doing things behind my back.
> >>> And, beside this, I don't think local vs remote is the right criterion.
> >>> There are remote filesystems which may play nice with vnd. There
> >>> are local filesystems that may not play nice with vnd.
> >>
> >> I would agree with you if this was a DomU crash, but in this case the
> >> crash happens in the Dom0, and every DomU that the system might be
> >> running crashes completely. This is not acceptable in any way from my
> >> point of view.
> >>
> >> I think we should not expect the user to be aware of this kind of
> >> problem. If we cannot guarantee that the vnd driver is functional for
> >> all filesystems, we should not use it. From my point of view reliability
> >> should always come before performance.
> >
> > In my POV, the admin should know what he's doing. This includes being
> > aware of the limitations of the software he uses. A 50% performance
> > loss just "in case" is not acceptable. Or at least there should be a
> > way for the admin to revert to an acceptable configuration,
> > performance wise, without hacking and rebuilding from sources.
>
> There are also other tools that build on top of libxl, like libvirt, are
> we going to modify those high level tools to add a new option to the
> config file if the disk of a DomU is in NFS and the Dom0 is NetBSD? I
> don't think we should take that road, I think libxl should take care of
> all those quirks, and provide an uniform layer that can be trusted
> independently of the environment, so the same configuration file can be
> used in all supported Dom0 OSes.
>
> Not fixing this in libxl just moves the problem a layer upper, where
> there's a lot more of options, and of course a lot more of work to track
> and fix them all.

libxl only selects the backend itself if the caller doesn't provide one.
If the caller sets the backend field != UNKNOWN then libxl will (try)
and use it. This field is exposed by xl via the backendtype= key in the
disk configuration
http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt

AIUI libvirt reuses xl's syntax and so would inherit this for free. In
any case any upper layer building on libxl is likely to want to provide
the option to manually select a specific backend.

I think this satisfies Manuel's "there should be a way for the admin..."
requirement.

I agree that it doesn't seem right to expect upper layers to
automatically set this manual override, IYSWIM, though.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
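The selection rule Ian describes (an explicit backend wins, and automatic selection only happens when the field is UNKNOWN) can be sketched roughly as follows. This is an illustration only, not libxl code: `DiskBackend`, `parse_backendtype` and `choose_backend` are hypothetical names, the parser handles only a simplified comma-separated key=value form of the disk specification, and the automatic choice is passed in as a callback because its real logic is exactly what this thread is debating.

```python
from enum import Enum
from typing import Callable


class DiskBackend(Enum):
    """Backend names mentioned in the thread; UNKNOWN means 'caller did not choose'."""
    UNKNOWN = "unknown"
    PHY = "phy"      # in-kernel backend
    TAP = "tap"      # blktap
    QDISK = "qdisk"  # Qemu


def parse_backendtype(disk_spec: str) -> DiskBackend:
    """Return the backendtype= value from a simplified key=value disk spec.

    Hypothetical helper: the real xl syntax has more forms than this handles.
    """
    for field in disk_spec.split(","):
        key, sep, value = field.strip().partition("=")
        if sep and key == "backendtype":
            return DiskBackend(value)  # raises ValueError on an unknown name
    return DiskBackend.UNKNOWN


def choose_backend(requested: DiskBackend,
                   auto_select: Callable[[], DiskBackend]) -> DiskBackend:
    """Honour an explicit request; fall back to the default policy otherwise."""
    if requested is not DiskBackend.UNKNOWN:
        return requested
    return auto_select()


if __name__ == "__main__":
    spec = "format=raw, vdev=xvda, access=rw, backendtype=qdisk, target=/srv/disk.img"
    print(choose_backend(parse_backendtype(spec), lambda: DiskBackend.PHY))
```

Under this reading, the disagreement in the thread is only about what `auto_select` should return; an admin who sets backendtype= never reaches it.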
On Fri, Nov 30, 2012 at 10:32:32AM +0000, Ian Campbell wrote:
> libxl only selects the backend itself if the caller doesn't provide one.
> If the caller sets the backend field != UNKNOWN then libxl will (try)
> and use it. This field is exposed by xl via the backendtype= key in the
> disk configuration
> http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt

Thanks for pointing this out.
I guess qdisk is the qemu backend, and tap would be the in-kernel backend?
Is there a way to specify in a config file which default xl should use?

--
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
On 11/30/12 5:50 AM, Manuel Bouyer wrote:
> On Fri, Nov 30, 2012 at 10:43:21AM +0000, Ian Campbell wrote:
>> On Fri, 2012-11-30 at 10:38 +0000, Manuel Bouyer wrote:
>>> On Fri, Nov 30, 2012 at 10:32:32AM +0000, Ian Campbell wrote:
>>>> libxl only selects the backend itself if the caller doesn't provide one.
>>>> If the caller sets the backend field != UNKNOWN then libxl will (try)
>>>> and use it. This field is exposed by xl via the backendtype= key in the
>>>> disk configuration
>>>> http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt
>>> thanks for pointing this out.
>>> I guess qdisk is the qemu backend, and tap would be the in-kernel backend ?
>> qdisk == qemu, tap == blktap, phy == in kernel.
> OK; but then, how does the script called by xenbackendd know what setup
> it should do ? With xm, it would get a string in the form
> phy:/dev/wd0e
> or
> file:/domains/foo.img
>
> but from what I've understood, this syntax is deprecated now ?

Hi Manuel, thanks for all of your work on xen.

So, the short answer is that the /usr/pkg/etc/xen/scripts/block
executable (which is just a shell script) needs to fish the type of the
backend and other extra info out of xenstore. In the case of netbsd's
block script, it uses the xenstore-read utility. When the block script
is called, $1 ($xpath) will be an entry like so
/local/domain/0/backend/vbd/2/768; so this is vbd for domU instance #2
with disk instance id 768. $2 ($xstatus) is the reason the block script
is called, 2 for startup, and 6 for tear down.

If you look at the block script you can also see that the block script
needs to fish out the type of the backend that is located at $xpath/type
in xenstore. The block script utilizes the xenstore-read and
xenstore-write to read/modify xenstore, if you notice.
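To make the two arguments concrete, here is a small sketch that decodes them the way the paragraphs above describe. The path layout and the status values 2 and 6 are taken from this message; the function and type names (`parse_vbd_xpath`, `VbdBackend`, `describe_status`) are hypothetical, and reading the 0 in the path as the backend domain id is an assumption.

```python
from typing import NamedTuple

# Status values as described in the message above.
STATUS_STARTUP = "2"
STATUS_TEARDOWN = "6"


class VbdBackend(NamedTuple):
    backend_domid: int   # assumed: the domain hosting the backend (0 in the example)
    frontend_domid: int  # the domU instance
    devid: int           # the disk instance id


def parse_vbd_xpath(xpath: str) -> VbdBackend:
    """Split a path such as /local/domain/0/backend/vbd/2/768 into its ids."""
    parts = xpath.strip("/").split("/")
    # Expected shape: local/domain/<backend>/backend/vbd/<domU>/<device>
    if (len(parts) != 7 or parts[:2] != ["local", "domain"]
            or parts[3:5] != ["backend", "vbd"]):
        raise ValueError(f"not a vbd backend path: {xpath!r}")
    return VbdBackend(int(parts[2]), int(parts[5]), int(parts[6]))


def describe_status(xstatus: str) -> str:
    """Map the second script argument to a readable reason."""
    return {STATUS_STARTUP: "startup", STATUS_TEARDOWN: "teardown"}.get(xstatus, "other")


if __name__ == "__main__":
    info = parse_vbd_xpath("/local/domain/0/backend/vbd/2/768")
    print(info, describe_status("2"))
```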
So, under $xpath here are the pertinent entries:
- $xpath/type
- $xpath/params
- $xpath/physical-disk

With xm+xend, for disk config file:/var/xen/domu/server001/disk.img,
prior to the block script getting called:
$xpath/type = 'file'
$xpath/params = '/var/xen/domu/server001/disk.img'
$xpath/physical-disk = <unspecified>

Then when the block script is called, it detects that the type is 'file'
and so it would call vnconfig using the value $xpath/params as the
actual file using an unused vnd device, say vnd2. Then it would modify
$xpath/physical-disk using the xenstore-write utility and set the value
of $xpath/physical-disk to /dev/vnd2. After that, everything else is
handled as if the backend is an actual physical device, which is true.

If the value of $xpath/type is 'phy', nothing needs to be done, since
$xpath/physical-disk has been set up properly.

Now, the above behavior is how xm+xend+xenbackendd works. There was a
bug in libxl/xl that I fixed as described here
http://mail-index.netbsd.org/port-xen/2012/05/29/msg007252.html so that
libxl/xl would behave the same way. You might also notice, I made small
changes to allow custom backend type. I don't mean to keep tooting my
own horn, but I just don't want the fix to get lost, although I just
learned that I probably should do a bug report with the patch, but I
don't have much time right now.

Hopefully I don't bore you to death if you read this far, but the way
the libxl/xl is going seems somewhat ridiculous, where they are trying
to insert more and more policy vs functionality, as evidenced by Roger's
effort to outright decide (hence a policy) that 'well, vnd can't work
with file on NFS, so be damned with it and just route everything via
qemu-dm'. Additionally, they are planning to retire xenbackendd and
again this is a trend that I don't like where they clump everything into
one giant ball called the libxl/xl, as opposed to a bunch of little
executables doing specific things.
Another illustration: you might ask who's managing the domU in the
absence of xend; well what happens is that when you do 'xl create ...'
xl would then daemonize itself to watch xenstore for that specific domU
that it just created. I guess that's cool, but to me it's got the feel
of putting everything into one giant, monolithic, inflexible entity
(shudder).

So, I'm just writing to give you some more information about this whole
libxl/xl stuff, and hopefully it'll give you some ammunition so that some
of the xen folks don't try to force policies that are claimed to protect
the 'end-user' or are very linux specific.

Cheers,
Toby
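The startup flow described in this message (read $xpath/type, and for 'file' attach the image to a free vnd device and record it in $xpath/physical-disk) can be modelled in a few lines. This is a simulation under stated assumptions, not the NetBSD block script: a plain dict stands in for xenstore, and `attach_vnd` is a hypothetical callback standing in for the vnconfig step.

```python
from typing import Callable, Dict


def handle_block_startup(store: Dict[str, str], xpath: str,
                         attach_vnd: Callable[[str], str]) -> str:
    """Return the device the backend should use, updating the store if needed.

    store      -- stand-in for xenstore (key -> value)
    attach_vnd -- hypothetical helper: takes an image path, returns a device
                  node such as /dev/vnd2
    """
    backend_type = store[f"{xpath}/type"]
    if backend_type == "phy":
        # Per the description above, nothing to do: the entry is already set.
        return store[f"{xpath}/physical-disk"]
    if backend_type == "file":
        image = store[f"{xpath}/params"]
        device = attach_vnd(image)
        # Equivalent of the xenstore-write step in the description.
        store[f"{xpath}/physical-disk"] = device
        return device
    raise ValueError(f"unsupported backend type: {backend_type!r}")


if __name__ == "__main__":
    xpath = "/local/domain/0/backend/vbd/2/768"
    store = {
        f"{xpath}/type": "file",
        f"{xpath}/params": "/var/xen/domu/server001/disk.img",
    }
    print(handle_block_startup(store, xpath, lambda image: "/dev/vnd2"))
    print(store[f"{xpath}/physical-disk"])
```

Because the attach step is just a parameter here, swapping vnd for something else changes one callback, which is the kind of overridability the message argues the shell script already gives.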
On 03/12/12 20:47, Toby Karyadi wrote:
> On 11/30/12 5:50 AM, Manuel Bouyer wrote:
>> On Fri, Nov 30, 2012 at 10:43:21AM +0000, Ian Campbell wrote:
>>> On Fri, 2012-11-30 at 10:38 +0000, Manuel Bouyer wrote:
>>>> On Fri, Nov 30, 2012 at 10:32:32AM +0000, Ian Campbell wrote:
>>>>> libxl only selects the backend itself if the caller doesn't provide one.
>>>>> If the caller sets the backend field != UNKNOWN then libxl will (try)
>>>>> and use it. This field is exposed by xl via the backendtype= key in the
>>>>> disk configuration
>>>>> http://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt
>>>> thanks for pointing this out.
>>>> I guess qdisk is the qemu backend, and tap would be the in-kernel backend ?
>>> qdisk == qemu, tap == blktap, phy == in kernel.
>> OK; but then, how does the script called by xenbackendd know what setup
>> it should do ? With xm, it would get a string in the form
>> phy:/dev/wd0e
>> or
>> file:/domains/foo.img
>>
>> but from what I've understood, this syntax is deprecated now ?
>
> Hi Manuel, thanks for all of your work on xen.
>
> So, the short answer is that the /usr/pkg/etc/xen/scripts/block
> executable (which is just a shell script) needs to fish the type of the
> backend and other extra info out of xenstore. In the case of netbsd's
> block script, it uses the xenstore-read utility. When the block script
> is called, $1 ($xpath) will be an entry like so
> /local/domain/0/backend/vbd/2/768; so this is vbd for domU instance #2
> with disk instance id 768. $2 ($xstatus) is the reason the block script
> is called, 2 for startup, and 6 for tear down.
>
> If you look at the block script you can also see that the block script
> needs to fish out the type of the backend that is located at $xpath/type
> in xenstore. The block script utilizes the xenstore-read and
> xenstore-write to read/modify xenstore, if you notice.
>
> So, under $xpath here are the pertinent entries:
> - $xpath/type
> - $xpath/params
> - $xpath/physical-disk
>
> With xm+xend, for disk config file:/var/xen/domu/server001/disk.img,
> prior to the block script getting called:
> $xpath/type = 'file'
> $xpath/params = '/var/xen/domu/server001/disk.img'
> $xpath/physical-disk = <unspecified>
>
> Then when the block script is called, it detects that the type is 'file'
> and so it would call vnconfig using the value $xpath/params as the
> actual file using an unused vnd device, say vnd2. Then it would modify
> $xpath/physical-disk using the xenstore-write utility and set the value
> of $xpath/physical-disk to /dev/vnd2. After that, everything else is
> handled as if the backend is an actual physical device, which is true.
>
> If the value of $xpath/type is 'phy', nothing needs to be done, since
> $xpath/physical-disk has been set up properly.
>
> Now, the above behavior is how xm+xend+xenbackendd works. There was a
> bug in libxl/xl that I fixed as described here
> http://mail-index.netbsd.org/port-xen/2012/05/29/msg007252.html so that
> libxl/xl would behave the same way. You might also notice, I made small
> changes to allow custom backend type. I don't mean to keep tooting my
> own horn, but I just don't want the fix to get lost, although I just
> learned that I probably should do a bug report with the patch, but I
> don't have much time right now.
>
> Hopefully I don't bore you to death if you read this far, but the way
> the libxl/xl is going seems somewhat ridiculous, where they are trying
> to insert more and more policy vs functionality, as evidenced by Roger's
> effort to outright decide (hence a policy) that 'well, vnd

I don't understand this whole argument about "policy vs functionality",
from my point of view what we have now is also a policy, every raw disk
file is attached using the vnd device.
And getting a gntdev certainly isn't a policy, on the other hand this is
going to get us much more functionality (like the ability to run
backends in userspace, for example support for blktap3 when it is
released).

> can't work with file on NFS, so be damned with it and just route
> everything via qemu-dm'.

Again, as discussed with Manuel, a better solution has to be found for
this issue, and I'm sure we can reach a consensus.

> Additionally, they are planning to retire
> xenbackendd and again this is a trend that I don't like where they clump
> everything into one giant ball called the libxl/xl, as opposed to a

The retirement of xenbackendd is done for a good reason, calling
hotplug scripts from libxl allows for better control of when hotplug
scripts are called, and also allows for better error handling if these
scripts fail. Same scripts are called, so functionality stays the same.

This was done for both NetBSD and Linux, in the past Linux used to call
hotplug scripts from udev, and now they are called from libxl too. I'm
not able to see how having a more unified hotplug script interface can
be a bad thing.

> bunch of little executables doing specific things. Another illustration:
> you might ask who's managing the domU in the absence of xend; well what
> happens is that when you do 'xl create ...' xl would then daemonize
> itself to watch xenstore for that specific domU that it just created. I
> guess that's cool, but to me it's got the feel of putting everything into
> one giant, monolithic, inflexible entity (shudder).

Again, I'm not able to see how this is a problem, in the past you had
xend, which was a gigantic piece of python code acting as a central
arbiter, now you have a small C daemon for each running domain. libxl is
generally considered a better piece of code, and is much easier to
maintain than xend.
Do you have any concrete technical reason to believe that xend was better
than libxl?

> So, I'm just writing to give you some more information about this whole
> libxl/xl stuff, and hopefully it'll give you some ammunition so that some
> of the xen folks don't try to force policies that are claimed to protect
> the 'end-user' or are very linux specific.

libxl has even split OS specific routines into separate files, both for
Linux and NetBSD, and frankly, I'm not able to see how that is worse
than the amount of patches we had in pkgsrc to have xend working. Now
libxl works out-of-the-box on NetBSD, and many efforts have been put
into that. We should focus our efforts on getting Xen to work on NetBSD
without a ton of NetBSD specific out of the tree patches, and right now
my biggest concern is getting Qemu upstream working.
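As a loose illustration of the error-handling argument made in this message (a caller that runs the hotplug script itself sees its exit status and output directly), here is a minimal sketch. It is not how libxl is implemented (libxl is written in C); `run_hotplug` and `HotplugError` are hypothetical names, and only the two arguments ($xpath, $xstatus) come from the thread.

```python
import subprocess
from typing import Sequence


class HotplugError(RuntimeError):
    """Raised when the hotplug script exits with a non-zero status."""


def run_hotplug(script_argv: Sequence[str], xpath: str, xstatus: str,
                timeout: float = 10.0) -> str:
    """Run a hotplug script synchronously and surface its failure to the caller.

    script_argv -- command to execute, e.g. a path to a block script
    xpath       -- backend path passed as the script's first argument
    xstatus     -- reason code passed as the script's second argument
    """
    result = subprocess.run(
        [*script_argv, xpath, xstatus],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        raise HotplugError(
            f"{script_argv[0]} exited with {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout
```

The caller decides when the script runs and what to do on failure, which is the control the message says was missing when the scripts were triggered from a separate daemon or from udev.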
Whoops, this was meant to be a correspondence just for Manuel. But since
the cat is out of the bag...

On 12/3/12 4:34 PM, Roger Pau Monné wrote:
> I don't understand this whole argument about "policy vs
> functionality", from my point of view what we have now is also a
> policy, every raw disk file is attached using the vnd device. And
> getting a gntdev certainly isn't a policy, on the other hand this is
> going to get us much more functionality (like the ability to run
> backends in userspace, for example support for blktap3 when it is
> released).

Yes, on netbsd file: disk config will use vnd, but only because of the
way the block script is written. The functionality of the block script
is well defined, and therefore is much easier to override than, say,
patching libxl. Therefore, it's not a 'policy', but more of a default
functionality, since anyone who has some knowledge about shell script
can modify it to do something else, e.g. run the vnd under rump (is that
possible?) to avoid dom0 blowing up. But if it is decided that disk
config that begins with file: is to always use qemu-dm just because of
the off chance that using vnd over NFS can blow up the dom0, and if
there is no way to override it, or it is really difficult (which is
relative, I know), then it becomes a 'policy'. Can I set up iscsi through
hotplug for example? To be honest, I haven't looked into how the hotplug
script works, so hopefully it can provide equivalent functionality and
flexibility.

> Again, as discussed with Manuel, a better solution has to be found for
> this issue, and I'm sure we can reach a consensus.

That's what I'm hoping.

> The retirement of xenbackendd is done for a good reason, calling
> hotplug scripts from libxl allows for better control of when hotplug
> scripts are called, and also allows for better error handling if these
> scripts fail. Same scripts are called, so functionality stays the
> same.
> This was done for both NetBSD and Linux, in the past Linux used
> to call hotplug scripts from udev, and now they are called from libxl
> too. I'm not able to see how having a more unified hotplug script
> interface can be a bad thing. Again, I'm not able to see how this is a
> problem, in the past you had xend, which was a gigantic piece of
> python code acting as a central arbiter, now you have a small C daemon
> for each running domain. libxl is generally considered a better piece
> of code, and is much easier to maintain than xend. Do you have
> any concrete technical reason to believe that xend was better than libxl?

Like I said, I should probably look into the hotplug script framework
before I yammer any further. If it allows for overriding file: disk
configs with vnd or whatever, then I'm happy.

I don't think xend is better than libxl, on the contrary, I find libxl
source to be more understandable, and much smaller. I couldn't make head
or tail of the xend source code and I'm a python programmer most of the
time. But by the same token I know I've got this blob called xend that
does these specific things, and xenbackendd that does these specific
things. There is really nothing that prevents, say, creating another
program called domu_watcher (or whatever) that is linked to libxl for
functionality sharing and gets launched by xl during xl create; but
again, this is more nitpicks and a matter of preference.

> libxl has even split OS specific routines into separate files, both for
> Linux and NetBSD, and frankly, I'm not able to see how that is worse
> than the amount of patches we had in pkgsrc to have xend working. Now
> libxl works out-of-the-box on NetBSD, and many efforts have been put
> into that.
> We should focus our efforts on getting Xen to work on
> NetBSD without a ton of NetBSD specific out of the tree patches, and
> right now my biggest concern is getting Qemu upstream working.

I noticed that even in 4.1.2, which I thought was a good start; however
those compile-time 'plugins' were not useful back then since they were
only for whether blktap is supported, to which under netbsd compilation
the answer is 'nah!'. I haven't looked into 4.2 libxl src either, so
maybe the plugin points are much more extensive now. I know enough that
creating a system with plugin support, whether at compile time or at
runtime, is not easy, because to a certain degree you have to expect the
unexpected.

If I sound like I'm just griping all of the time, it's just that I wish
I had more time right now to jump in, but I don't. So, I do a lot of
hand waving ;-).

I understand that you gotta get qemu working, but that's orthogonal to
letting us shoot ourselves in the foot ;-) I mean, have you used
netbsd's disklabel? Anyways...

Cheers,
Toby
Toby Karyadi writes ("Re: [Xen-devel] [PATCH v2 0/3] Add support for NetBSD gntdev"):
> Hopefully I don't bore you to death if you read this far, but the way
> the libxl/xl is going seems somewhat ridiculous, where they are trying
> to insert more and more policy vs functionality,

The proposal here AIUI is just about what the default should be.

We have taken the view that the admin should not /need/ to specify
explicitly which software components to assemble together to get the
block devices to work.

That doesn't mean that the admin won't be able to specify exactly
what they want (and to keep all the pieces if it doesn't work).

Ian.