Greg KH wrote:
> On Thu, Aug 03, 2006 at 12:26:16PM -0700, Zachary Amsden wrote:
>> Who said that? Please smack them on the head with a broom. We are all actively working on implementing Rusty's paravirt-ops proposal. It makes the API vs ABI discussion moot, as it allow for both.
>
> So everyone is still skirting the issue, oh great :)

I don't really think there's an issue to be skirted here. The current plan is to design and implement a paravirt_ops interface, which is a typical Linux source-level interface between the bulk of the kernel and a set of hypervisor-specific backends. Xen, VMWare and other interested parties are working together on this interface to make sure it meets everyone's needs (and if you have another hypervisor you'd like to support with this interface, we want to hear from you).

Until VMWare proposed VMI, Xen was the only hypervisor needing support, so it was reasonable that the Xen patches targeted Xen directly. But with paravirtops the result will be more flexible, since a kernel will be configurable to run on any combination of supported hypervisors or on bare hardware.

As far as I'm concerned, the issue of whether VMI has a stable ABI or not is one which falls on the VMI side of the paravirtops interface, and it doesn't have any wider implications. Certainly Xen will maintain a backwards compatible hypervisor interface for as long as we want/need to, but that's a matter for our side of paravirtops. And the paravirtops interface will change over time as the kernel does, and the backends will be adapted to match, either using the same ABI to the underlying hypervisor, or an expanded one, or whatever; it doesn't matter as far as the rest of the kernel is concerned.

There's the other question of whether VMI is a suitable interface for Xen, making the whole paravirt_ops exercise redundant. Zach and VMWare are claiming to have a VMI binding to Xen which is full featured with good performance. That's an interesting claim, and I don't doubt that it's somewhat true. However, they haven't released either code for this interface or detailed performance results, so it's hard to evaluate. And with anything in this area, it's always the details that matter: what tests, on what hardware, at what scale? Does VMI really expose all of Xen's features, or does it just use a bare-minimum subset to get things going? And how does the interface fit with short and long term design goals? I don't think anybody is willing to answer these questions with any confidence.

VMWare's initial VMI proposal was very geared towards their particular hypervisor architecture; it has been modified over time to be a little closer to Xen's model, in order to efficiently support the Xen binding. But Xen and ESX have very different designs and underlying philosophies, so I wouldn't expect a single interface to fit comfortably with both.

As far as LKML is concerned, the only interface which matters is the Linux -> <something> interface, which is defined within the scope of the Linux development process. That's what paravirt_ops is intended to be. And being a Linux API, paravirt_ops can avoid duplicating other Linux interfaces. For example, VMI, like the Xen hypervisor interface, needs various ways to deal with time.
The rest of the kernel needn't know or care about those interfaces, because the paravirt backend for each can also register a clocksource, or use other kernel APIs to expose that interface (some of which we'll probably develop/expand over time as needed, but in the normal way kernel interfaces change).

J
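To illustrate the point about reusing existing kernel facilities rather than widening paravirt_ops itself: the sketch below is a deliberately simplified, user-space model of a clocksource-style registration scheme, not the real 2.6 clocksource API. The struct layout, ratings and function names are assumptions, chosen only to show how a hypervisor backend can hide its time interface behind a facility the rest of the kernel already understands.

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-in for a clocksource-style API (not the
 * real kernel structure): the core keeps track of registered time
 * sources and picks the best-rated one.  It neither knows nor cares
 * that one of them reads hypervisor-provided time. */
struct clock_source {
	const char *name;
	int rating;                        /* higher = preferred */
	unsigned long long (*read)(void);  /* current counter value */
};

static unsigned long long read_tsc(void)
{
	return 123456789ULL;               /* would be rdtsc on bare metal */
}

static unsigned long long read_xen_time(void)
{
	return 987654321ULL;               /* would read hypervisor-shared time info */
}

static struct clock_source tsc_clock = { "tsc", 300, read_tsc };
static struct clock_source xen_clock = { "xen", 400, read_xen_time };

static struct clock_source *best;

/* Registration is the only thing the backend has to do; the generic
 * timekeeping code uses whichever source ends up as "best". */
static void clock_source_register(struct clock_source *cs)
{
	if (!best || cs->rating > best->rating)
		best = cs;
}

int main(void)
{
	clock_source_register(&tsc_clock);
	clock_source_register(&xen_clock);   /* done by the Xen backend at boot */
	printf("timekeeping uses %s: %llu\n", best->name, best->read());
	return 0;
}
```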
On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>> As far as LKML is concerned, the only interface which matters is the Linux -> <something> interface, which is defined within the scope of the Linux development process. That's what paravirt_ops is intended to be.
>
> I must confess that I still don't "get" paravirtops. AFAICT the VMI proposal, if it works, will make that whole layer simply go away. Which is attractive. If it works.

Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?

Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel. And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).

I hope that clarifies,
Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
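For readers who have not seen the patches, here is a minimal, self-contained C sketch of the ops-struct pattern Rusty is referring to. The field names and the two backends are hypothetical stand-ins, not the actual paravirt_ops layout from the patch series; the point is only that the rest of the kernel calls through one structure that is filled in once, early in boot.

```c
#include <stdio.h>

/* Hypothetical, cut-down ops table -- the real paravirt_ops in the
 * patch series has many more entries; this layout is made up purely
 * for illustration. */
struct paravirt_ops {
	const char *name;
	void (*cli)(void);                 /* disable interrupts */
	void (*sti)(void);                 /* enable interrupts */
	void (*write_cr3)(unsigned long);  /* load the page-table base */
};

/* Native backend: on real hardware these would be the raw instructions. */
static void native_cli(void) { /* asm volatile("cli") on bare metal */ }
static void native_sti(void) { /* asm volatile("sti") on bare metal */ }
static void native_write_cr3(unsigned long val)
{
	printf("native: mov cr3, %#lx\n", val);
}

/* Xen-style backend: the same operations become event-mask updates or
 * hypercalls. */
static void xen_cli(void) { printf("xen: mask events in shared page\n"); }
static void xen_sti(void) { printf("xen: unmask events in shared page\n"); }
static void xen_write_cr3(unsigned long val)
{
	printf("xen: mmu hypercall for cr3=%#lx\n", val);
}

static const struct paravirt_ops native_ops = {
	"native", native_cli, native_sti, native_write_cr3,
};
static const struct paravirt_ops xen_ops = {
	"xen", xen_cli, xen_sti, xen_write_cr3,
};

/* Filled in once at start-of-day; the rest of the kernel only ever calls
 * through this structure, never a specific backend directly. */
static struct paravirt_ops pv_ops;

int main(void)
{
	int running_on_xen = 0;   /* stand-in for boot-time detection */

	pv_ops = running_on_xen ? xen_ops : native_ops;

	printf("selected backend: %s\n", pv_ops.name);
	pv_ops.cli();
	pv_ops.write_cr3(0x1000);
	pv_ops.sti();
	return 0;
}
```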
* Andrew Morton (akpm@osdl.org) wrote:
> I must confess that I still don't "get" paravirtops. AFAICT the VMI proposal, if it works, will make that whole layer simply go away. Which is attractive. If it works.

Paravirtops is simply a table of functions which is populated by the hypervisor-specific code at start-of-day. Some care is taken to patch up callsites which are performance sensitive.

The main difference is the API vs. ABI distinction. In the paravirt_ops case, the ABI is defined at compile time from source. VMI takes it one step further and fixes the ABI. That last step is a big one.

There are two basic issues: 1) what is the interface between the kernel and the glue to a hypervisor, and 2) how does one call from the kernel into the glue layer. Getting bogged down in #2, the details of the calling convention, is a distraction from the real issue, #1. We are trying to actually find an API that is useful for multiple projects. Paravirt_ops gives the flexibility to evolve the interface.
* Antonio Vargas (windenntw@gmail.com) wrote:
> One feature I found missing at the paravirt patches is to allow the user to forbid the use of paravirtualization of certain features (via a bitmask on the kernel commandline for example) so that the execution drops into the native hardware virtualization system. Such a feature

There is no native hardware virtualization system in this picture. Maybe I'm just misunderstanding you.

> would provide a big upwards compatibility for the kernel<->hypervisor system. The case for this would be needing to forcefully upgrade the hypervisor due to security issues and finding out that the hypervisor is incompatible at the paravirtualization level, then the user would be at least capable of continuing to run the old kernel with the new hypervisor until the compatibility is reached again.

This seems a bit like a trumped-up example, as randomly disabling a part of the pv interface is likely to cause correctness issues, not just performance degradation.

Hypervisor compatibility is a slightly separate issue here. There are two interfaces. The Linux paravirt interface is internal to the kernel. The hypervisor interface is external to the kernel.

  kernel <--pv interface--> paravirt glue layer <--hv interface--> hypervisor

So changes to the hypervisor must remain ABI compatible to continue working with the same kernel. This is the same requirement the kernel has with the syscall interface it provides to userspace.

> BTW, what is the recommended distro or kernel setup to help testing the latest paravirt patches? I've got a spare machine (with no needed data) at hand which could be put to good use.

Distro of choice. Current kernel with the pv patches[1], but be forewarned, they are very early, and not fully booting.

thanks,
-chris

[1] mercurial patchqueue http://ozlabs.org/~rusty/paravirt/
On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
> On Fri, 04 Aug 2006 15:04:35 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>> On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>> Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?
>
> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.

Sorry, this is wrong. VMI was proposed as the appropriate way to connect Linux to Xen, *and* native, *and* VMWare's hypervisors (and others). This way one Linux binary can boot on all three, using different VMI blobs.

>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>
> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.

Wrong again. We've *seen* the code for VMI, and fairly hairy. Seeing the native-implementation and Xen-implementation VMI blobs will not make it less hairy!

>> And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).
>
> hm. Dunno. ABIs can be uprevved. Perhaps.

Certainly VMI can be. But I'd prefer to leave the excellent hackers at VMWare with the task of maintaining their ABI, and let Linux hackers (most of whom will run native) manipulate paravirt_ops freely.

We're not good at maintaining ABIs. We're going to be especially bad at maintaining an ABI when the 99% of us running native will never notice the breakage.

Hope that clarifies,
Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
* Antonio Vargas (windenntw@gmail.com) wrote:
> What I was referring to with "native hardware virtualization" is just the VT or Pacifica-provided trapping into the hypervisor upon executing "dangerous" instructions such as tlb-flushes, reading/setting the current ring-level, cli/sti...

We are not talking about VMX or AMDV. Just plain ol' x86 hardware.

> Yes, maybe just providing a switch to force paravirtops to use the native hardware implementation would be enough, or just in case, making the default the native hardware and allowing the kernel commandline to select another one (just like on io-schedulers)

In this case native hardware == running on bare metal w/out VMX/AMDV support and w/out any hypervisor. So, while this would let you actually boot the machine, it's probably not really useful for the case you cited (security update to hypervisor causes ABI breakage) because you'd be booting a normal kernel w/out any virtualization. IOW, all the virtual machines that were running on that physical machine would not be running.

> Yes. What I propose is allowing the systems to continue running (only with degraded performance) when the hv-interface between the running kernel and the running hypervisor doesn't match.

This is non-trivial. If the hv-interface breaks the ABI, then you'd need to update the pv-glue layer in the kernel.

>>> BTW, what is the recommended distro or kernel setup to help testing the latest paravirt patches? I've got a spare machine (with no needed data) at hand which could be put to good use.
>>
>> Distro of choice. Current kernel with the pv patches[1], but be forewarned, they are very early, and not fully booting.
>
> Thanks, will be setting it up :)

Thanks for helping.
-chris
On Fri, 2006-08-04 at 00:21 -0700, Andrew Morton wrote:
> On Fri, 04 Aug 2006 17:04:59 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>> On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
>>> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.
>>
>> Sorry, this is wrong.
>
> It's actually 100% correct.

Err, yes. I actually misrepresented VMI: the native implementation is inline (ie. no blob is required for native). Bad Rusty.

>>>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>>>
>>> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.
>>
>> Wrong again.
>
> I was referring to the VMI-for-Xen code.

I know. And I repeat, we don't have to see that part, to know that the result is less clear, less maintainable and gratuitously different from elsewhere in the kernel than the paravirt_ops approach. We've seen paravirt and the VMI parts of this already.

>> We've *seen* the code for VMI, and fairly hairy.
>
> I probably slept through that discussion - I don't recall that things were that bad. Do you recall the Subject: or date?

Read the patches which Zach sent back in March, particularly:

  [RFC, PATCH 3/24] i386 Vmi interface definition
  [RFC, PATCH 4/24] i386 Vmi inline implementation
  [RFC, PATCH 5/24] i386 Vmi code patching

If you want to hack on x86 arch code, you'd need to understand these. Then to see the paravirt patches go to http://ozlabs.org/~rusty/paravirt and look at the approximately-equivalent paravirt_ops patches:

  008-paravirt-structure.patch
  009-binary-patch.patch

There's nothing in those paravirt_ops patches which will surprise any kernel hacker. That's my entire point: maintainable, unsurprising, clear.

Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
On Fri, 4 Aug 2006, Rusty Russell wrote:
> On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
>> On Fri, 04 Aug 2006 15:04:35 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>>> On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>>> Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?
>>
>> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.
>
> Sorry, this is wrong. VMI was proposed as the appropriate way to connect Linux to Xen, *and* native, *and* VMWare's hypervisors (and others). This way one Linux binary can boot on all three, using different VMI blobs.
>
>>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>>
>> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.
>
> Wrong again. We've *seen* the code for VMI, and fairly hairy. Seeing the native-implementation and Xen-implementation VMI blobs will not make it less hairy!
>
>>> And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).
>>
>> hm. Dunno. ABIs can be uprevved. Perhaps.
>
> Certainly VMI can be. But I'd prefer to leave the excellent hackers at VMWare with the task of maintaining their ABI, and let Linux hackers (most of whom will run native) manipulate paravirt_ops freely.
>
> We're not good at maintaining ABIs. We're going to be especially bad at maintaining an ABI when the 99% of us running native will never notice the breakage.

some questions from a user. please point out where I am misunderstanding things.

one of the big uses of virtualization will be to run things in sandboxes. when people do this they typically migrate the sandbox from system to system over time (working with chroot sandboxes I've seen some HUGE skews between what's running in the sandbox and what's running in the host). If the interface between the guest kernel and the hypervisor isn't fixed, how could someone run a 2.6.19 guest and a 2.6.30 guest at the same time?

if it's only a source-level API, this implies that when you move your host kernel from 2.6.19 to 2.6.25 you would need to recompile your 2.6.19 guest kernel to support the modifications. where are the patches going to come from to do this?

It seems to me from reading this thread that PowerPC and S390 have an ABI defined, specifically defined by the hardware in the case of PowerPC and by the externally maintained, Linux-independent hypervisor (which is effectively the hardware) in the case of the s390.

If there's going to be long-term compatibility between different hosts and guests there need to be some limits to what can change.

needing to uprev the host when you uprev a guest is acceptable.

needing to uprev a guest when you uprev a host is not.

this basically boils down to 'once you expose an interface to a user it can't change', with the interface that's being exposed being the calls that the guest makes to the host.

David Lang
David Lang wrote:
> if it's only a source-level API, this implies that when you move your host kernel from 2.6.19 to 2.6.25 you would need to recompile your 2.6.19 guest kernel to support the modifications. where are the patches going to come from to do this?

No, the low-level interface between the kernel and the hypervisor is an ABI, which will be as stable as your hypervisor author/vendor wants it to be (which is generally "very stable"). The question is whether that low-level interface is exposed to the rest of the kernel directly, or hidden behind a kernel-internal source-level API.

J
On 8/4/06, David Lang <dlang@digitalinsight.com> wrote:
> On Fri, 4 Aug 2006, Rusty Russell wrote:
>> On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
>>> On Fri, 04 Aug 2006 15:04:35 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>>>> On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>>>> Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?
>>>
>>> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.
>>
>> Sorry, this is wrong. VMI was proposed as the appropriate way to connect Linux to Xen, *and* native, *and* VMWare's hypervisors (and others). This way one Linux binary can boot on all three, using different VMI blobs.
>>
>>>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>>>
>>> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.
>>
>> Wrong again. We've *seen* the code for VMI, and fairly hairy. Seeing the native-implementation and Xen-implementation VMI blobs will not make it less hairy!
>>
>>>> And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).
>>>
>>> hm. Dunno. ABIs can be uprevved. Perhaps.
>>
>> Certainly VMI can be. But I'd prefer to leave the excellent hackers at VMWare with the task of maintaining their ABI, and let Linux hackers (most of whom will run native) manipulate paravirt_ops freely.
>>
>> We're not good at maintaining ABIs. We're going to be especially bad at maintaining an ABI when the 99% of us running native will never notice the breakage.
>
> some questions from a user. please point out where I am misunderstanding things.

asking is the smart way :)

> one of the big uses of virtualization will be to run things in sandboxes. when people do this they typically migrate the sandbox from system to system over time (working with chroot sandboxes I've seen some HUGE skews between what's running in the sandbox and what's running in the host). If the interface between the guest kernel and the hypervisor isn't fixed, how could someone run a 2.6.19 guest and a 2.6.30 guest at the same time?
>
> if it's only a source-level API, this implies that when you move your host kernel from 2.6.19 to 2.6.25 you would need to recompile your 2.6.19 guest kernel to support the modifications. where are the patches going to come from to do this?
>
> It seems to me from reading this thread that PowerPC and S390 have an ABI defined, specifically defined by the hardware in the case of PowerPC and by the externally maintained, Linux-independent hypervisor (which is effectively the hardware) in the case of the s390.

the trick with ppc, s390, m68k... is that they were defined since day zero (*simplifies 68000/68010 history here*) so that when you run as a non-privileged task and try to execute a privileged instruction, the hardware traps and the OS gets control. x86 wasn't, since it had some instructions that let non-privileged code detect that it was not running privileged, thus barring any way for a hypervisor to remain invisible.
this is solved on x86 and x86_64 with the new extensions.

> If there's going to be long-term compatibility between different hosts and guests there need to be some limits to what can change.
>
> needing to uprev the host when you uprev a guest is acceptable.
>
> needing to uprev a guest when you uprev a host is not.

Now, allowing this transparent acting is great since you can run your normal kernel as-is as a guest. But to get close to 100% speed, what you do is rewrite parts of the OS to be aware of the hypervisor, and establish a common way to talk. Thus happens the work with the paravirt-ops. Just like you can use any filesystem under Linux because they have a well-defined interface to the rest of the kernel, the paravirt-ops are the way we are working to define an interface so that the rest of the kernel can be ignorant of whether it's running on the bare metal or as a guest.

Then, if you needed to run say 2.6.19 with hypervisor A-1.0, you just need to write paravirt-ops which talk and translate between 2.6.19 and A-1.0. If 5 years later you are still running A-1.0 and want to run a 2.6.28 guest, then you would just need to write the paravirt-ops between 2.6.28 and A-1.0, with no need to modify the rest of the code or the hypervisor.

At the moment we only have 1 GPL hypervisor and 1 binary one. Then maybe it's needed to define if Linux should help run under binary hypervisors, but imagine instead of this one, we had the usual Ghyper vs Khyper separation. We would prefer to give the same adaptations to both of them and abstract them away just like we do with filesystems.

> this basically boils down to 'once you expose an interface to a user it can't change', with the interface that's being exposed being the calls that the guest makes to the host.

Yes, that's the reason some mentioned ppc, sparc, s390... because they have been doing this longer than us and we could consider adopting some of their designs (just like we did for POSIX system calls ;)

> David Lang

--
Greetz, Antonio Vargas aka winden of network
On Fri, 4 Aug 2006, Antonio Vargas wrote:
>> If there's going to be long-term compatibility between different hosts and guests there need to be some limits to what can change.
>>
>> needing to uprev the host when you uprev a guest is acceptable.
>>
>> needing to uprev a guest when you uprev a host is not.
>
> Now, allowing this transparent acting is great since you can run your normal kernel as-is as a guest. But to get close to 100% speed, what you do is rewrite parts of the OS to be aware of the hypervisor, and establish a common way to talk.

I understand this, but for example a UML 2.6.10 kernel will continue to run unmodified on top of a 2.6.17 kernel; the ABI used is stable. however, if you have a 2.6.10 host with a 2.6.10 UML guest and want to run a 2.6.17 guest you may (but not necessarily must) have to upgrade the host to 2.6.17 or later.

> Thus happens the work with the paravirt-ops. Just like you can use any filesystem under Linux because they have a well-defined interface to the rest of the kernel, the paravirt-ops are the way we are working to define an interface so that the rest of the kernel can be ignorant of whether it's running on the bare metal or as a guest.
>
> Then, if you needed to run say 2.6.19 with hypervisor A-1.0, you just need to write paravirt-ops which talk and translate between 2.6.19 and A-1.0. If 5 years later you are still running A-1.0 and want to run a 2.6.28 guest, then you would just need to write the paravirt-ops between 2.6.28 and A-1.0, with no need to modify the rest of the code or the hypervisor.

who is going to be writing all these interface layers to connect each kernel version to each hypervisor version? and please note, I am not just considering Xen and vmware as hypervisors; a vanilla Linux kernel is the hypervisor for UML. so just stating that the hypervisor maintainers need to do this implies that the kernel maintainers would be required to do this.

also I'm looking at the more likely case that 5 years from now you may still be running 2.6.19, but need to upgrade to hypervisor A-5.8 (to support a different client). you don't want to have to try and recompile the 2.6.19 kernel to keep using it.

> At the moment we only have 1 GPL hypervisor and 1 binary one. Then maybe it's needed to define if Linux should help run under binary hypervisors, but imagine instead of this one, we had the usual Ghyper vs Khyper separation. We would prefer to give the same adaptations to both of them and abstract them away just like we do with filesystems.

you have three hypervisors that I know of: Linux, Xen (multiple versions), and VMware, each with (mostly) incompatible guests.

>> this basically boils down to 'once you expose an interface to a user it can't change', with the interface that's being exposed being the calls that the guest makes to the host.
>
> Yes, that's the reason some mentioned ppc, sparc, s390... because they have been doing this longer than us and we could consider adopting some of their designs (just like we did for POSIX system calls ;)

I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.

David Lang
David Lang wrote:
> I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.

you are right that the interface to the HV should be stable. But those are going to be specific to the HV; the paravirt_ops allows the kernel to smoothly deal with having different HVs. So in a way it's an API interface to allow the kernel to deal with multiple different ABIs that exist today and will exist in the future.
On Fri, Aug 04, 2006 at 12:06:28PM -0700, David Lang wrote:
> I understand this, but for example a UML 2.6.10 kernel will continue to run unmodified on top of a 2.6.17 kernel; the ABI used is stable. however, if you have a 2.6.10 host with a 2.6.10 UML guest and want to run a 2.6.17 guest you may (but not necessarily must) have to upgrade the host to 2.6.17 or later.

Why might you have to do that?

Jeff
On Fri, 4 Aug 2006, Arjan van de Ven wrote:
> David Lang wrote:
>> I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.
>
> you are right that the interface to the HV should be stable. But those are going to be specific to the HV; the paravirt_ops allows the kernel to smoothly deal with having different HVs. So in a way it's an API interface to allow the kernel to deal with multiple different ABIs that exist today and will exist in the future.

so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.

where A could be bare hardware, B could be Xen 2, C could be Xen 3, D could be vmware, E could be vanilla Linux, etc.

this sounds like something that the distros would not support; they would pick their one hypervisor to support and leave out the others. the big problem with this is that the preferred hypervisor will change over time and people will be left with incompatible choices (or having to compile their own kernels, including having to recompile older kernels to support newer hypervisors).

David Lang
On Fri, 4 Aug 2006, Jeff Dike wrote:
> On Fri, Aug 04, 2006 at 12:06:28PM -0700, David Lang wrote:
>> I understand this, but for example a UML 2.6.10 kernel will continue to run unmodified on top of a 2.6.17 kernel; the ABI used is stable. however, if you have a 2.6.10 host with a 2.6.10 UML guest and want to run a 2.6.17 guest you may (but not necessarily must) have to upgrade the host to 2.6.17 or later.
>
> Why might you have to do that?

take this with a grain of salt; I'm not saying the particular versions I'm listing would require this.

if your new guest kernel wants to use some new feature (SKAS3, time virtualization, etc) but the older host kernel didn't support some system call necessary to implement it, you may need to upgrade the host kernel to one that provides the new features.

David Lang
David Lang wrote:
> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.
>
> where A could be bare hardware, B could be Xen 2, C could be Xen 3, D could be vmware, E could be vanilla Linux, etc.

Yes, but you can compile one kernel for any set of hypervisors, so if you want both Xen and VMI, then compile both in. (You always get bare hardware support.)

> this sounds like something that the distros would not support; they would pick their one hypervisor to support and leave out the others. the big problem with this is that the preferred hypervisor will change over time and people will be left with incompatible choices (or having to compile their own kernels, including having to recompile older kernels to support newer hypervisors).

Why? That's like saying that distros will only bother to compile in one scsi driver.

The hypervisor driver is trickier than a normal kernel device driver, because in general it needs to be present from very early in boot, which precludes it from being a normal module. There's hope that we'll be able to support hypervisor drivers as boot-time grub/multiboot modules, so you'll be able to compile up a new hypervisor driver for a particular kernel and use it without recompiling the whole thing.

J
On Fri, 4 Aug 2006, Jeremy Fitzhardinge wrote:
>> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.
>>
>> where A could be bare hardware, B could be Xen 2, C could be Xen 3, D could be vmware, E could be vanilla Linux, etc.
>
> Yes, but you can compile one kernel for any set of hypervisors, so if you want both Xen and VMI, then compile both in. (You always get bare hardware support.)

how can I compile in support for Xen4 on my 2.6.18 kernel? after all, xen 2 and xen3 are incompatible hypervisors, so why wouldn't xen4 be (and I realize there is no xen4 yet, but there is likely to be one during the time virtual servers created with 2.6.18 are still running)?

>> this sounds like something that the distros would not support; they would pick their one hypervisor to support and leave out the others. the big problem with this is that the preferred hypervisor will change over time and people will be left with incompatible choices (or having to compile their own kernels, including having to recompile older kernels to support newer hypervisors).
>
> Why? That's like saying that distros will only bother to compile in one scsi driver.
>
> The hypervisor driver is trickier than a normal kernel device driver, because in general it needs to be present from very early in boot, which precludes it from being a normal module. There's hope that we'll be able to support hypervisor drivers as boot-time grub/multiboot modules, so you'll be able to compile up a new hypervisor driver for a particular kernel and use it without recompiling the whole thing.

distros don't offer kernels with all options today, why would they in the future? (how many distros offer separate 486/586/K6/K7/Pentium/P2/P3/P4 kernels? none. they offer a least-common-denominator kernel or two instead)

I am also missing something here. how can a system be compiled to do several different things for the same privileged opcode (including running that opcode) without turning that area of code into a performance pig as it checks for each possible hypervisor being present?

David Lang
David Lang wrote:
> how can I compile in support for Xen4 on my 2.6.18 kernel? after all, xen 2 and xen3 are incompatible hypervisors, so why wouldn't xen4 be (and I realize there is no xen4 yet, but there is likely to be one during the time virtual servers created with 2.6.18 are still running)?

Firstly, backwards compatibility is very important; I would guess that if there were a Xen4 ABI, the hypervisor would still support Xen3 for some time. Secondly, if someone goes to the effort of backporting a Xen4 paravirtops driver for 2.6.18, then you could compile it in.

> I am also missing something here. how can a system be compiled to do several different things for the same privileged opcode (including running that opcode) without turning that area of code into a performance pig as it checks for each possible hypervisor being present?

Conceptually, the paravirtops structure is a structure of pointers to functions which get filled in at runtime to support whatever hypervisor we're running over. But it also has the means to patch inline versions of the appropriate code sequences for performance-critical operations.

J
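To make the second half of that answer concrete: below is a rough, user-space model of the call-site patching bookkeeping, with all names, addresses and op lists made up for illustration; it is not the code from the patches. The idea is that annotated call sites are recorded in a table at build time and, during early boot, the chosen backend's instructions are written over the performance-critical ones. Here the "patching" is only recorded as text, since the point is the data flow rather than the machine-code surgery.

```c
#include <stdio.h>

/* Illustrative model of the binary-patching step (all names hypothetical).
 * Every annotated call site is recorded in a table; at boot, sites for
 * performance-critical ops are overwritten in place with either the
 * native instructions or a hypervisor-specific sequence, instead of
 * being left as indirect calls through the ops table. */

enum pv_op { PV_OP_CLI, PV_OP_STI, PV_OP_NR };

struct patch_site {
	enum pv_op op;        /* which operation this call site performs */
	unsigned long addr;   /* address of the call instruction */
	unsigned int len;     /* bytes available to patch at the site */
	char patched[32];     /* what ended up at the site (mnemonic here) */
};

/* Per-backend replacement text for each op; NULL means "leave the
 * indirect call through the ops structure in place". */
static const char *native_inline[PV_OP_NR] = { "cli", "sti" };
static const char *xen_inline[PV_OP_NR]    = { NULL,  NULL };  /* stays a call */

static struct patch_site sites[] = {
	{ PV_OP_CLI, 0xc0101000UL, 7, "" },
	{ PV_OP_STI, 0xc0102340UL, 7, "" },
};

static void apply_patches(const char *inline_ops[PV_OP_NR])
{
	for (size_t i = 0; i < sizeof(sites) / sizeof(sites[0]); i++) {
		const char *repl = inline_ops[sites[i].op];

		/* The kernel would memcpy real opcodes over the call
		 * instruction and pad with nops; here we only record
		 * which choice was made. */
		snprintf(sites[i].patched, sizeof(sites[i].patched),
		         "%s", repl ? repl : "call *paravirt_ops.op");
	}
}

int main(void)
{
	apply_patches(native_inline);   /* or xen_inline when under Xen */
	for (size_t i = 0; i < sizeof(sites) / sizeof(sites[0]); i++)
		printf("site %#lx -> %s\n", sites[i].addr, sites[i].patched);
	return 0;
}
```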
On Fri, Aug 04, 2006 at 02:26:20PM -0700, Jeremy Fitzhardinge wrote:
>> I am also missing something here. how can a system be compiled to do several different things for the same privileged opcode (including running that opcode) without turning that area of code into a performance pig as it checks for each possible hypervisor being present?
>
> Conceptually, the paravirtops structure is a structure of pointers to functions which get filled in at runtime to support whatever hypervisor we're running over. But it also has the means to patch inline versions of the appropriate code sequences for performance-critical operations.

Perhaps Ulrich and Jakub should join this discussion, as the whole thing sounds like a rehash of the userland ld.so + glibc versioned ABI. glibc has weathered 64-bit LFS changes to open(), SYSENTER, and vdso.

Isn't this discussion entirely analogous (except for the patching of performance-critical sections, perhaps) to taking a binary compiled against glibc-2.0 back on Linux-2.2 and running it on glibc-2.4 + 2.6.17? Or OpenSolaris, for that matter?

Bill Rugolsky
On Fri, Aug 04, 2006 at 12:49:13PM -0700, David Lang wrote:
>> Why might you have to do that?
>
> take this with a grain of salt; I'm not saying the particular versions I'm listing would require this.
>
> if your new guest kernel wants to use some new feature (SKAS3, time virtualization, etc) but the older host kernel didn't support some system call necessary to implement it, you may need to upgrade the host kernel to one that provides the new features.

OK, yeah.

Just making sure you weren't thinking that the UML and host versions were tied together (although a modern distro won't boot on a 2.6 UML on a 2.4 host because UML's TLS needs TLS support on the host...).

Jeff
David Lang wrote:
> On Fri, 4 Aug 2006, Arjan van de Ven wrote:
>> David Lang wrote:
>>> I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.
>>
>> you are right that the interface to the HV should be stable. But those are going to be specific to the HV; the paravirt_ops allows the kernel to smoothly deal with having different HVs. So in a way it's an API interface to allow the kernel to deal with multiple different ABIs that exist today and will exist in the future.
>
> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.

no, the actual implementation of the operation structure is dynamic and can be picked at runtime, so you can compile a kernel for A, B *and* C and at runtime the kernel picks the one you have.
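A sketch of how that runtime pick could look, again as an illustrative user-space model rather than the actual code from the patches: each compiled-in backend contributes a detection hook, and early boot walks the list, falling back to bare hardware. The backend names and the detection methods mentioned in the comments are assumptions.

```c
#include <stdio.h>

/* Hypothetical probe loop (not the actual code from the patches): each
 * compiled-in backend supplies a detect() hook; early boot walks the
 * list and the first backend that recognises its hypervisor installs
 * its ops.  Bare hardware is the fallback that always matches. */
struct pv_backend {
	const char *name;
	int (*detect)(void);          /* nonzero if this hypervisor is present */
	void (*install_ops)(void);    /* fill in the shared ops structure */
};

static int xen_detect(void)      { return 0; }  /* e.g. look for Xen's start info */
static int vmi_detect(void)      { return 0; }  /* e.g. scan for a VMI ROM */
static int native_detect(void)   { return 1; }  /* always matches, tried last */

static void xen_install(void)    { printf("installing Xen ops\n"); }
static void vmi_install(void)    { printf("installing VMI ops\n"); }
static void native_install(void) { printf("installing native ops\n"); }

static struct pv_backend backends[] = {
	{ "xen",    xen_detect,    xen_install    },
	{ "vmi",    vmi_detect,    vmi_install    },
	{ "native", native_detect, native_install },
};

int main(void)
{
	for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
		if (backends[i].detect()) {
			printf("running on: %s\n", backends[i].name);
			backends[i].install_ops();
			break;
		}
	}
	return 0;
}
```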
On Fri, 4 Aug 2006, Jeff Dike wrote:
> On Fri, Aug 04, 2006 at 12:49:13PM -0700, David Lang wrote:
>>> Why might you have to do that?
>>
>> take this with a grain of salt; I'm not saying the particular versions I'm listing would require this.
>>
>> if your new guest kernel wants to use some new feature (SKAS3, time virtualization, etc) but the older host kernel didn't support some system call necessary to implement it, you may need to upgrade the host kernel to one that provides the new features.
>
> OK, yeah.
>
> Just making sure you weren't thinking that the UML and host versions were tied together (although a modern distro won't boot on a 2.6 UML on a 2.4 host because UML's TLS needs TLS support on the host...).

this is exactly the type of thing that I think is acceptable. this is a case of a new client needing a new host. if you have a server running a bunch of 2.4 UMLs on a 2.4 host and want to add a 2.6 UML you can do it, because you can shift to a bunch of 2.4 UMLs (plus one 2.6 UML) running on a 2.6 host.

what I would be bothered by is if you weren't able to run a 2.4 UML on a 2.6 host because you had locked out the upgrade path.

Everyone needs to remember that this sort of thing does happen: Xen2 clients cannot run on a Xen3 host.

David Lang
On Fri, 4 Aug 2006, Arjan van de Ven wrote:
>> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.
>
> no, the actual implementation of the operation structure is dynamic and can be picked at runtime, so you can compile a kernel for A, B *and* C and at runtime the kernel picks the one you have.

Ok, I was under the impression that this sort of thing was frowned upon for hotpath items (which I understand a good chunk of this would be).

this still leaves the question of old clients on new hypervisors that is continuing in other branches of this thread.

David Lang