Greg KH wrote:
> On Thu, Aug 03, 2006 at 12:26:16PM -0700, Zachary Amsden wrote:
>> Who said that? Please smack them on the head with a broom. We are all actively working on implementing Rusty's paravirt-ops proposal. It makes the API vs ABI discussion moot, as it allow for both.
>
> So everyone is still skirting the issue, oh great :)

I don't really think there's an issue to be skirted here. The current plan is to design and implement a paravirt_ops interface, which is a typical Linux source-level interface between the bulk of the kernel and a set of hypervisor-specific backends. Xen, VMWare and other interested parties are working together on this interface to make sure it meets everyone's needs (and if you have another hypervisor you'd like to support with this interface, we want to hear from you).

Until VMWare proposed VMI, Xen was the only hypervisor needing support, so it was reasonable that the Xen patches targeted Xen directly. But with paravirtops the result will be more flexible, since a kernel will be configurable to run on any combination of supported hypervisors or on bare hardware.

As far as I'm concerned, the issue of whether VMI has a stable ABI or not is one which falls on the VMI side of the paravirtops interface, and it doesn't have any wider implications. Certainly Xen will maintain a backwards compatible hypervisor interface for as long as we want/need to, but that's a matter for our side of paravirtops. And the paravirtops interface will change over time as the kernel does, and the backends will be adapted to match, either using the same ABI to the underlying hypervisor, or an expanded one, or whatever; it doesn't matter as far as the rest of the kernel is concerned.

There's the other question of whether VMI is a suitable interface for Xen, making the whole paravirt_ops exercise redundant. Zach and VMWare are claiming to have a VMI binding to Xen which is full featured with good performance. That's an interesting claim, and I don't doubt that it's somewhat true. However, they haven't released either code for this interface or detailed performance results, so it's hard to evaluate. And with anything in this area, it's always the details that matter: what tests, on what hardware, at what scale? Does VMI really expose all of Xen's features, or does it just use a bare-minimum subset to get things going? And how does the interface fit with short and long term design goals? I don't think anybody is willing to answer these questions with any confidence.

VMWare's initial VMI proposal was very geared towards their particular hypervisor architecture; it has been modified over time to be a little closer to Xen's model, in order to efficiently support the Xen binding. But Xen and ESX have very different designs and underlying philosophies, so I wouldn't expect a single interface to fit comfortably with both.

As far as LKML is concerned, the only interface which matters is the Linux -> <something> interface, which is defined within the scope of the Linux development process. That's what paravirt_ops is intended to be. And being a Linux API, paravirt_ops can avoid duplicating other Linux interfaces. For example, VMI, like the Xen hypervisor interface, needs various ways to deal with time.
The rest of the kernel needn't know or care about those interfaces, because the paravirt backend for each can also register a clocksource, or use other kernel APIs to expose that interface (some of which we'll probably develop/expand over time as needed, but in the normal way kernel interfaces change).

J
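To illustrate the point about reusing existing kernel facilities rather than widening paravirt_ops itself: the sketch below is a deliberately simplified, user-space model of a clocksource-style registration scheme, not the real 2.6 clocksource API. The struct layout, ratings and function names are assumptions, chosen only to show how a hypervisor backend can hide its time interface behind a facility the rest of the kernel already understands.

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-in for a clocksource-style API (not the
 * real kernel structure): the core keeps track of registered time
 * sources and picks the best-rated one.  It neither knows nor cares
 * that one of them reads hypervisor-provided time. */
struct clock_source {
	const char *name;
	int rating;                        /* higher = preferred */
	unsigned long long (*read)(void);  /* current counter value */
};

static unsigned long long read_tsc(void)
{
	return 123456789ULL;               /* would be rdtsc on bare metal */
}

static unsigned long long read_xen_time(void)
{
	return 987654321ULL;               /* would read hypervisor-shared time info */
}

static struct clock_source tsc_clock = { "tsc", 300, read_tsc };
static struct clock_source xen_clock = { "xen", 400, read_xen_time };

static struct clock_source *best;

/* Registration is the only thing the backend has to do; the generic
 * timekeeping code uses whichever source ends up as "best". */
static void clock_source_register(struct clock_source *cs)
{
	if (!best || cs->rating > best->rating)
		best = cs;
}

int main(void)
{
	clock_source_register(&tsc_clock);
	clock_source_register(&xen_clock);   /* done by the Xen backend at boot */
	printf("timekeeping uses %s: %llu\n", best->name, best->read());
	return 0;
}
```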
On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>> As far as LKML is concerned, the only interface which matters is the Linux -> <something> interface, which is defined within the scope of the Linux development process. That's what paravirt_ops is intended to be.
>
> I must confess that I still don't "get" paravirtops. AFAICT the VMI proposal, if it works, will make that whole layer simply go away. Which is attractive. If it works.

Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?

Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel. And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).

I hope that clarifies,
Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
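For readers who have not seen the patches, here is a minimal, self-contained C sketch of the ops-struct pattern Rusty is referring to. The field names and the two backends are hypothetical stand-ins, not the actual paravirt_ops layout from the patch series; the point is only that the rest of the kernel calls through one structure that is filled in once, early in boot.

```c
#include <stdio.h>

/* Hypothetical, cut-down ops table -- the real paravirt_ops in the
 * patch series has many more entries; this layout is made up purely
 * for illustration. */
struct paravirt_ops {
	const char *name;
	void (*cli)(void);                 /* disable interrupts */
	void (*sti)(void);                 /* enable interrupts */
	void (*write_cr3)(unsigned long);  /* load the page-table base */
};

/* Native backend: on real hardware these would be the raw instructions. */
static void native_cli(void) { /* asm volatile("cli") on bare metal */ }
static void native_sti(void) { /* asm volatile("sti") on bare metal */ }
static void native_write_cr3(unsigned long val)
{
	printf("native: mov cr3, %#lx\n", val);
}

/* Xen-style backend: the same operations become event-mask updates or
 * hypercalls. */
static void xen_cli(void) { printf("xen: mask events in shared page\n"); }
static void xen_sti(void) { printf("xen: unmask events in shared page\n"); }
static void xen_write_cr3(unsigned long val)
{
	printf("xen: mmu hypercall for cr3=%#lx\n", val);
}

static const struct paravirt_ops native_ops = {
	"native", native_cli, native_sti, native_write_cr3,
};
static const struct paravirt_ops xen_ops = {
	"xen", xen_cli, xen_sti, xen_write_cr3,
};

/* Filled in once at start-of-day; the rest of the kernel only ever calls
 * through this structure, never a specific backend directly. */
static struct paravirt_ops pv_ops;

int main(void)
{
	int running_on_xen = 0;   /* stand-in for boot-time detection */

	pv_ops = running_on_xen ? xen_ops : native_ops;

	printf("selected backend: %s\n", pv_ops.name);
	pv_ops.cli();
	pv_ops.write_cr3(0x1000);
	pv_ops.sti();
	return 0;
}
```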
* Andrew Morton (akpm@osdl.org) wrote:
> I must confess that I still don't "get" paravirtops. AFAICT the VMI proposal, if it works, will make that whole layer simply go away. Which is attractive. If it works.

Paravirtops is simply a table of functions which is populated by the hypervisor-specific code at start-of-day. Some care is taken to patch up callsites which are performance sensitive.

The main difference is the API vs. ABI distinction. In the paravirt_ops case, the ABI is defined at compile time from source. VMI takes it one step further and fixes the ABI. That last step is a big one.

There are two basic issues: 1) what is the interface between the kernel and the glue to a hypervisor, and 2) how does one call from the kernel into the glue layer. Getting bogged down in #2, the details of the calling convention, is a distraction from the real issue, #1. We are trying to actually find an API that is useful for multiple projects. Paravirt_ops gives the flexibility to evolve the interface.
* Antonio Vargas (windenntw@gmail.com) wrote:
> One feature I found missing at the paravirt patches is to allow the user to forbid the use of paravirtualization of certain features (via a bitmask on the kernel commandline for example) so that the execution drops into the native hardware virtualization system. Such a feature

There is no native hardware virtualization system in this picture. Maybe I'm just misunderstanding you.

> would provide a big upwards compatibility for the kernel<->hypervisor system. The case for this would be needing to forcefully upgrade the hypervisor due to security issues and finding out that the hypervisor is incompatible at the paravirtualization level, then the user would be at least capable of continuing to run the old kernel with the new hypervisor until the compatibility is reached again.

This seems a bit like a trumped-up example, as randomly disabling a part of the pv interface is likely to cause correctness issues, not just performance degradation.

Hypervisor compatibility is a slightly separate issue here. There are two interfaces. The Linux paravirt interface is internal to the kernel. The hypervisor interface is external to the kernel.

  kernel <--pv interface--> paravirt glue layer <--hv interface--> hypervisor

So changes to the hypervisor must remain ABI compatible to continue working with the same kernel. This is the same requirement the kernel has with the syscall interface it provides to userspace.

> BTW, what is the recommended distro or kernel setup to help testing the latest paravirt patches? I've got a spare machine (with no needed data) at hand which could be put to good use.

Distro of choice. Current kernel with the pv patches[1], but be forewarned, they are very early, and not fully booting.

thanks,
-chris

[1] mercurial patchqueue http://ozlabs.org/~rusty/paravirt/
On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
> On Fri, 04 Aug 2006 15:04:35 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>> On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>> Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?
>
> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.

Sorry, this is wrong. VMI was proposed as the appropriate way to connect Linux to Xen, *and* native, *and* VMWare's hypervisors (and others). This way one Linux binary can boot on all three, using different VMI blobs.

>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>
> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.

Wrong again. We've *seen* the code for VMI, and fairly hairy. Seeing the native-implementation and Xen-implementation VMI blobs will not make it less hairy!

>> And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).
>
> hm. Dunno. ABIs can be uprevved. Perhaps.

Certainly VMI can be. But I'd prefer to leave the excellent hackers at VMWare with the task of maintaining their ABI, and let Linux hackers (most of whom will run native) manipulate paravirt_ops freely.

We're not good at maintaining ABIs. We're going to be especially bad at maintaining an ABI when the 99% of us running native will never notice the breakage.

Hope that clarifies,
Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
* Antonio Vargas (windenntw@gmail.com) wrote:
> What I was referring to with "native hardware virtualization" is just the VT or Pacifica-provided trapping into the hypervisor upon executing "dangerous" instructions such as tlb-flushes, reading/setting the current ring-level, cli/sti...

We are not talking about VMX or AMDV. Just plain ol' x86 hardware.

> Yes, maybe just providing a switch to force paravirtops to use the native hardware implementation would be enough, or just in case, making the default the native hardware and allowing the kernel commandline to select another one (just like on io-schedulers)

In this case native hardware == running on bare metal w/out VMX/AMDV support and w/out any hypervisor. So, while this would let you actually boot the machine, it's probably not really useful for the case you cited (security update to hypervisor causes ABI breakage) because you'd be booting a normal kernel w/out any virtualization. IOW, all the virtual machines that were running on that physical machine would not be running.

> Yes. What I propose is allowing the systems to continue running (only with degraded performance) when the hv-interface between the running kernel and the running hypervisor doesn't match.

This is non-trivial. If the hv-interface breaks the ABI, then you'd need to update the pv-glue layer in the kernel.

>>> BTW, what is the recommended distro or kernel setup to help testing the latest paravirt patches? I've got a spare machine (with no needed data) at hand which could be put to good use.
>>
>> Distro of choice. Current kernel with the pv patches[1], but be forewarned, they are very early, and not fully booting.
>
> Thanks, will be setting it up :)

Thanks for helping.
-chris
On Fri, 2006-08-04 at 00:21 -0700, Andrew Morton wrote:
> On Fri, 04 Aug 2006 17:04:59 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>> On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
>>> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.
>>
>> Sorry, this is wrong.
>
> It's actually 100% correct.

Err, yes. I actually misrepresented VMI: the native implementation is inline (ie. no blob is required for native). Bad Rusty.

>>>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>>>
>>> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.
>>
>> Wrong again.
>
> I was referring to the VMI-for-Xen code.

I know. And I repeat, we don't have to see that part, to know that the result is less clear, less maintainable and gratuitously different from elsewhere in the kernel than the paravirt_ops approach. We've seen paravirt and the VMI parts of this already.

>> We've *seen* the code for VMI, and fairly hairy.
>
> I probably slept through that discussion - I don't recall that things were that bad. Do you recall the Subject: or date?

Read the patches which Zach sent back in March, particularly:

  [RFC, PATCH 3/24] i386 Vmi interface definition
  [RFC, PATCH 4/24] i386 Vmi inline implementation
  [RFC, PATCH 5/24] i386 Vmi code patching

If you want to hack on x86 arch code, you'd need to understand these. Then to see the paravirt patches go to http://ozlabs.org/~rusty/paravirt and look at the approximately-equivalent paravirt_ops patches:

  008-paravirt-structure.patch
  009-binary-patch.patch

There's nothing in those paravirt_ops patches which will surprise any kernel hacker. That's my entire point: maintainable, unsurprising, clear.

Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
On Fri, 4 Aug 2006, Rusty Russell wrote:
> On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
>> On Fri, 04 Aug 2006 15:04:35 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>>> On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>>> Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?
>>
>> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.
>
> Sorry, this is wrong. VMI was proposed as the appropriate way to connect Linux to Xen, *and* native, *and* VMWare's hypervisors (and others). This way one Linux binary can boot on all three, using different VMI blobs.
>
>>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>>
>> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.
>
> Wrong again. We've *seen* the code for VMI, and fairly hairy. Seeing the native-implementation and Xen-implementation VMI blobs will not make it less hairy!
>
>>> And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).
>>
>> hm. Dunno. ABIs can be uprevved. Perhaps.
>
> Certainly VMI can be. But I'd prefer to leave the excellent hackers at VMWare with the task of maintaining their ABI, and let Linux hackers (most of whom will run native) manipulate paravirt_ops freely.
>
> We're not good at maintaining ABIs. We're going to be especially bad at maintaining an ABI when the 99% of us running native will never notice the breakage.

some questions from a user. please point out where I am misunderstanding things.

one of the big uses of virtualization will be to run things in sandboxes. when people do this they typically migrate the sandbox from system to system over time (working with chroot sandboxes I've seen some HUGE skews between what's running in the sandbox and what's running in the host). If the interface between the guest kernel and the hypervisor isn't fixed, how could someone run a 2.6.19 guest and a 2.6.30 guest at the same time?

if it's only a source-level API, this implies that when you move your host kernel from 2.6.19 to 2.6.25 you would need to recompile your 2.6.19 guest kernel to support the modifications. where are the patches going to come from to do this?

It seems to me from reading this thread that PowerPC and S390 have an ABI defined, specifically defined by the hardware in the case of PowerPC and by the externally maintained, Linux-independent hypervisor (which is effectively the hardware) in the case of the s390.

If there's going to be long-term compatibility between different hosts and guests there need to be some limits to what can change.

needing to uprev the host when you uprev a guest is acceptable.

needing to uprev a guest when you uprev a host is not.

this basically boils down to 'once you expose an interface to a user it can't change', with the interface that's being exposed being the calls that the guest makes to the host.

David Lang
David Lang wrote:
> if it's only a source-level API, this implies that when you move your host kernel from 2.6.19 to 2.6.25 you would need to recompile your 2.6.19 guest kernel to support the modifications. where are the patches going to come from to do this?

No, the low-level interface between the kernel and the hypervisor is an ABI, which will be as stable as your hypervisor author/vendor wants it to be (which is generally "very stable"). The question is whether that low-level interface is exposed to the rest of the kernel directly, or hidden behind a kernel-internal source-level API.

J
On 8/4/06, David Lang <dlang@digitalinsight.com> wrote:
> On Fri, 4 Aug 2006, Rusty Russell wrote:
>> On Thu, 2006-08-03 at 22:53 -0700, Andrew Morton wrote:
>>> On Fri, 04 Aug 2006 15:04:35 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
>>>> On Thu, 2006-08-03 at 21:18 -0700, Andrew Morton wrote:
>>>> Everywhere in the kernel where we have multiple implementations we want to select at runtime, we use an ops struct. Why should the choice of Xen/VMI/native/other be any different?
>>>
>>> VMI is being proposed as an appropriate way to connect Linux to Xen. If that is true then no other glue is needed.
>>
>> Sorry, this is wrong. VMI was proposed as the appropriate way to connect Linux to Xen, *and* native, *and* VMWare's hypervisors (and others). This way one Linux binary can boot on all three, using different VMI blobs.
>>
>>>> Yes, we could force native and Xen to work via VMI, but the result would be less clear, less maintainable, and gratuitously different from elsewhere in the kernel.
>>>
>>> I suspect others would disagree with that. We're at the stage of needing to see code to settle this.
>>
>> Wrong again. We've *seen* the code for VMI, and fairly hairy. Seeing the native-implementation and Xen-implementation VMI blobs will not make it less hairy!
>>
>>>> And, of course, unlike paravirt_ops where we can change and add ops at any time, we can't similarly change the VMI interface because it's an ABI (that's the point: the hypervisor can provide the implementation).
>>>
>>> hm. Dunno. ABIs can be uprevved. Perhaps.
>>
>> Certainly VMI can be. But I'd prefer to leave the excellent hackers at VMWare with the task of maintaining their ABI, and let Linux hackers (most of whom will run native) manipulate paravirt_ops freely.
>>
>> We're not good at maintaining ABIs. We're going to be especially bad at maintaining an ABI when the 99% of us running native will never notice the breakage.
>
> some questions from a user. please point out where I am misunderstanding things.

asking is the smart way :)

> one of the big uses of virtualization will be to run things in sandboxes. when people do this they typically migrate the sandbox from system to system over time (working with chroot sandboxes I've seen some HUGE skews between what's running in the sandbox and what's running in the host). If the interface between the guest kernel and the hypervisor isn't fixed, how could someone run a 2.6.19 guest and a 2.6.30 guest at the same time?
>
> if it's only a source-level API, this implies that when you move your host kernel from 2.6.19 to 2.6.25 you would need to recompile your 2.6.19 guest kernel to support the modifications. where are the patches going to come from to do this?
>
> It seems to me from reading this thread that PowerPC and S390 have an ABI defined, specifically defined by the hardware in the case of PowerPC and by the externally maintained, Linux-independent hypervisor (which is effectively the hardware) in the case of the s390.

the trick with ppc, s390, m68k... is that they were defined since day zero (*simplifies 68000/68010 history here*) so that when you run as a non-privileged task and try to execute a privileged instruction, the hardware traps and the OS gets control. x86 wasn't, since it had some instructions that let non-privileged code detect that it was not running privileged, thus barring any way for a hypervisor to remain invisible.
this is solved on x86 and x86_64 with the new extensions.

> If there's going to be long-term compatibility between different hosts and guests there need to be some limits to what can change.
>
> needing to uprev the host when you uprev a guest is acceptable.
>
> needing to uprev a guest when you uprev a host is not.

Now, allowing this transparent acting is great since you can run your normal kernel as-is as a guest. But to get close to 100% speed, what you do is rewrite parts of the OS to be aware of the hypervisor, and establish a common way to talk. Thus happens the work with the paravirt-ops. Just like you can use any filesystem under Linux because they have a well-defined interface to the rest of the kernel, the paravirt-ops are the way we are working to define an interface so that the rest of the kernel can be ignorant of whether it's running on the bare metal or as a guest.

Then, if you needed to run say 2.6.19 with hypervisor A-1.0, you just need to write paravirt-ops which talk and translate between 2.6.19 and A-1.0. If 5 years later you are still running A-1.0 and want to run a 2.6.28 guest, then you would just need to write the paravirt-ops between 2.6.28 and A-1.0, with no need to modify the rest of the code or the hypervisor.

At the moment we only have 1 GPL hypervisor and 1 binary one. Then maybe it's needed to define if Linux should help run under binary hypervisors, but imagine instead of this one, we had the usual Ghyper vs Khyper separation. We would prefer to give the same adaptations to both of them and abstract them away just like we do with filesystems.

> this basically boils down to 'once you expose an interface to a user it can't change', with the interface that's being exposed being the calls that the guest makes to the host.

Yes, that's the reason some mentioned ppc, sparc, s390... because they have been doing this longer than us and we could consider adopting some of their designs (just like we did for POSIX system calls ;)

> David Lang

--
Greetz, Antonio Vargas aka winden of network
On Fri, 4 Aug 2006, Antonio Vargas wrote:
>> If there's going to be long-term compatibility between different hosts and guests there need to be some limits to what can change.
>>
>> needing to uprev the host when you uprev a guest is acceptable.
>>
>> needing to uprev a guest when you uprev a host is not.
>
> Now, allowing this transparent acting is great since you can run your normal kernel as-is as a guest. But to get close to 100% speed, what you do is rewrite parts of the OS to be aware of the hypervisor, and establish a common way to talk.

I understand this, but for example a UML 2.6.10 kernel will continue to run unmodified on top of a 2.6.17 kernel; the ABI used is stable. however, if you have a 2.6.10 host with a 2.6.10 UML guest and want to run a 2.6.17 guest you may (but not necessarily must) have to upgrade the host to 2.6.17 or later.

> Thus happens the work with the paravirt-ops. Just like you can use any filesystem under Linux because they have a well-defined interface to the rest of the kernel, the paravirt-ops are the way we are working to define an interface so that the rest of the kernel can be ignorant of whether it's running on the bare metal or as a guest.
>
> Then, if you needed to run say 2.6.19 with hypervisor A-1.0, you just need to write paravirt-ops which talk and translate between 2.6.19 and A-1.0. If 5 years later you are still running A-1.0 and want to run a 2.6.28 guest, then you would just need to write the paravirt-ops between 2.6.28 and A-1.0, with no need to modify the rest of the code or the hypervisor.

who is going to be writing all these interface layers to connect each kernel version to each hypervisor version? and please note, I am not just considering Xen and vmware as hypervisors; a vanilla Linux kernel is the hypervisor for UML. so just stating that the hypervisor maintainers need to do this implies that the kernel maintainers would be required to do this.

also I'm looking at the more likely case that 5 years from now you may still be running 2.6.19, but need to upgrade to hypervisor A-5.8 (to support a different client). you don't want to have to try and recompile the 2.6.19 kernel to keep using it.

> At the moment we only have 1 GPL hypervisor and 1 binary one. Then maybe it's needed to define if Linux should help run under binary hypervisors, but imagine instead of this one, we had the usual Ghyper vs Khyper separation. We would prefer to give the same adaptations to both of them and abstract them away just like we do with filesystems.

you have three hypervisors that I know of: Linux, Xen (multiple versions), and VMware, each with (mostly) incompatible guests.

>> this basically boils down to 'once you expose an interface to a user it can't change', with the interface that's being exposed being the calls that the guest makes to the host.
>
> Yes, that's the reason some mentioned ppc, sparc, s390... because they have been doing this longer than us and we could consider adopting some of their designs (just like we did for POSIX system calls ;)

I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.

David Lang
David Lang wrote:
> I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.

you are right that the interface to the HV should be stable. But those are going to be specific to the HV; the paravirt_ops allows the kernel to smoothly deal with having different HVs. So in a way it's an API interface to allow the kernel to deal with multiple different ABIs that exist today and will exist in the future.
On Fri, Aug 04, 2006 at 12:06:28PM -0700, David Lang wrote:
> I understand this, but for example a UML 2.6.10 kernel will continue to run unmodified on top of a 2.6.17 kernel; the ABI used is stable. however, if you have a 2.6.10 host with a 2.6.10 UML guest and want to run a 2.6.17 guest you may (but not necessarily must) have to upgrade the host to 2.6.17 or later.

Why might you have to do that?

Jeff
On Fri, 4 Aug 2006, Arjan van de Ven wrote:
> David Lang wrote:
>> I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.
>
> you are right that the interface to the HV should be stable. But those are going to be specific to the HV; the paravirt_ops allows the kernel to smoothly deal with having different HVs. So in a way it's an API interface to allow the kernel to deal with multiple different ABIs that exist today and will exist in the future.

so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.

where A could be bare hardware, B could be Xen 2, C could be Xen 3, D could be vmware, E could be vanilla Linux, etc.

this sounds like something that the distros would not support; they would pick their one hypervisor to support and leave out the others. the big problem with this is that the preferred hypervisor will change over time and people will be left with incompatible choices (or having to compile their own kernels, including having to recompile older kernels to support newer hypervisors).

David Lang
On Fri, 4 Aug 2006, Jeff Dike wrote:
> On Fri, Aug 04, 2006 at 12:06:28PM -0700, David Lang wrote:
>> I understand this, but for example a UML 2.6.10 kernel will continue to run unmodified on top of a 2.6.17 kernel; the ABI used is stable. however, if you have a 2.6.10 host with a 2.6.10 UML guest and want to run a 2.6.17 guest you may (but not necessarily must) have to upgrade the host to 2.6.17 or later.
>
> Why might you have to do that?

take this with a grain of salt; I'm not saying the particular versions I'm listing would require this.

if your new guest kernel wants to use some new feature (SKAS3, time virtualization, etc) but the older host kernel didn't support some system call necessary to implement it, you may need to upgrade the host kernel to one that provides the new features.

David Lang
David Lang wrote:
> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.
>
> where A could be bare hardware, B could be Xen 2, C could be Xen 3, D could be vmware, E could be vanilla Linux, etc.

Yes, but you can compile one kernel for any set of hypervisors, so if you want both Xen and VMI, then compile both in. (You always get bare hardware support.)

> this sounds like something that the distros would not support; they would pick their one hypervisor to support and leave out the others. the big problem with this is that the preferred hypervisor will change over time and people will be left with incompatible choices (or having to compile their own kernels, including having to recompile older kernels to support newer hypervisors).

Why? That's like saying that distros will only bother to compile in one scsi driver.

The hypervisor driver is trickier than a normal kernel device driver, because in general it needs to be present from very early in boot, which precludes it from being a normal module. There's hope that we'll be able to support hypervisor drivers as boot-time grub/multiboot modules, so you'll be able to compile up a new hypervisor driver for a particular kernel and use it without recompiling the whole thing.

J
On Fri, 4 Aug 2006, Jeremy Fitzhardinge wrote:
>> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.
>>
>> where A could be bare hardware, B could be Xen 2, C could be Xen 3, D could be vmware, E could be vanilla Linux, etc.
>
> Yes, but you can compile one kernel for any set of hypervisors, so if you want both Xen and VMI, then compile both in. (You always get bare hardware support.)

how can I compile in support for Xen4 on my 2.6.18 kernel? after all, xen 2 and xen3 are incompatible hypervisors, so why wouldn't xen4 be (and I realize there is no xen4 yet, but there is likely to be one during the time virtual servers created with 2.6.18 are still running)?

>> this sounds like something that the distros would not support; they would pick their one hypervisor to support and leave out the others. the big problem with this is that the preferred hypervisor will change over time and people will be left with incompatible choices (or having to compile their own kernels, including having to recompile older kernels to support newer hypervisors).
>
> Why? That's like saying that distros will only bother to compile in one scsi driver.
>
> The hypervisor driver is trickier than a normal kernel device driver, because in general it needs to be present from very early in boot, which precludes it from being a normal module. There's hope that we'll be able to support hypervisor drivers as boot-time grub/multiboot modules, so you'll be able to compile up a new hypervisor driver for a particular kernel and use it without recompiling the whole thing.

distros don't offer kernels with all options today, why would they in the future? (how many distros offer separate 486/586/K6/K7/Pentium/P2/P3/P4 kernels? none. they offer a least-common-denominator kernel or two instead)

I am also missing something here. how can a system be compiled to do several different things for the same privileged opcode (including running that opcode) without turning that area of code into a performance pig as it checks for each possible hypervisor being present?

David Lang
David Lang wrote:
> how can I compile in support for Xen4 on my 2.6.18 kernel? after all, xen 2 and xen3 are incompatible hypervisors, so why wouldn't xen4 be (and I realize there is no xen4 yet, but there is likely to be one during the time virtual servers created with 2.6.18 are still running)?

Firstly, backwards compatibility is very important; I would guess that if there were a Xen4 ABI, the hypervisor would still support Xen3 for some time. Secondly, if someone goes to the effort of backporting a Xen4 paravirtops driver for 2.6.18, then you could compile it in.

> I am also missing something here. how can a system be compiled to do several different things for the same privileged opcode (including running that opcode) without turning that area of code into a performance pig as it checks for each possible hypervisor being present?

Conceptually, the paravirtops structure is a structure of pointers to functions which get filled in at runtime to support whatever hypervisor we're running over. But it also has the means to patch inline versions of the appropriate code sequences for performance-critical operations.

J
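To make the second half of that answer concrete: below is a rough, user-space model of the call-site patching bookkeeping, with all names, addresses and op lists made up for illustration; it is not the code from the patches. The idea is that annotated call sites are recorded in a table at build time and, during early boot, the chosen backend's instructions are written over the performance-critical ones. Here the "patching" is only recorded as text, since the point is the data flow rather than the machine-code surgery.

```c
#include <stdio.h>

/* Illustrative model of the binary-patching step (all names hypothetical).
 * Every annotated call site is recorded in a table; at boot, sites for
 * performance-critical ops are overwritten in place with either the
 * native instructions or a hypervisor-specific sequence, instead of
 * being left as indirect calls through the ops table. */

enum pv_op { PV_OP_CLI, PV_OP_STI, PV_OP_NR };

struct patch_site {
	enum pv_op op;        /* which operation this call site performs */
	unsigned long addr;   /* address of the call instruction */
	unsigned int len;     /* bytes available to patch at the site */
	char patched[32];     /* what ended up at the site (mnemonic here) */
};

/* Per-backend replacement text for each op; NULL means "leave the
 * indirect call through the ops structure in place". */
static const char *native_inline[PV_OP_NR] = { "cli", "sti" };
static const char *xen_inline[PV_OP_NR]    = { NULL,  NULL };  /* stays a call */

static struct patch_site sites[] = {
	{ PV_OP_CLI, 0xc0101000UL, 7, "" },
	{ PV_OP_STI, 0xc0102340UL, 7, "" },
};

static void apply_patches(const char *inline_ops[PV_OP_NR])
{
	for (size_t i = 0; i < sizeof(sites) / sizeof(sites[0]); i++) {
		const char *repl = inline_ops[sites[i].op];

		/* The kernel would memcpy real opcodes over the call
		 * instruction and pad with nops; here we only record
		 * which choice was made. */
		snprintf(sites[i].patched, sizeof(sites[i].patched),
		         "%s", repl ? repl : "call *paravirt_ops.op");
	}
}

int main(void)
{
	apply_patches(native_inline);   /* or xen_inline when under Xen */
	for (size_t i = 0; i < sizeof(sites) / sizeof(sites[0]); i++)
		printf("site %#lx -> %s\n", sites[i].addr, sites[i].patched);
	return 0;
}
```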
On Fri, Aug 04, 2006 at 02:26:20PM -0700, Jeremy Fitzhardinge wrote:
>> I am also missing something here. how can a system be compiled to do several different things for the same privileged opcode (including running that opcode) without turning that area of code into a performance pig as it checks for each possible hypervisor being present?
>
> Conceptually, the paravirtops structure is a structure of pointers to functions which get filled in at runtime to support whatever hypervisor we're running over. But it also has the means to patch inline versions of the appropriate code sequences for performance-critical operations.

Perhaps Ulrich and Jakub should join this discussion, as the whole thing sounds like a rehash of the userland ld.so + glibc versioned ABI. glibc has weathered 64-bit LFS changes to open(), SYSENTER, and vdso.

Isn't this discussion entirely analogous (except for the patching of performance-critical sections, perhaps) to taking a binary compiled against glibc-2.0 back on Linux-2.2 and running it on glibc-2.4 + 2.6.17? Or OpenSolaris, for that matter?

Bill Rugolsky
On Fri, Aug 04, 2006 at 12:49:13PM -0700, David Lang wrote:
>> Why might you have to do that?
>
> take this with a grain of salt; I'm not saying the particular versions I'm listing would require this.
>
> if your new guest kernel wants to use some new feature (SKAS3, time virtualization, etc) but the older host kernel didn't support some system call necessary to implement it, you may need to upgrade the host kernel to one that provides the new features.

OK, yeah.

Just making sure you weren't thinking that the UML and host versions were tied together (although a modern distro won't boot on a 2.6 UML on a 2.4 host because UML's TLS needs TLS support on the host...).

Jeff
David Lang wrote:
> On Fri, 4 Aug 2006, Arjan van de Ven wrote:
>> David Lang wrote:
>>> I'm not commenting on any of the specifics of the interface calls (I trust you guys to make that be sane :-) I'm just responding to the idea that the interface actually needs to be locked down to an ABI as opposed to just source-level compatibility.
>>
>> you are right that the interface to the HV should be stable. But those are going to be specific to the HV; the paravirt_ops allows the kernel to smoothly deal with having different HVs. So in a way it's an API interface to allow the kernel to deal with multiple different ABIs that exist today and will exist in the future.
>
> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.

no, the actual implementation of the operation structure is dynamic and can be picked at runtime, so you can compile a kernel for A, B *and* C and at runtime the kernel picks the one you have.
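A sketch of how that runtime pick could look, again as an illustrative user-space model rather than the actual code from the patches: each compiled-in backend contributes a detection hook, and early boot walks the list, falling back to bare hardware. The backend names and the detection methods mentioned in the comments are assumptions.

```c
#include <stdio.h>

/* Hypothetical probe loop (not the actual code from the patches): each
 * compiled-in backend supplies a detect() hook; early boot walks the
 * list and the first backend that recognises its hypervisor installs
 * its ops.  Bare hardware is the fallback that always matches. */
struct pv_backend {
	const char *name;
	int (*detect)(void);          /* nonzero if this hypervisor is present */
	void (*install_ops)(void);    /* fill in the shared ops structure */
};

static int xen_detect(void)      { return 0; }  /* e.g. look for Xen's start info */
static int vmi_detect(void)      { return 0; }  /* e.g. scan for a VMI ROM */
static int native_detect(void)   { return 1; }  /* always matches, tried last */

static void xen_install(void)    { printf("installing Xen ops\n"); }
static void vmi_install(void)    { printf("installing VMI ops\n"); }
static void native_install(void) { printf("installing native ops\n"); }

static struct pv_backend backends[] = {
	{ "xen",    xen_detect,    xen_install    },
	{ "vmi",    vmi_detect,    vmi_install    },
	{ "native", native_detect, native_install },
};

int main(void)
{
	for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
		if (backends[i].detect()) {
			printf("running on: %s\n", backends[i].name);
			backends[i].install_ops();
			break;
		}
	}
	return 0;
}
```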
On Fri, 4 Aug 2006, Jeff Dike wrote:
> On Fri, Aug 04, 2006 at 12:49:13PM -0700, David Lang wrote:
>>> Why might you have to do that?
>>
>> take this with a grain of salt; I'm not saying the particular versions I'm listing would require this.
>>
>> if your new guest kernel wants to use some new feature (SKAS3, time virtualization, etc) but the older host kernel didn't support some system call necessary to implement it, you may need to upgrade the host kernel to one that provides the new features.
>
> OK, yeah.
>
> Just making sure you weren't thinking that the UML and host versions were tied together (although a modern distro won't boot on a 2.6 UML on a 2.4 host because UML's TLS needs TLS support on the host...).

this is exactly the type of thing that I think is acceptable. this is a case of a new client needing a new host. if you have a server running a bunch of 2.4 UMLs on a 2.4 host and want to add a 2.6 UML you can do it, because you can shift to a bunch of 2.4 UMLs (plus one 2.6 UML) running on a 2.6 host.

what I would be bothered by is if you weren't able to run a 2.4 UML on a 2.6 host because you had locked out the upgrade path.

Everyone needs to remember that this sort of thing does happen: Xen2 clients cannot run on a Xen3 host.

David Lang
On Fri, 4 Aug 2006, Arjan van de Ven wrote:
>> so if I understand this correctly we are saying that a kernel compiled to run on hypervisor A would need to be recompiled to run on hypervisor B, and recompiled again to run on hypervisor C, etc.
>
> no, the actual implementation of the operation structure is dynamic and can be picked at runtime, so you can compile a kernel for A, B *and* C and at runtime the kernel picks the one you have.

Ok, I was under the impression that this sort of thing was frowned upon for hotpath items (which I understand a good chunk of this would be).

this still leaves the question of old clients on new hypervisors that is continuing in other branches of this thread.

David Lang