Jeremy Fitzhardinge
2007-Apr-18 17:49 UTC
[RFC] First (incomplete) cut of Xen paravirt binding
I've updated the patches at http://ozlabs.org/~rusty/paravirt/?mf=33ba6c4fce13;path=/ to carve out the basic shape of how I see all this fitting together.

These patches implement an initial set of Xen paravirt ops, as well as adapting head.S to set up a Xen-specific entrypoint. The head.S code does absolutely minimal setup, and then calls xen_start_kernel(). This installs the Xen paravirt ops, does some CPUID setup which would normally be done in head.S, calls set_fixaddr_top(), and then calls the normal start_kernel().

These patches compile, but are completely untested (though I recently booted it native on qemu, so it isn't completely broken). The Xen entrypoints are very incomplete; so far I've only done the trivial ones, including the ones which end up just using the native nopara implementations (which I've exported out of paravirt.c). I'm about to start on implementing the more substantial paravirt ops relating to interrupts, segmentation, and everything else needed to get to a minimally functioning kernel.

I also haven't really gone over the list of paravirt ops in detail to see if they're really what we want; I figure that will come up as I keep adapting Xen to the interface. But an obvious one seems to be that we should have explicit flush_tlb/multicast_flush_tlb calls rather than simply relying on reloading cr3.

Comments? Does this look like the way we want to go? So far it has been coming together very nicely...

J

(Rusty, it would be convenient if you enabled .tar.gz/.zip downloading in the hg server: put "allow_archive = gz, zip" in the [web] part of hgrc.)
On Wed, 2006-07-26 at 10:56 -0700, Jeremy Fitzhardinge wrote:
> I've updated the patches at
> http://ozlabs.org/~rusty/paravirt/?mf=33ba6c4fce13;path=/ to carve out
> the basic shape of how I see all this fitting together.
>
> These patches implement an initial set of Xen paravirt ops, as well as
> adapting head.S to set up a Xen-specific entrypoint. The head.S code
> does absolutely minimal setup, and then calls xen_start_kernel(). This
> installs the Xen paravirt ops, does some CPUID setup which would
> normally be done in head.S, calls set_fixaddr_top(), and then calls the
> normal start_kernel().

Cool! I want to make three changes to this over time:

1) Copy the ops structure in the asm, based on the value of %ebx (0 == xen, etc). Only copy the non-NULL entries, to make implementing ops simple (e.g. Xen doesn't want to override all ops). Xen wants %esi, so I might have to move that to %eax: I'll see how it works out.
2) Call *paravirt_ops.init rather than the hardcoded xen_start_kernel.
3) Rename from xen-head.S to paravirt-head.S.

> I also haven't really gone over the list of paravirt ops in detail to
> see if they're really what we want; I figure that will come up as I keep
> adapting Xen to the interface. But an obvious one seems to be that we
> should have explicit flush_tlb/multicast_flush_tlb calls rather than
> simply relying on reloading cr3.

Yep, and I thought about set_tss_desc, rather than lower-level ops, because Xen doesn't want it at all. But see how you go...

> Comments? Does this look like the way we want to go? So far it has
> been coming together very nicely...

Agreed!

> (Rusty, it would be convenient if you enabled .tar.gz/.zip downloading
> in the hg server: put "allow_archive = gz, zip" in the [web] part of hgrc.)

Done (your instructions were not quite right, but the man page worked wonders!).

Thanks,
Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
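To make Rusty's first proposed change concrete, here is a minimal C sketch of "only copy the non-NULL entries". It is an illustration only, not code from the patch series: the reply proposes doing the copy in assembly, and the structure `pv_ops_sketch`, its fields, the `PV_OVERRIDE` macro, and `pv_install()` are all hypothetical names. The stub functions return 0 for "native" and 1 for "Xen" purely so the effect of the merge is visible.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced ops table; the real structure has many more hooks. */
struct pv_ops_sketch {
    int (*init)(void);
    int (*flush_tlb)(void);
    int (*cpuid)(void);
};

/* Stand-ins for the native implementations (return 0 to mark "native"). */
static int native_init(void)      { return 0; }
static int native_flush_tlb(void) { return 0; }
static int native_cpuid(void)     { return 0; }

/* Stand-ins for hypervisor-specific implementations (return 1 to mark "Xen"). */
static int xen_init(void)      { return 1; }
static int xen_flush_tlb(void) { return 1; }

static const struct pv_ops_sketch native_ops = {
    .init      = native_init,
    .flush_tlb = native_flush_tlb,
    .cpuid     = native_cpuid,
};

/* The hypervisor table leaves .cpuid NULL: it is happy with the native version. */
static const struct pv_ops_sketch xen_ops = {
    .init      = xen_init,
    .flush_tlb = xen_flush_tlb,
    .cpuid     = NULL,
};

/* Override one field only if the hypervisor table supplies an entry for it. */
#define PV_OVERRIDE(dst, src, field)            \
    do {                                        \
        if ((src)->field != NULL)               \
            (dst)->field = (src)->field;        \
    } while (0)

/* Merge the non-NULL entries of 'overrides' into 'active'. */
static void pv_install(struct pv_ops_sketch *active,
                       const struct pv_ops_sketch *overrides)
{
    PV_OVERRIDE(active, overrides, init);
    PV_OVERRIDE(active, overrides, flush_tlb);
    PV_OVERRIDE(active, overrides, cpuid);
}
```

Starting from a copy of `native_ops` and calling `pv_install(&active, &xen_ops)` leaves `active.cpuid` pointing at the native stub while `init` and `flush_tlb` point at the Xen stubs. A hypervisor binding written this way only has to list the ops it actually wants to replace.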
Jeremy Fitzhardinge
2007-Apr-18 17:49 UTC
[RFC] First (incomplete) cut of Xen paravirt binding
Jeremy Fitzhardinge wrote:
> I've updated the patches at
> http://ozlabs.org/~rusty/paravirt/?mf=33ba6c4fce13;path=/ to carve out
> the basic shape of how I see all this fitting together.

Oh, I meant to point out that obviously I intend this posting to be a preview rather than soliciting a full review, but I think we'll start sending out patches for full review very soon.

The structure of the patch series is as follows:

Generic patches which are fairly uncontroversial, and are essentially the same as previously posted. For the most part, they don't do anything other than provide some structure for later patches to use:

    001-apply-to-page-range.patch
    001a-reboot-use-struct.patch
    002-sync-bitops.patch
    003-remove-ring0-assumptions.patch
    004-abstract-asm.patch
    005-cpuid-cleanup.patch
    unfix-fixmap.patch

The basic pieces of the paravirtualization layer, including an implementation which runs on native hardware. While the details of this will change over time, these patches produce a working kernel, and so are pretty close to being submittable - there's nothing like a concrete example to get people interested...

    006-paravirt_header.patch
    007-paravirt-descriptor-ops.patch
    008-paravirt-structure.patch
    009-binary-patch.patch
    010-paravirt-config-deps.patch

The final set is the Xen binding to these interfaces. It's going to be a while before these patches produce a booting kernel, but I've already made some structural design decisions which people will likely be interested in, particularly in 024-head.patch and 020-paravirt-xen.patch.

    021-vsyscall-note.patch
    022-config-xen.patch
    023-xen-interface.patch
    024-head.patch
    024-hypercall-interface.patch
    020-paravirt-xen.patch

Thanks,
J
Christian Limpach
2007-Apr-18 17:49 UTC
[RFC] First (incomplete) cut of Xen paravirt binding
> From: virtualization-bounces@lists.osdl.org
> [mailto:virtualization-bounces@lists.osdl.org] On Behalf Of
> Eric W. Biederman
>
> What I don't want to do is have a lot of variation in /sbin/kexec
> for loading linux under different hypervisors for no particular
> reason.

That's very reasonable, but at the same time: What we don't want to do is have a lot of variation in our domain builder for loading different operating systems for no particular reason.

And also, I think there are much better approaches to kexec in a paravirtual environment:
- kexec proper is best done by putting the image in memory and then having the control plane build a new domain using this image
- kexec/kdump is best done by dumping the memory from the control plane

And kexec of the control domain requires hypervisor support anyway.

> We can easily stuff a hypervisor id in the parameter block. So
> we can do: "paravirts[%esi->hypervisor]->init();"

If you make %esi->hypervisor a 32-bit ident and add a lookup of %esi->hypervisor by comparing it to paravirts.ident for each compiled-in paravirt ops structure, then this starts to look quite good.

If given the choice, I would still choose the per-hypervisor entry point though...

Christian
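The ident-based lookup Christian describes could look roughly like the sketch below. Everything in it is assumed for illustration: `struct boot_params_sketch` stands in for the parameter block that %esi points at, the `PV_IDENT_*` values are arbitrary, and `pv_lookup()` and `pv_boot()` are hypothetical helpers. The point is only that the kernel searches its compiled-in bindings by a 32-bit ident rather than indexing an array by a small number that every OS and domain builder would have to agree on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit idents; the values are arbitrary and only for illustration. */
#define PV_IDENT_XEN 0x58454e33u
#define PV_IDENT_VMI 0x564d4931u

/* Stand-in for the parameter block the boot code receives. */
struct boot_params_sketch {
    uint32_t hypervisor;   /* ident supplied by whoever loaded the kernel */
};

/* One entry per compiled-in binding: an ident plus its init hook. */
struct pv_binding {
    uint32_t ident;
    int (*init)(void);
};

/* Stub init hooks; the return values just identify which one ran. */
static int xen_init_stub(void) { return 1; }
static int vmi_init_stub(void) { return 2; }

static const struct pv_binding paravirts[] = {
    { PV_IDENT_XEN, xen_init_stub },
    { PV_IDENT_VMI, vmi_init_stub },
};

/* Linear search over the compiled-in bindings; NULL if the ident is unknown. */
static const struct pv_binding *pv_lookup(uint32_t ident)
{
    size_t i;

    for (i = 0; i < sizeof(paravirts) / sizeof(paravirts[0]); i++) {
        if (paravirts[i].ident == ident)
            return &paravirts[i];
    }
    return NULL;
}

/* Dispatch to the matching binding's init hook, or return -1 if none matches. */
static int pv_boot(const struct boot_params_sketch *bp)
{
    const struct pv_binding *binding = pv_lookup(bp->hypervisor);

    if (binding == NULL)
        return -1;
    return binding->init();
}
```

With this shape an unknown ident is detected explicitly instead of indexing past the end of a table, and a kernel built without a given binding simply fails the lookup.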
Christian Limpach
2007-Apr-18 17:49 UTC
[RFC] First (incomplete) cut of Xen paravirt binding
> From: virtualization-bounces@lists.osdl.org
> [mailto:virtualization-bounces@lists.osdl.org] On Behalf Of
> Rusty Russell
>
> 2) The hypervisor type is handed through %ebx to the startup_paravirt
> function at boot. Currently 0 = Xen 3.0, 1 = VMI.

I think this is a really, really bad idea, because you're either requiring that all operating systems agree on these values or forcing the hypervisor's domain builder to be operating-system specific. I really don't think either of those is very appealing, especially considering that there's a much simpler solution using hypervisor-specific entry points.

That said, %ebx = 0 works for us, so maybe I should just not care...

Christian
Christian Limpach
2007-Apr-18 17:49 UTC
[RFC] First (incomplete) cut of Xen paravirt binding
> > That's very reasonable, but at the same time:
> > What we don't want to do is have a lot of variation in our domain
> > builder for loading different operating systems for no particular
> > reason.
>
> Sure. A reasonable concern. Do you have an ELF format
> windows kernel?

Of course not, but we're also not trying to run Windows fully paravirtualized.

> My point being this is a fundamental point where variation happens.
> No boot-loader has been able to see OS vendors on a single format.
> I don't expect Xen will be able to change this trend.

We've managed to get NetBSD, FreeBSD, Solaris and recently even Plan9 to agree on using ELF with our __xen_guest section extension.

Christian