Eric Zhang
2006-Nov-27 01:49 UTC
[Xen-users] Why the unmodified guest os can run on xen while hardware supports VT?
Hi, xen-users:

I am a newbie to Xen but I am very interested in it. I have read the documents on Xen's homepage but I still have a problem here:

I know that xen will run in "ring 0" in x86's protection mode and xen will run the guest OS in "ring 1", this is done by modifying the guest OS's kernel codes. My question is why we shouldn't modify the guest os's kernel while the hardware supports VT technology -- such as a Intel's CPU which supports VT.

Thanks for any suggestions.

--
Eric Zhang
2006-11-27

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Javier Guerra
2006-Nov-27 03:48 UTC
Re: [Xen-users] Why the unmodified guest os can run on xen while hardware supports VT?
On Sunday 26 November 2006 8:49 pm, Eric Zhang wrote:
> I know that xen will run in "ring 0" in x86's protection mode and
> xen will run the guest OS in "ring 1", this is done by modifying the
> guest OS's kernel codes. My question is why we shouldn't modify the
> guest os's kernel while the hardware supports VT technology -- such as a
> Intel's CPU which supports VT.

forgive me if i think your question doesn't make much sense if read literally. i guess it's just a language limitation; this is what i believe you wanted to ask:

"I know that Xen itself (the hypervisor) runs in ring 0 and the guest OS's kernel in ring 1; to do this, the guest OS's kernel must be modified (paravirtualization). My question is how we can run an unmodified guest OS's kernel when the hardware supports VT technology"

If this is what you want to know, let me try to explain a little:

The rings mechanism implemented by Intel processors since the 80386 (now called IA32 or simply x86) lets a piece of code 'manage' what a different piece of code can do, simply by running the managed code in a higher ring than the manager. the manager sets exception handlers that are called each time something special happens in the managed code, like memory page faults, instruction exceptions, etc.

unfortunately, x86 provides a limited set of rings (0 is the most privileged, 3 is the least); and there are some things that can only be done in ring 0 (mostly very low level hardware accesses). therefore, usual unmodified kernels are run only at ring 0.

other processors (e.g. almost anything from IBM) provide full virtualization facilities: a hypervisor can set up an environment basically identical to the bare processor (that's what MoL (mac on linux) does to run MacOS on top of PPC linux)

existing virtualization software, like VMWare, virtualPC, and Qemu (with kqemu) manages to work by doing a very complex set of tricks: setting several exception handlers to cover most of what the guest kernel might do, and scanning each piece of guest code before running it, replacing any instruction that would have an undesired effect with a call to an emulator that does what _should_ happen.

of course, there's some performance penalty, and very complex code. the fact that all these emulator/virtualization hybrids actually work (and very well) is quite amazing to me.

the new HVM extensions to the x86 architecture let the hypervisor set up new handlers for all the missing privileged instructions, effectively making it possible to run managed code in ring 0. i think you could think of it like creating a new ring 0.5 for the guest OS's kernel: it can do anything ring 0 does, but managed by a "real ring 0" where the hypervisor resides.

the rest of the problem is the emulation of the rest of the hardware: the chipset, PCI controller, network, hard disk, cdrom, graphics card, etc. for this, the Xen hypervisor forwards any hardware access to a slightly modified qemu process running in domain 0.

i hope this makes it all a bit clearer, and that i'm not too wrong on some points.

--
Javier
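As a rough illustration of the "scan the guest code and patch it before running" trick described above, here is a toy sketch in Python. It is only a model of the idea: the instruction names, the SENSITIVE set and the patch_block / run_block helpers are invented for this example and do not reflect how VMWare, virtualPC or kqemu are actually implemented.

```python
# Toy model of scan-and-patch virtualization.
# The mnemonics and the SENSITIVE set are assumptions made for illustration.
SENSITIVE = {"cli", "sti", "popf", "mov_to_cr3"}


def patch_block(block):
    """Scan a block of guest instructions before it runs and mark every
    sensitive instruction so that it is handed to the emulator instead."""
    return [("emulate", insn) if insn in SENSITIVE else ("native", insn)
            for insn in block]


def run_block(block, log):
    """'Execute' a patched block: harmless instructions go straight to the
    CPU, patched ones are diverted to the emulator."""
    for kind, insn in patch_block(block):
        if kind == "emulate":
            log.append(f"emulator handles {insn}")
        else:
            log.append(f"cpu runs {insn}")
    return log


if __name__ == "__main__":
    guest_code = ["mov", "add", "cli", "mov_to_cr3", "ret"]
    for line in run_block(guest_code, []):
        print(line)
```

The point of the sketch is only that the guest never gets to run a sensitive instruction directly; the real work (and the real complexity) is in finding those instructions in arbitrary machine code and emulating them correctly.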
Eric Zhang
2006-Nov-27 05:49 UTC
Re: [Xen-users] Why the unmodified guest os can run on xen while hardware supports VT?
Hi, Xen-users:

Thanks, Javier. Your explanation is very clear and that is exactly what I wanted. Sorry for my poor English and thank you again.

Eric Zhang
2006-11-27
Ulrich Windl
2006-Nov-27 08:01 UTC
Re: [Xen-users] Why the unmodified guest os can run on xen while hardware supports VT?
On 26 Nov 2006 at 22:48, Javier Guerra wrote:
> The rings mechanism implemented by Intel processors since the 80386 (now

I think it was implemented in the 80286 already (the first CPU with "protected mode").

Ulrich
Petersson, Mats
2006-Nov-27 10:29 UTC
RE: [Xen-users] Why the unmodified guest os can run on xen while hardware supports VT?
Javier,

Very good explanation. Some comments below...

> -----Original Message-----
> From: xen-users-bounces@lists.xensource.com
> [mailto:xen-users-bounces@lists.xensource.com] On Behalf Of
> Javier Guerra
> Sent: 27 November 2006 03:48
> To: xen-users@lists.xensource.com
> Subject: Re: [Xen-users] Why the unmodified guest os can run
> on xen while hardware supports VT?
>
> On Sunday 26 November 2006 8:49 pm, Eric Zhang wrote:
> > I know that xen will run in "ring 0" in x86's protection mode and
> > xen will run the guest OS in "ring 1", this is done by modifying the
> > guest OS's kernel codes. My question is why we shouldn't modify the
> > guest os's kernel while the hardware supports VT technology -- such as a
> > Intel's CPU which supports VT.

And AMD processors supporting AMD-V (aka SVM, formerly known as Pacifica).

> forgive me if i think your question doesn't make much sense if read
> literally. i guess it's just a language limitation; this is what i
> believe you wanted to ask:
>
> "I know that Xen itself (the hypervisor) runs in ring 0 and the guest
> OS's kernel in ring 1; to do this, the guest OS's kernel must be
> modified (paravirtualization). My question is how we can run an
> unmodified guest OS's kernel when the hardware supports VT technology"
>
> If this is what you want to know, let me try to explain a little:
>
> The rings mechanism implemented by Intel processors since the 80386
> (now called IA32 or simply x86) lets a piece of code 'manage' what a
> different piece of code can do, simply by running the managed code in
> a higher ring than the manager. the manager sets exception handlers
> that are called each time something special happens in the managed
> code, like memory page faults, instruction exceptions, etc.
>
> unfortunately, x86 provides a limited set of rings (0 is the most
> privileged, 3 is the least); and there are some things that can only
> be done in ring 0 (mostly very low level hardware accesses).
> therefore, usual unmodified kernels are run only at ring 0.

This is actually MORE than most other processors, which usually just have "supervisor" and "user" modes. The fact that it's more than 2 means that it's possible to use the "ring compression" model that Xen and many other non-hardware-based virtual machine monitors use.

I'm not aware of any major 32-bit operating system using anything other than rings 0 and 3.

> other processors (e.g. almost anything from IBM) provide full
> virtualization facilities: a hypervisor can set up an environment
> basically identical to the bare processor (that's what MoL (mac on
> linux) does to run MacOS on top of PPC linux)
>
> existing virtualization software, like VMWare, virtualPC, and Qemu
> (with kqemu) manages to work by doing a very complex set of tricks:
> setting several exception handlers to cover most of what the guest
> kernel might do, and scanning each piece of guest code before running
> it, replacing any instruction that would have an undesired effect
> with a call to an emulator that does what _should_ happen.
>
> of course, there's some performance penalty, and very complex code.
> the fact that all these emulator/virtualization hybrids actually work
> (and very well) is quite amazing to me.
>
> the new HVM extensions to the x86 architecture let the hypervisor set
> up new handlers for all the missing privileged instructions,
> effectively making it possible to run managed code in ring 0. i think
> you could think of it like creating a new ring 0.5 for the guest OS's
> kernel: it can do anything ring 0 does, but managed by a "real ring 0"
> where the hypervisor resides.

Whilst this is a good simplified answer, I'd like to say that it's "incorrect".

The hardware support for virtualization actually creates two sets of 0..3 rings. One set is the hypervisor's set of protection levels, which are "not managed"; the other is the "managed" set, which the guest OS runs in. There is an important difference: having four protection levels on "both sides" means that you can run something like Xen on the "hypervisor side", and still have all three rings available to run, for example, Windows in a "managed" environment.

--
Mats

> the rest of the problem is the emulation of the rest of the hardware:
> the chipset, PCI controller, network, hard disk, cdrom, graphics
> card, etc. for this, the Xen hypervisor forwards any hardware access
> to a slightly modified qemu process running in domain 0.
>
> i hope this makes it all a bit clearer, and that i'm not too wrong on
> some points.
>
> --
> Javier
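To make the difference between ring compression and the "two sets of rings" picture concrete, here is a small toy model in Python. Everything in it (the Context class, the PARAVIRT and HVM tables, the privileged_op function) is invented for illustration, and it simplifies heavily: it treats every privileged operation from the guest's ring 0 as intercepted, whereas real hardware lets the hypervisor choose what to intercept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    side: str  # "host" = hypervisor's set of rings, "guest" = managed set
    ring: int  # 0 (most privileged) .. 3 (least privileged)


# Ring compression (paravirtualization): a single set of rings, so the
# guest kernel has to be moved out of ring 0 and modified accordingly.
PARAVIRT = {
    "xen": Context("host", 0),
    "guest kernel": Context("host", 1),
    "guest user": Context("host", 3),
}

# Hardware-assisted (HVM): a second, managed set of rings, so an
# unmodified guest kernel can stay in (guest) ring 0.
HVM = {
    "xen": Context("host", 0),
    "guest kernel": Context("guest", 0),
    "guest user": Context("guest", 3),
}


def privileged_op(ctx):
    """Toy outcome of attempting a privileged operation in a context."""
    if ctx.ring != 0:
        return "fault"  # not ring 0: the operation is refused
    if ctx.side == "guest":
        return "intercepted by hypervisor"  # ring 0, but on the managed side
    return "executed"  # real ring 0 on the hypervisor side


if __name__ == "__main__":
    for name, layout in (("paravirt", PARAVIRT), ("hvm", HVM)):
        for who, ctx in layout.items():
            print(f"{name:8} {who:12} ring {ctx.ring}: {privileged_op(ctx)}")
```

In the paravirtualized layout the guest kernel's privileged operations simply fail unless the kernel has been changed to ask Xen instead; in the HVM layout the kernel believes it is in ring 0 and the hypervisor quietly steps in.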
Javier Guerra
2006-Nov-27 12:01 UTC
Re: [Xen-users] Why the unmodified guest os can run on xen while hardware supports VT?
On Monday 27 November 2006 5:29 am, Petersson, Mats wrote:
> > unfortunately, x86 provides a limited set of rings (0 is the most
> > privileged, 3 is the least); and there are some things that can only
> > be done in ring 0 (mostly very low level hardware accesses).
> > therefore, usual unmodified kernels are run only at ring 0.
>
> This is actually MORE than most other processors, which usually just
> have "supervisor" and "user" modes. The fact that it's more than 2
> means that it's possible to use the "ring compression" model that Xen
> and many other non-hardware-based virtual machine monitors use.
>
> I'm not aware of any major 32-bit operating system using anything
> other than rings 0 and 3.

right, any other architecture i've read of has only two modes. in principle, the rings structure (borrowed from MULTICS, i guess) could be more flexible; but, as you said, no OS used more than two. I said limited not in the sense of "too few", but meaning "number set in stone"; therefore not enough for full hardware virtualization.

what other processors (i only know about POWER and derivatives. other examples??) provide is orthogonal to the privilege system, since it's meant from the start to be used to contain full OSs, and not only for the kernel/userspace separation (the usual supervisor/user is for that)

> > the new HVM extensions to the x86 architecture let the hypervisor
> > set up new handlers for all the missing privileged instructions,
> > effectively making it possible to run managed code in ring 0. i
> > think you could think of it like creating a new ring 0.5 for the
> > guest OS's kernel: it can do anything ring 0 does, but managed by a
> > "real ring 0" where the hypervisor resides.
>
> Whilst this is a good simplified answer, I'd like to say that it's
> "incorrect".
> The hardware support for virtualization actually creates two sets of
> 0..3 rings. One set is the hypervisor's set of protection levels,
> which are "not managed"; the other is the "managed" set, which the
> guest OS runs in. There is an important difference: having four
> protection levels on "both sides" means that you can run something
> like Xen on the "hypervisor side", and still have all three rings
> available to run, for example, Windows in a "managed" environment.

that was the most speculative part, thanks for correcting it.

--
Javier