> Also, switching page tables seems to be not so easy. Is it
> possible to switch atomically to a new, completely
> independent page table tree?
> i.e. the old tree is valid (of course), the new tree is too, but the
> pages of the old tree are not mapped read-only in the new
> tree (and vice versa).

If the pagetable you're running on isn't pinned, the refcounts will drop to zero when you switch away from it, hence it's only necessary for the destination pagetable to be internally consistent.

Ian

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 26 Feb 2006, at 22:49, Ian Pratt wrote:

>> Also, switching page tables seems to be not so easy. Is it
>> possible to switch atomically to a new, completely
>> independent page table tree?
>> i.e. the old tree is valid (of course), the new tree is too, but the
>> pages of the old tree are not mapped read-only in the new
>> tree (and vice versa).
>
> If the pagetable you're running on isn't pinned, the refcounts will drop
> to zero when you switch away from it, hence it's only necessary for the
> destination pagetable to be internally consistent.

We run on the old pagetables while we validate the new ones, so currently they need to map each other read-only. We could have a slow path in the pagetable switch code that switches to the idle pagetable, drops the old pagetable, checks the new pagetable, then switches again. On failure it would have to crash the domain, dealing with the fact that this VCPU no longer has a valid 'cr3'.

It's a bit of a pain, but it might simplify mm/init.c. I'm not sure by how much though.

 -- Keir
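The ordering of the proposed slow path can be sketched as a toy state machine. This is only a model of the four steps named above (switch to idle, drop old, check new, switch again or crash); every name in it (`vcpu_model`, `slow_switch`, the `valid` flag standing in for real validation) is hypothetical and none of it is Xen code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are Xen identifiers. */
enum cr3_state { CR3_GUEST, CR3_IDLE, CR3_NONE };

struct pagetable {
    bool valid;     /* stands in for "validation of this tree succeeds" */
    int  refcount;  /* references held on the tree */
};

struct vcpu_model {
    enum cr3_state    cr3;
    struct pagetable *current;        /* NULL while on the idle tables */
    bool              domain_crashed;
};

/* Slow path: idle -> drop old -> validate new -> switch (or crash). */
static bool slow_switch(struct vcpu_model *v, struct pagetable *new_pt)
{
    assert(v != NULL && new_pt != NULL);

    /* 1. Stop running on the old tree by moving to the idle tables. */
    struct pagetable *old = v->current;
    v->cr3 = CR3_IDLE;
    v->current = NULL;

    /* 2. Drop the old tree, so the new one need not map it read-only. */
    if (old != NULL && old->refcount > 0)
        old->refcount--;

    /* 3. Validate the new tree on its own. */
    if (!new_pt->valid) {
        /* 4a. This VCPU has no valid cr3 left: crash the domain. */
        v->cr3 = CR3_NONE;
        v->domain_crashed = true;
        return false;
    }

    /* 4b. Switch to the validated tree. */
    new_pt->refcount++;
    v->cr3 = CR3_GUEST;
    v->current = new_pt;
    return true;
}
```

The point of the model is the failure branch: once the old tree has been dropped there is nothing to fall back to, which is why the only remaining option is to crash the domain.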
Hi,

> It's a bit of a pain, but it might simplify mm/init.c. I'm not sure by
> how much though.

The kexec trampoline would be *much* easier then; every bit of state information I have to carry around is a pain. mm/init.c hopefully doesn't need any tweaks for kexec.

cheers,
  Gerd

-- 
Gerd 'just married' Hoffmann <kraxel@suse.de>
I'm the hacker formerly known as Gerd Knorr.
http://www.suse.de/~kraxel/just-married.jpeg
On 27 Feb 2006, at 15:34, Gerd Hoffmann wrote:

>> It's a bit of a pain, but it might simplify mm/init.c. I'm not sure by
>> how much though.
>
> The kexec trampoline would be *much* easier then; every bit of state
> information I have to carry around is a pain. mm/init.c hopefully
> doesn't need any tweaks for kexec.

Okay, done. Changeset 8999 (5adaa69...).

 -- Keir
Keir Fraser wrote:

> Okay, done. Changeset 8999 (5adaa69...).

Thanks. Unfortunately I can take down Xen now by trying to use that code path, see the log below.

cheers,
  Gerd
> Thanks. Unfortunately I can take down Xen now by trying to use that code
> path, see the log below.

> (XEN) ----[ Xen-3.0.0 Not tainted ]----
> (XEN) CPU: 1
> (XEN) EIP: e008:[<ff13383c>] free_l2_table+0x53/0x78

Xen tries to dereference a pointer it got from map_domain_page() here. I guess it points into nowhere because we are running on the idle page tables, not the domain page tables, at that point.

cheers,
  Gerd
On 1 Mar 2006, at 14:10, Gerd Hoffmann wrote:

>> Thanks. Unfortunately I can take down Xen now by trying to use that
>> code path, see the log below.
>
>> (XEN) ----[ Xen-3.0.0 Not tainted ]----
>> (XEN) CPU: 1
>> (XEN) EIP: e008:[<ff13383c>] free_l2_table+0x53/0x78
>
> Xen tries to dereference a pointer it got from map_domain_page() here.
> I guess it points into nowhere because we are running on the idle page
> tables, not the domain page tables, at that point.

Yep, that's the problem. I have an idea how to fix this. I'll get back to you.

 -- Keir
On 1 Mar 2006, at 15:03, Keir Fraser wrote:

>> Xen tries to dereference a pointer it got from map_domain_page() here.
>> I guess it points into nowhere because we are running on the idle page
>> tables, not the domain page tables, at that point.
>
> Yep, that's the problem. I have an idea how to fix this. I'll get back
> to you.

Okay, try changeset 9025:e0f66dbe4b13, which (hopefully) fixes map_domain_page() to work properly when the guest is running on the host idle page tables.

 -- Keir
> Okay, try changeset 9025:e0f66dbe4b13, which (hopefully) fixes
> map_domain_page() to work properly when the guest is running on the host
> idle page tables.

Thanks, it at least doesn't crash any more. It doesn't work yet either, but that is most likely a bug in setting up my page tables ...

btw: What happens if I try to install a GDT with zero entries? Can I do that? Are there some default code/data/stack segments (full 4G minus the Xen hole) available that I can use then? IIRC the domain builders don't set up a GDT, so there must be some which are used at boot time until the OS kernel installs its own ...

cheers,
  Gerd
On 2 Mar 2006, at 09:16, Gerd Hoffmann wrote:

> Thanks, it at least doesn't crash any more. It doesn't work yet
> either, but that is most likely a bug in setting up my page tables ...
>
> btw: What happens if I try to install a GDT with zero entries? Can I
> do that? Are there some default code/data/stack segments (full 4G minus
> the Xen hole) available that I can use then? IIRC the domain builders
> don't set up a GDT, so there must be some which are used at boot time
> until the OS kernel installs its own ...

The default segment descriptors are *always* available. A guest can only install up to a 14-page GDT: the 15th and 16th pages are reserved by Xen and always present.

 -- Keir
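The 14-page limit translates into an entry count with a little arithmetic. The sketch below assumes 4 KiB pages and 8-byte segment descriptors, as on 32-bit x86; the macro and function names are made up for illustration and are not taken from the Xen headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumptions: 4 KiB pages and 8-byte descriptors (32-bit x86). */
#define PAGE_SIZE_BYTES    4096u
#define DESCRIPTOR_BYTES   8u
#define GUEST_GDT_PAGES    14u   /* the 15th and 16th pages belong to Xen */

#define ENTRIES_PER_PAGE   (PAGE_SIZE_BYTES / DESCRIPTOR_BYTES)
#define MAX_GUEST_ENTRIES  (GUEST_GDT_PAGES * ENTRIES_PER_PAGE)

/* True if a guest GDT with nr_entries descriptors fits in 14 pages. */
static bool guest_gdt_size_ok(size_t nr_entries)
{
    return nr_entries <= MAX_GUEST_ENTRIES;
}

/* Number of pages a guest GDT of nr_entries descriptors occupies. */
static size_t gdt_pages_needed(size_t nr_entries)
{
    assert(guest_gdt_size_ok(nr_entries));
    return (nr_entries + ENTRIES_PER_PAGE - 1) / ENTRIES_PER_PAGE;
}
```

Under these assumptions a guest can install at most 14 * 512 = 7168 descriptors, and a GDT with zero entries is simply the smallest legal case, leaving only the Xen-provided default descriptors.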
> The default segment descriptors are *always* available. A guest can only
> install up to a 14-page GDT: the 15th and 16th pages are reserved by Xen
> and always present.

Ok, I managed to fix that up: switching to the default descriptors and killing the GDT works, so the GDT doesn't stop validation any more.

But I have run into the next bug now. I think we have a funny chicken-and-egg problem when validating a self-consistent but new page table set (which also maps itself) from scratch, see messages attached below.

cheers,
  Gerd
On 2 Mar 2006, at 11:34, Gerd Hoffmann wrote:

> Ok, I managed to fix that up: switching to the default descriptors and
> killing the GDT works, so the GDT doesn't stop validation any more.
>
> But I have run into the next bug now. I think we have a funny
> chicken-and-egg problem when validating a self-consistent but new page
> table set (which also maps itself) from scratch, see messages attached
> below.

Looks like your new pagetables have a writable mapping of MFN 18c95, which is the new page directory. I.e., the new p.t. set is not self-consistent?

 -- Keir
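The self-consistency rule at issue here, that no frame used as a page table may be mapped writable by the set itself, can be checked before handing the tables to Xen. The following is a minimal sketch over a flat array of PTEs, assuming non-PAE 32-bit entries with the standard x86 present (bit 0) and read/write (bit 1) flags. The function names and the flat-array representation are hypothetical simplifications, not the guest kernel's or Xen's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT  0x1u   /* x86 PTE bit 0 */
#define PTE_RW       0x2u   /* x86 PTE bit 1 */
#define PTE_SHIFT    12     /* 4 KiB frames, non-PAE 32-bit PTEs */

/* Is mfn one of the frames that make up the page table set? */
static bool is_pagetable_frame(uint32_t mfn,
                               const uint32_t *pt_mfns, size_t nr_pt)
{
    for (size_t i = 0; i < nr_pt; i++)
        if (pt_mfns[i] == mfn)
            return true;
    return false;
}

/*
 * Scan all PTEs of the set. Returns true and stores the offending MFN
 * if any present, writable PTE points at a page-table frame.
 */
static bool find_writable_pt_mapping(const uint32_t *ptes, size_t nr_ptes,
                                     const uint32_t *pt_mfns, size_t nr_pt,
                                     uint32_t *bad_mfn)
{
    assert(ptes != NULL && pt_mfns != NULL);

    for (size_t i = 0; i < nr_ptes; i++) {
        uint32_t pte = ptes[i];

        if (!(pte & PTE_PRESENT) || !(pte & PTE_RW))
            continue;                     /* not mapped, or read-only */

        uint32_t mfn = pte >> PTE_SHIFT;
        if (is_pagetable_frame(mfn, pt_mfns, nr_pt)) {
            if (bad_mfn != NULL)
                *bad_mfn = mfn;
            return true;
        }
    }
    return false;
}
```

A read-only mapping of the page directory passes this check, while the same mapping with the RW bit set is reported with the directory's MFN.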
> Looks like your new pagetables have a writable mapping of MFN 18c95,
> which is the new page directory. I.e., the new p.t. set is not
> self-consistent?

They were indeed. Wrong usage of pte_wrprotect() :-/

Next question: Is there some way to _clear_ the trap table?

cheers,
  Gerd
On 2 Mar 2006, at 16:36, Gerd Hoffmann wrote:

>> Looks like your new pagetables have a writable mapping of MFN 18c95,
>> which is the new page directory. I.e., the new p.t. set is not
>> self-consistent?
>
> They were indeed. Wrong usage of pte_wrprotect() :-/
>
> Next question: Is there some way to _clear_ the trap table?

Hehe. The interface wasn't so much designed with this in mind, but it can be done. Pass in an array of 256 trap_info structs, where the i'th struct contains:

  vector=i, flags=0, cs=0, addr=1

Note the address must be non-zero, as that otherwise marks the end of the list! You can put whatever non-zero value you like in that field -- it really doesn't matter. In fact you need to pass a list of *257* trap_info structs, the last of which has addr==0. :-)

The critical thing that disables the trap is flags==0: that sets the 'dpl' for the 'trap gate' to zero so it effectively is unusable.

 -- Keir
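Building that 257-entry list might look like the sketch below. The `trap_info` struct is declared locally so the example stands alone; its field layout is an assumption modelled on the fields named above, and a real guest would use the definition from the Xen public headers instead. `build_disabled_trap_table` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in; a real guest takes this from the Xen public headers. */
struct trap_info {
    uint8_t       vector;   /* exception/interrupt vector */
    uint8_t       flags;    /* 0 => dpl 0, gate unusable */
    uint16_t      cs;       /* code selector */
    unsigned long address;  /* handler address; 0 terminates the list */
};

#define NR_VECTORS 256

/* Fill 256 "disabled" entries plus the terminating entry. */
static void build_disabled_trap_table(struct trap_info table[NR_VECTORS + 1])
{
    assert(table != NULL);

    for (int i = 0; i < NR_VECTORS; i++) {
        table[i].vector  = (uint8_t)i;
        table[i].flags   = 0;
        table[i].cs      = 0;
        table[i].address = 1;   /* any non-zero value will do */
    }

    /* 257th entry: address == 0 marks the end of the list. */
    table[NR_VECTORS] = (struct trap_info){ 0, 0, 0, 0 };
}
```

In a guest kernel the filled array would then be handed to the set_trap_table hypercall; that call is left out here so the sketch compiles outside a Xen guest.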
> The critical thing that disables the trap is flags==0: that sets the
> 'dpl' for the 'trap gate' to zero so it effectively is unusable.

Well, Xen still tries to forward the trap to that vector then, but I want to avoid exactly that.

The problem I have is that my kernel faults somewhere, but I'm already so far into doing kexec that the usual Linux kernel trap handling isn't going to work any more. What I'd like to see is Xen printing a register dump, with EIP being the faulting instruction, not the (non-working) fault handler entry point.

Guess I have to hack Xen a bit for that ...

cheers,
  Gerd
On 3 Mar 2006, at 08:39, Gerd Hoffmann wrote:

>> The critical thing that disables the trap is flags==0: that sets the
>> 'dpl' for the 'trap gate' to zero so it effectively is unusable.
>
> Well, Xen still tries to forward the trap to that vector then, but I
> want to avoid exactly that.
>
> The problem I have is that my kernel faults somewhere, but I'm already
> so far into doing kexec that the usual Linux kernel trap handling isn't
> going to work any more. What I'd like to see is Xen printing a
> register dump, with EIP being the faulting instruction, not the
> (non-working) fault handler entry point.
>
> Guess I have to hack Xen a bit for that ...

Well, the trick I described is good for at least 'non-exceptions' (i.e., it'll disable 'int x' software interrupts). For exceptions, I guess you should simply avoid causing them? :-)

In fact, why not install a dummy handler for all vectors (0-255)? That is, a valid cs:eip that vectors to a small kexec function to print the state you desire. Setting flags==0 will ensure that softints don't work, but you can print out a useful message if you take a fault. You could try to put it in a safe-ish place so it continues to work even for the early stages of the new kernel boot.

 -- Keir
On 3 Mar 2006, at 09:14, Keir Fraser wrote:

> Well, the trick I described is good for at least 'non-exceptions'
> (i.e., it'll disable 'int x' software interrupts). For exceptions, I
> guess you should simply avoid causing them? :-)
>
> In fact, why not install a dummy handler for all vectors (0-255)? That
> is, a valid cs:eip that vectors to a small kexec function to print the
> state you desire. Setting flags==0 will ensure that softints don't
> work, but you can print out a useful message if you take a fault. You
> could try to put it in a safe-ish place so it continues to work even for
> the early stages of the new kernel boot.

Actually a clear-trap-table call is quite easy to add: I checked in a patch that will clear the entire trap table if you pass a NULL pointer to the set_trap_table hypercall. With that patch in place, if you take a fault you will get a register dump in Xen as desired. Changeset 9109:29f8c87...

 -- Keir
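The two behaviours of the call, install a zero-terminated list or clear everything on NULL, can be modelled in a few lines. This is a toy model of the semantics described in the message, not the code from the changeset: `trap_table_model`, `model_set_trap_table`, and `model_has_handler` are invented names, and the real hypercall additionally has to copy the list from guest memory and sanity-check it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the guest-visible trap descriptor. */
struct trap_info {
    uint8_t       vector;
    uint8_t       flags;
    uint16_t      cs;
    unsigned long address;  /* 0 terminates a list */
};

#define NR_VECTORS 256

/* Toy stand-in for the per-VCPU trap table kept by the hypervisor. */
struct trap_table_model {
    struct trap_info slot[NR_VECTORS];
};

/* NULL clears every slot; otherwise install entries until address == 0. */
static void model_set_trap_table(struct trap_table_model *m,
                                 const struct trap_info *traps)
{
    assert(m != NULL);

    if (traps == NULL) {
        memset(m->slot, 0, sizeof(m->slot));
        return;
    }

    for (; traps->address != 0; traps++)
        m->slot[traps->vector] = *traps;
}

/* A slot with address 0 means "no guest handler registered". */
static int model_has_handler(const struct trap_table_model *m, uint8_t vector)
{
    return m->slot[vector].address != 0;
}
```

With no handler registered for a vector there is nothing to forward a fault to, which is what leads to the register dump in Xen that the message mentions.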
> Actually a clear-trap-table call is quite easy to add: I checked in a
> patch that will clear the entire trap table if you pass a NULL pointer
> to the set_trap_table hypercall. With that patch in place, if you take a
> fault you will get a register dump in Xen as desired.

Thanks. For now I have a different problem: getting the userspace bits right. I'm not sure whether it is easier to teach kexec-tools what Xen is or to teach libxc what kexec is. On a quick check the latter looks easier.

stay tuned ;)

  Gerd
Hi,

Ok, one of the more interesting issues is the p2m map and ballooning; I'm not sure yet how best to address that. One problem is the "holes" in guest physical memory created by ballooning. The other is the time gap between loading and booting the kexec kernel (and the p2m table, which may change in between).

My first attempt to address that issue by avoiding it (features="auto_translated_physmap" ;) resulted in this:

  kernel BUG at drivers/xen/balloon/balloon.c:216!

For now I have a few questions:

 * Which events can change the p2m map? I think for domU that is only ballooning, right? For dom0, additionally the backend drivers (when mapping foreign pages). Anything else?
 * Is there some way to rebuild the p2m map from scratch using hypercalls?
 * Is there some easy way to "compress" the memory, i.e. move all pages to the start of (guest physical) memory?
 * Are unprivileged domains allowed to do Dom0 ops for DOM_SELF? getdomaininfo for example? Or hypercall_init?

cheers,
  Gerd
On Mon, Mar 06, 2006 at 03:15:00PM +0100, Gerd Hoffmann wrote:

> * Which events can change the p2m map? I think for domU that is only
>   ballooning, right? For dom0, additionally the backend drivers (when
>   mapping foreign pages). Anything else?

Networking as well, since network packet reception involves flipping pages between domains.

> * Is there some way to rebuild the p2m map from scratch using
>   hypercalls?

As far as I know, not completely reliably, since the P2M map is maintained by the guest and Xen is almost entirely uninvolved.

You could probably get most of the way there by retrieving a list of pages belonging to the guest (not sure if this can be done from within an unprivileged guest though, or only in Domain-0), then building a reverse map of entries in the M2P table. This will only work if all pages have accurate M2P entries (it certainly won't work for Domain-0, since it maps foreign pages, but might work for unprivileged domains).

--Michael Vrable
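The reverse-map idea can be sketched over plain arrays. The sketch assumes the list of MFNs owned by the guest has already been obtained somehow (the message leaves open whether an unprivileged guest can do that) and that the M2P table is readable as an array indexed by MFN. `rebuild_p2m` and `INVALID_ENTRY` are hypothetical names; the function is an illustration of the approach, not an existing Xen or Linux interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ENTRY UINT32_MAX   /* marks holes and unusable entries */

/*
 * Rebuild p2m[pfn] = mfn by inverting the M2P entries of every frame the
 * guest owns. Returns how many owned frames could not be placed, i.e.
 * frames whose M2P entry is missing, out of range, or already taken.
 */
static size_t rebuild_p2m(const uint32_t *owned_mfns, size_t nr_owned,
                          const uint32_t *m2p, size_t m2p_len,
                          uint32_t *p2m, size_t p2m_len)
{
    size_t skipped = 0;

    assert(owned_mfns != NULL && m2p != NULL && p2m != NULL);

    /* Start with every PFN marked as a hole. */
    for (size_t i = 0; i < p2m_len; i++)
        p2m[i] = INVALID_ENTRY;

    for (size_t i = 0; i < nr_owned; i++) {
        uint32_t mfn = owned_mfns[i];

        if (mfn >= m2p_len) {
            skipped++;
            continue;
        }

        uint32_t pfn = m2p[mfn];
        if (pfn >= p2m_len || p2m[pfn] != INVALID_ENTRY) {
            skipped++;              /* inaccurate or duplicate M2P entry */
            continue;
        }

        p2m[pfn] = mfn;
    }
    return skipped;
}
```

PFNs that no owned frame maps back to stay at `INVALID_ENTRY`, which is how ballooning holes show up in the result, and a non-zero return value signals the inaccurate M2P entries that make the method unreliable.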
> Actually a clear-trap-table call is quite easy to add: I checked in a
> patch that will clear the entire trap table if you pass a NULL pointer
> to the set_trap_table hypercall. With that patch in place, if you take a
> fault you will get a register dump in Xen as desired.

Well, I've tried that now, and got this:

(XEN) domain_crash_sync called from entry.S (ff147f35)
(XEN) Domain 1 (vcpu#0) crashed on cpu#1:
(XEN) Assertion '(diff != 0) || VM86_MODE(regs) || !RING_0(regs) || HVM_DOMAIN(current)' failed, line 35, file x86_32/traps.c
(XEN) BUG at x86_32/traps.c:35

(changeset 9137)

cheers,
  Gerd
On 7 Mar 2006, at 13:06, Gerd Hoffmann wrote:

>> Actually a clear-trap-table call is quite easy to add: I checked in a
>> patch that will clear the entire trap table if you pass a NULL pointer
>> to the set_trap_table hypercall. With that patch in place, if you
>> take a fault you will get a register dump in Xen as desired.
>
> Well, I've tried that now, and got this:
>
> (XEN) domain_crash_sync called from entry.S (ff147f35)
> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
> (XEN) Assertion '(diff != 0) || VM86_MODE(regs) || !RING_0(regs) ||
> HVM_DOMAIN(current)' failed, line 35, file x86_32/traps.c
> (XEN) BUG at x86_32/traps.c:35

The assertion is spurious. I've tightened up the assertion condition in changeset 9157:51c59d5d7... "Tighten up the assertion conditions in the GUEST_MODE() macro.".

 -- Keir
> The assertion is spurious. I've tightened up the assertion condition in
> changeset 9157:51c59d5d7... "Tighten up the assertion conditions in the
> GUEST_MODE() macro.".

Looks like I'm triggering lots of funny bugs, see log below ...

The current plan is to get kexec to work first with features=auto_translated_physmap, due to some non-trivial issues with handling the p2m map. That attempt killed Xen too ;)

cheers,
  Gerd