Steven Rostedt
2006-Nov-02 16:53 UTC
[Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
It has been discovered that i386 boxes with more than 4G of RAM would randomly crash. It was traced to the interface of blktap using gnttab_set_map_op. It would pass in the 64 bit pte entry, but the gnttab_set_map_op would only take a 32 bit (on i386) unsigned long as a parameter. So we lose the top 32bits. Luckily! The kernel/HV ABI used a uint64_t as the variable to pass the address. So this does *NOT* break the current kernel/HV ABI. But after the HV grabs the 64 bit address from the guest, it too calls a function that uses a unsigned long (32bits on i386) to pass that address with. So the HV side also chops off the top 64 bits of the variable. This patch updates both the linux-2.6-sparse tree and the xen HV to use uint64_t instead of unsigned long for those particular functions. This patch has been tested on RHEL5 Beta on a box with 12G i386. Signed-off-by: Steven Rostedt <srostedt@redhat.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Nov-02 18:16 UTC
[Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
On 2/11/06 16:53, "Steven Rostedt" <srostedt@redhat.com> wrote:> This patch updates both the linux-2.6-sparse tree and the xen HV to use > uint64_t instead of unsigned long for those particular functions. This > patch has been tested on RHEL5 Beta on a box with 12G i386. > > Signed-off-by: Steven Rostedt <srostedt@redhat.com>Nasty bug. At least it affects only blktap. Thanks. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2006-Nov-03 08:42 UTC
Re: [Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
>>> Steven Rostedt <srostedt@redhat.com> 02.11.06 17:53 >>> >It has been discovered that i386 boxes with more than 4G of RAM would >randomly crash. It was traced to the interface of blktap using >gnttab_set_map_op. > >It would pass in the 64 bit pte entry, but the gnttab_set_map_op would >only take a 32 bit (on i386) unsigned long as a parameter. So we lose >the top 32bits.Could you use maddr_t here rather than uint64_t? For non-PAE i386 Linux, especially when using CONFIG_REGPARM, adding a useless argument slot seems wasteful... Keir - will this go into 3.0.3? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2006-Nov-03 09:28 UTC
[Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 02.11.06 19:16 >>> >On 2/11/06 16:53, "Steven Rostedt" <srostedt@redhat.com> wrote: > >> This patch updates both the linux-2.6-sparse tree and the xen HV to use >> uint64_t instead of unsigned long for those particular functions. This >> patch has been tested on RHEL5 Beta on a box with 12G i386. >> >> Signed-off-by: Steven Rostedt <srostedt@redhat.com> > >Nasty bug. At least it affects only blktap. Thanks.Looking at this it would seem to me that the second call to gnttab_set_unmap_op in blkltap.c is missing the GNTMAP_contains_pte flag, affecting auto_translated_physmap guests. Signed-off-by: Jan Beulich <jbeulich@novell.com> Index: head-2006-10-30/drivers/xen/blktap/blktap.c ==================================================================--- head-2006-10-30.orig/drivers/xen/blktap/blktap.c 2006-10-26 12:10:54.000000000 +0200 +++ head-2006-10-30/drivers/xen/blktap/blktap.c 2006-11-03 10:00:44.000000000 +0100 @@ -908,8 +908,10 @@ static void fast_flush_area(pending_req_ return; } - gnttab_set_unmap_op(&unmap[invcount], - ptep, GNTMAP_host_map, + gnttab_set_unmap_op(&unmap[invcount], ptep, + GNTMAP_host_map + | GNTMAP_application_map + | GNTMAP_contains_pte, khandle->user); invcount++; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2006-Nov-03 09:56 UTC
Re: [Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
>>> Steven Rostedt <srostedt@redhat.com> 02.11.06 17:53 >>> >It has been discovered that i386 boxes with more than 4G of RAM would >randomly crash. It was traced to the interface of blktap using >gnttab_set_map_op. > >It would pass in the 64 bit pte entry, but the gnttab_set_map_op would >only take a 32 bit (on i386) unsigned long as a parameter. So we lose >the top 32bits. > >Luckily! The kernel/HV ABI used a uint64_t as the variable to pass the >address. So this does *NOT* break the current kernel/HV ABI. > >But after the HV grabs the 64 bit address from the guest, it too calls a >function that uses a unsigned long (32bits on i386) to pass that address >with. So the HV side also chops off the top 64 bits of the variable. > > >This patch updates both the linux-2.6-sparse tree and the xen HV to use >uint64_t instead of unsigned long for those particular functions. This >patch has been tested on RHEL5 Beta on a box with 12G i386. > >Signed-off-by: Steven Rostedt <srostedt@redhat.com>After integrating the kernel part of this patch in my tree, I found that almost the whole kernel got rebuilt. include/asm-{i386,x86_64}/mach-xen/asm/fixmap.h were needlessly including include/xen/gnttab.h. Removing this made necessary explicit inclusion of that header in tpm_xen.c, the build of which should not have succeeded on non-x86 architectures before. Signed-off-by: Jan Beulich <jbeulich@novell.com> Index: head-2006-10-30/drivers/char/tpm/tpm_xen.c ==================================================================--- head-2006-10-30.orig/drivers/char/tpm/tpm_xen.c 2006-10-30 13:36:39.000000000 +0100 +++ head-2006-10-30/drivers/char/tpm/tpm_xen.c 2006-11-03 10:41:32.000000000 +0100 @@ -41,6 +41,7 @@ #include <xen/evtchn.h> #include <xen/interface/grant_table.h> #include <xen/interface/io/tpmif.h> +#include <xen/gnttab.h> #include <xen/xenbus.h> #include "tpm.h" #include "tpm_vtpm.h" Index: head-2006-10-30/include/asm-i386/mach-xen/asm/fixmap.h ==================================================================--- head-2006-10-30.orig/include/asm-i386/mach-xen/asm/fixmap.h 2006-08-28 10:57:30.000000000 +0200 +++ head-2006-10-30/include/asm-i386/mach-xen/asm/fixmap.h 2006-11-03 10:30:39.000000000 +0100 @@ -27,7 +27,6 @@ extern unsigned long __FIXADDR_TOP; #include <asm/acpi.h> #include <asm/apicdef.h> #include <asm/page.h> -#include <xen/gnttab.h> #ifdef CONFIG_HIGHMEM #include <linux/threads.h> #include <asm/kmap_types.h> Index: head-2006-10-30/include/asm-x86_64/mach-xen/asm/fixmap.h ==================================================================--- head-2006-10-30.orig/include/asm-x86_64/mach-xen/asm/fixmap.h 2006-08-28 10:57:30.000000000 +0200 +++ head-2006-10-30/include/asm-x86_64/mach-xen/asm/fixmap.h 2006-11-03 10:30:50.000000000 +0100 @@ -14,7 +14,6 @@ #include <linux/config.h> #include <linux/kernel.h> #include <asm/apicdef.h> -#include <xen/gnttab.h> #include <asm/page.h> #include <asm/vsyscall.h> #include <asm/vsyscall32.h> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Rostedt
2006-Nov-03 13:43 UTC
Re: [Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
Jan Beulich wrote:>>>> Steven Rostedt <srostedt@redhat.com> 02.11.06 17:53 >>> >> It has been discovered that i386 boxes with more than 4G of RAM would >> randomly crash. It was traced to the interface of blktap using >> gnttab_set_map_op. >> >> It would pass in the 64 bit pte entry, but the gnttab_set_map_op would >> only take a 32 bit (on i386) unsigned long as a parameter. So we lose >> the top 32bits. > > Could you use maddr_t here rather than uint64_t? For non-PAE i386 > Linux, especially when using CONFIG_REGPARM, adding a useless > argument slot seems wasteful... >Actually, it makes no difference to me. In fact uint64_t was my third incarnation, since I wasn''t sure what the best would be. I started with unsigned long long, then switched to u64, and then noticed that since host_addr is uint64_t, that seemed the proper thing to use. So a maddr_t would work too. Do you want to do the patch, or would you like me to send another patch that would do this change? -- Steve _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2006-Nov-03 14:27 UTC
Re: [Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
>Do you want to do the patch, or would you like me to send another patch >that would do this change?Unless this was already committed, maybe Keir could do this on the fly? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Nov-03 20:43 UTC
Re: [Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
On 3/11/06 14:27, "Jan Beulich" <jbeulich@novell.com> wrote:>> Do you want to do the patch, or would you like me to send another patch >> that would do this change? > > Unless this was already committed, maybe Keir could do this on the fly?Already committed, but I''ll consider patches. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Rostedt
2006-Nov-03 22:46 UTC
Re: [Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
Jan Beulich wrote:>>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 02.11.06 19:16 >>> >> On 2/11/06 16:53, "Steven Rostedt" <srostedt@redhat.com> wrote: >> >>> This patch updates both the linux-2.6-sparse tree and the xen HV to use >>> uint64_t instead of unsigned long for those particular functions. This >>> patch has been tested on RHEL5 Beta on a box with 12G i386. >>> >>> Signed-off-by: Steven Rostedt <srostedt@redhat.com> >> Nasty bug. At least it affects only blktap. Thanks. > > Looking at this it would seem to me that the second call to > gnttab_set_unmap_op in blkltap.c is missing the > GNTMAP_contains_pte flag, affecting auto_translated_physmap > guests.It''s much worst than that! Without this patch, my patch is still broken. In the hypervisor this flag is checked, and will go down the wrong path when it is missing: int destroy_grant_host_mapping( u64 addr, unsigned long frame, unsigned int flags) { if ( flags & GNTMAP_contains_pte ) return destroy_grant_pte_mapping(addr, frame, current->domain); return destroy_grant_va_mapping(addr, frame, current); } You can see here that we call the va_mapping instead of the pte mapping. Which means that we once again truncate the top 32 bits of the address, since a va addr is expected to be "unsigned long" in size. Also, this can be bad, if the va addr is really virtual, and doesn''t match the actual offset shift (like handling high memory).> > Signed-off-by: Jan Beulich <jbeulich@novell.com> >please apply. Acked-by: Steven Rostedt <srostedt@redhat.com> -- Steve _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Nov-04 07:42 UTC
Re: [Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
On 3/11/06 10:46 pm, "Steven Rostedt" <srostedt@redhat.com> wrote:> It''s much worst than that! > > Without this patch, my patch is still broken. In the hypervisor this > flag is checked, and will go down the wrong path when it is missing:The hypervisor remembers the flags for itself, so the kernel can only confuse itself. Still, it was a bug. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Rostedt
2006-Nov-04 14:25 UTC
Re: [Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
Keir Fraser wrote:> On 3/11/06 10:46 pm, "Steven Rostedt" <srostedt@redhat.com> wrote: > >> It''s much worst than that! >> >> Without this patch, my patch is still broken. In the hypervisor this >> flag is checked, and will go down the wrong path when it is missing: > > The hypervisor remembers the flags for itself, so the kernel can only > confuse itself. Still, it was a bug. >Well the flag for the application map isn''t the problem. But I do see missing the flag for contains pte is a problem.>int create_grant_host_mapping( u64 addr, unsigned long frame, unsigned int flags) { l1_pgentry_t pte = l1e_from_pfn(frame, GRANT_PTE_FLAGS); if ( (flags & GNTMAP_application_map) ) l1e_add_flags(pte,_PAGE_USER); if ( !(flags & GNTMAP_readonly) ) l1e_add_flags(pte,_PAGE_RW); if ( flags & GNTMAP_contains_pte ) return create_grant_pte_mapping(addr, pte, current); return create_grant_va_mapping(addr, pte, current); } int destroy_grant_host_mapping( u64 addr, unsigned long frame, unsigned int flags) { if ( flags & GNTMAP_contains_pte ) return destroy_grant_pte_mapping(addr, frame, current->domain); return destroy_grant_va_mapping(addr, frame, current); } So is there a difference between create_grant_pte_mapping and create_grant_va_mapping. As well as destroy_grant_pte_mapping and destroy_grant_va_mapping. So calling pte create, and then va destroy on the same mapping is not a bug? -- Steve _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Nov-04 16:23 UTC
Re: [Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
On 4/11/06 2:25 pm, "Steven Rostedt" <srostedt@redhat.com> wrote:> So is there a difference between create_grant_pte_mapping and > create_grant_va_mapping. As well as destroy_grant_pte_mapping and > destroy_grant_va_mapping. So calling pte create, and then va destroy on > the same mapping is not a bug?That would be a bug, if it were possible, which it''s not. ''flags'' is not a parameter to the gnttab_unmap operation. Xen remembers the flags from the original map operation. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Rostedt
2006-Nov-04 17:10 UTC
Re: [Xen-devel] Re: [PATCH] Fix >4G i386 PAE grant table interface
Keir Fraser wrote:> > > On 4/11/06 2:25 pm, "Steven Rostedt" <srostedt@redhat.com> wrote: > >> So is there a difference between create_grant_pte_mapping and >> create_grant_va_mapping. As well as destroy_grant_pte_mapping and >> destroy_grant_va_mapping. So calling pte create, and then va destroy on >> the same mapping is not a bug? > > That would be a bug, if it were possible, which it''s not. ''flags'' is not a > parameter to the gnttab_unmap operation. Xen remembers the flags from the > original map operation. >OK, took me some time to find what you mean: __gnttab_unmap_grant_ref( struct gnttab_unmap_grant_ref *op) { [...] map = &ld->grant_table->maptrack[op->handle]; [...] dom = map->domid; ref = map->ref; flags = map->flags; [...] if ( (op->host_addr != 0) && (flags & GNTMAP_host_map) ) { if ( (rc = destroy_grant_host_mapping(op->host_addr, frame, flags)) < 0 ) goto unmap_out; OK, but it can be a problem on the kernel side because of the Xen auto translate physmap feature. But not as bad as I thought. But it''s still good to be consistent. -- Steve _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2006-Nov-06 08:21 UTC
Re: [Xen-devel] [PATCH] Fix >4G i386 PAE grant table interface
>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 03.11.06 21:43 >>> >On 3/11/06 14:27, "Jan Beulich" <jbeulich@novell.com> wrote: > >>> Do you want to do the patch, or would you like me to send another patch >>> that would do this change? >> >> Unless this was already committed, maybe Keir could do this on the fly? > >Already committed, but I''ll consider patches.Don''t use a 64-bit addr parameter to gnttab_set_{,un}map_op() when the upper 32 bits will never be used (i.e. i386 non-PAE). Signed-off-by: Jan Beulich <jbeulich@novell.com> Index: head-2006-10-30/include/xen/gnttab.h ==================================================================--- head-2006-10-30.orig/include/xen/gnttab.h 2006-09-21 09:57:46.000000000 +0200 +++ head-2006-10-30/include/xen/gnttab.h 2006-11-03 09:56:58.000000000 +0100 @@ -118,7 +118,7 @@ int gnttab_suspend(void); int gnttab_resume(void); static inline void -gnttab_set_map_op(struct gnttab_map_grant_ref *map, uint64_t addr, +gnttab_set_map_op(struct gnttab_map_grant_ref *map, maddr_t addr, uint32_t flags, grant_ref_t ref, domid_t domid) { if (flags & GNTMAP_contains_pte) @@ -134,7 +134,7 @@ gnttab_set_map_op(struct gnttab_map_gran } static inline void -gnttab_set_unmap_op(struct gnttab_unmap_grant_ref *unmap, uint64_t addr, +gnttab_set_unmap_op(struct gnttab_unmap_grant_ref *unmap, maddr_t addr, uint32_t flags, grant_handle_t handle) { if (flags & GNTMAP_contains_pte) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel