David Vrabel
2013-Aug-28 17:30 UTC
[PATCH] xen/p2m: check MFN is in range before using the m2p table
From: David Vrabel <david.vrabel@citrix.com> On hosts with more than 168 GB of memory, a 32-bit guest may attempt to grant map an MFN that is not cannot lookup in its mapping of the m2p table. There is an m2p lookup as part of m2p_add_override() and m2p_remove_override(). The lookup falls off the end of the mapped portion of the m2p and (because the mapping is at the highest virtual address) wraps around and the lookup causes a fault on what appears to be a user space address. do_page_fault() (thinking it''s a fault to a userspace address), tries to lock mm->mmap_sem. If the gntdev device is used for the grant map, m2p_add_override() is called from from gnttab_mmap() with mm->mmap_sem already locked. do_page_fault() then deadlocks. The deadlock would most commonly occur when a 64-bit guest is started and xenconsoled attempts to grant map its console ring. Introduce mfn_to_pfn_no_overrides() which checks the MFN is within the mapped portion of the m2p table before accessing the table and use this in m2p_add_override(), m2p_remove_override(), and mfn_to_pfn() (which already had the correct range check). All faults caused by accessing the non-existant parts of the m2p are thus within the kernel address space and exception_fixup() is called without trying to lock mm->mmap_sem. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- arch/x86/include/asm/xen/page.h | 32 +++++++++++++++++++------------- arch/x86/xen/p2m.c | 10 ++++------ 2 files changed, 23 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h index 6aef9fb..17c7b6b 100644 --- a/arch/x86/include/asm/xen/page.h +++ b/arch/x86/include/asm/xen/page.h @@ -79,30 +79,35 @@ static inline int phys_to_machine_mapping_valid(unsigned long pfn) return get_phys_to_machine(pfn) != INVALID_P2M_ENTRY; } -static inline unsigned long mfn_to_pfn(unsigned long mfn) +static inline unsigned long mfn_to_pfn_no_overrides(unsigned long mfn) { unsigned long pfn; - int ret = 0; + int ret; - if (xen_feature(XENFEAT_auto_translated_physmap)) - return mfn; + if (unlikely(mfn >= machine_to_phys_nr)) + return ~0; - if (unlikely(mfn >= machine_to_phys_nr)) { - pfn = ~0; - goto try_override; - } - pfn = 0; /* * The array access can fail (e.g., device space beyond end of RAM). * In such cases it doesn''t matter what we return (we return garbage), * but we must handle the fault without crashing! */ ret = __get_user(pfn, &machine_to_phys_mapping[mfn]); -try_override: - /* ret might be < 0 if there are no entries in the m2p for mfn */ if (ret < 0) - pfn = ~0; - else if (get_phys_to_machine(pfn) != mfn) + return ~0; + + return pfn; +} + +static inline unsigned long mfn_to_pfn(unsigned long mfn) +{ + unsigned long pfn; + + if (xen_feature(XENFEAT_auto_translated_physmap)) + return mfn; + + pfn = mfn_to_pfn_no_overrides(mfn); + if (pfn != ~0 && get_phys_to_machine(pfn) != mfn) { /* * If this appears to be a foreign mfn (because the pfn * doesn''t map back to the mfn), then check the local override @@ -111,6 +116,7 @@ try_override: * m2p_find_override_pfn returns ~0 if it doesn''t find anything. */ pfn = m2p_find_override_pfn(mfn, ~0); + } /* * pfn is ~0 if there are no entries in the m2p for mfn or if the diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c index 95fb2aa..1563444 100644 --- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -878,7 +878,6 @@ int m2p_add_override(unsigned long mfn, struct page *page, unsigned long uninitialized_var(address); unsigned level; pte_t *ptep = NULL; - int ret = 0; pfn = page_to_pfn(page); if (!PageHighMem(page)) { @@ -925,8 +924,8 @@ int m2p_add_override(unsigned long mfn, struct page *page, * frontend pages while they are being shared with the backend, * because mfn_to_pfn (that ends up being called by GUPF) will * return the backend pfn rather than the frontend pfn. */ - ret = __get_user(pfn, &machine_to_phys_mapping[mfn]); - if (ret == 0 && get_phys_to_machine(pfn) == mfn) + pfn = mfn_to_pfn_no_overrides(mfn); + if (get_phys_to_machine(pfn) == mfn) set_phys_to_machine(pfn, FOREIGN_FRAME(mfn)); return 0; @@ -941,7 +940,6 @@ int m2p_remove_override(struct page *page, unsigned long uninitialized_var(address); unsigned level; pte_t *ptep = NULL; - int ret = 0; pfn = page_to_pfn(page); mfn = get_phys_to_machine(pfn); @@ -1019,8 +1017,8 @@ int m2p_remove_override(struct page *page, * the original pfn causes mfn_to_pfn(mfn) to return the frontend * pfn again. */ mfn &= ~FOREIGN_FRAME_BIT; - ret = __get_user(pfn, &machine_to_phys_mapping[mfn]); - if (ret == 0 && get_phys_to_machine(pfn) == FOREIGN_FRAME(mfn) && + pfn = mfn_to_pfn_no_overrides(mfn); + if (get_phys_to_machine(pfn) == FOREIGN_FRAME(mfn) && m2p_find_override(mfn) == NULL) set_phys_to_machine(pfn, mfn); -- 1.7.2.5
Jan Beulich
2013-Aug-29 07:07 UTC
Re: [PATCH] xen/p2m: check MFN is in range before using the m2p table
>>> On 28.08.13 at 19:30, David Vrabel <david.vrabel@citrix.com> wrote: > From: David Vrabel <david.vrabel@citrix.com> > > On hosts with more than 168 GB of memory, a 32-bit guest may attempt > to grant map an MFN that is not cannot lookup in its mapping of the > m2p table. There is an m2p lookup as part of m2p_add_override() and > m2p_remove_override(). The lookup falls off the end of the mapped > portion of the m2p and (because the mapping is at the highest virtual > address) wraps around and the lookup causes a fault on what appears to > be a user space address. > > do_page_fault() (thinking it''s a fault to a userspace address), tries > to lock mm->mmap_sem. If the gntdev device is used for the grant map, > m2p_add_override() is called from from gnttab_mmap() with mm->mmap_sem > already locked. do_page_fault() then deadlocks. > > The deadlock would most commonly occur when a 64-bit guest is started > and xenconsoled attempts to grant map its console ring. > > Introduce mfn_to_pfn_no_overrides() which checks the MFN is within the > mapped portion of the m2p table before accessing the table and use > this in m2p_add_override(), m2p_remove_override(), and mfn_to_pfn() > (which already had the correct range check). > > All faults caused by accessing the non-existant parts of the m2p are > thus within the kernel address space and exception_fixup() is called > without trying to lock mm->mmap_sem. > > Signed-off-by: David Vrabel <david.vrabel@citrix.com>This all looks quite fine to me, but iiuc it only removes the deadlock, it doesn''t make things work. In order to make it work I think we''d need a hypercall (albeit even then it would work only up to 1Tb). Otoh the question of course is whether driving this big a system with a 32-bit Dom0 kernel is a reasonable thing in the first place. Jan
David Vrabel
2013-Aug-29 10:31 UTC
Re: [PATCH] xen/p2m: check MFN is in range before using the m2p table
On 29/08/13 08:07, Jan Beulich wrote:>>>> On 28.08.13 at 19:30, David Vrabel <david.vrabel@citrix.com> wrote: >> From: David Vrabel <david.vrabel@citrix.com> >> >> On hosts with more than 168 GB of memory, a 32-bit guest may attempt >> to grant map an MFN that is not cannot lookup in its mapping of the >> m2p table. There is an m2p lookup as part of m2p_add_override() and >> m2p_remove_override(). The lookup falls off the end of the mapped >> portion of the m2p and (because the mapping is at the highest virtual >> address) wraps around and the lookup causes a fault on what appears to >> be a user space address. >> >> do_page_fault() (thinking it''s a fault to a userspace address), tries >> to lock mm->mmap_sem. If the gntdev device is used for the grant map, >> m2p_add_override() is called from from gnttab_mmap() with mm->mmap_sem >> already locked. do_page_fault() then deadlocks. >> >> The deadlock would most commonly occur when a 64-bit guest is started >> and xenconsoled attempts to grant map its console ring. >> >> Introduce mfn_to_pfn_no_overrides() which checks the MFN is within the >> mapped portion of the m2p table before accessing the table and use >> this in m2p_add_override(), m2p_remove_override(), and mfn_to_pfn() >> (which already had the correct range check). >> >> All faults caused by accessing the non-existant parts of the m2p are >> thus within the kernel address space and exception_fixup() is called >> without trying to lock mm->mmap_sem. >> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com> > > This all looks quite fine to me, but iiuc it only removes the deadlock, > it doesn''t make things work. In order to make it work I think we''d > need a hypercall (albeit even then it would work only up to 1Tb). > Otoh the question of course is whether driving this big a system > with a 32-bit Dom0 kernel is a reasonable thing in the first place.It should be sufficient as either an MFN is owned by this domain and thus it is within the m2p mapping or it''s a foreign MFN and it will be looked up via the m2p overrides. However, I note I got one of the tests wrong in mfn_to_pfn() and it''s broken. It should be: pfn = mfn_to_pfn_no_overrides(mfn); if (pfn == ~0 || get_phys_to_machine(pfn) != mfn) pfn = m2p_find_override_pfn(mfn, ~0); Or the equivalent (because get_phys_to_machine(~0) returns INVALID_P2M_ENTRY which is != mfn: pfn = mfn_to_pfn_no_overrides(mfn); if (get_phys_to_machine(pfn) != mfn) pfn = m2p_find_override_pfn(mfn, ~0); David