The reworking of p2m lookups to use get_gfn()/put_gfn() left the
shadow code not taking the p2m lock, even in cases where the p2m would
be updated (i.e. PoD).

In many cases, shadow code doesn't need the exclusion that
get_gfn()/put_gfn() provides, as it has its own interlocks against p2m
updates, but this is taking things too far, and can lead to crashes in
the PoD code.

Now that most shadow-code p2m lookups are done with explicitly
unlocked accessors, or with the get_page_from_gfn() accessor, which is
often lock-free, we can just turn this locking on.

The remaining locked lookups are in sh_page_fault() (in a path that's
almost always already serializing on the paging lock), and in
emulate_map_dest() (which can probably be updated to use
get_page_from_gfn()). They're not addressed here but may be in a
follow-up patch.

Signed-off-by: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/mm/p2m.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index de1dd82..2db73c9 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -218,8 +218,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
         return _mfn(gfn);
     }
 
-    /* For now only perform locking on hap domains */
-    if ( locked && (hap_enabled(p2m->domain)) )
+    if ( locked )
         /* Grab the lock here, don't release until put_gfn */
         gfn_lock(p2m, gfn, 0);
 
@@ -248,8 +247,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
 
 void __put_gfn(struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !p2m || !paging_mode_translate(p2m->domain)
-         || !hap_enabled(p2m->domain) )
+    if ( !p2m || !paging_mode_translate(p2m->domain) )
         /* Nothing to do in this case */
         return;
 
-- 
1.7.10.4
Jan Beulich
2013-Feb-21 14:29 UTC
Re: [PATCH] x86/mm: Take the p2m lock even in shadow mode.
>>> On 21.02.13 at 15:21, Tim Deegan <tim@xen.org> wrote:
> The reworking of p2m lookups to use get_gfn()/put_gfn() left the
> shadow code not taking the p2m lock, even in cases where the p2m would
> be updated (i.e. PoD).
>
> In many cases, shadow code doesn't need the exclusion that
> get_gfn()/put_gfn() provides, as it has its own interlocks against p2m
> updates, but this is taking things too far, and can lead to crashes in
> the PoD code.
>
> Now that most shadow-code p2m lookups are done with explicitly
> unlocked accessors, or with the get_page_from_gfn() accessor, which is
> often lock-free, we can just turn this locking on.
>
> The remaining locked lookups are in sh_page_fault() (in a path that's
> almost always already serializing on the paging lock), and in
> emulate_map_dest() (which can probably be updated to use
> get_page_from_gfn()). They're not addressed here but may be in a
> follow-up patch.

I assume once coming out of staging testing, this also needs to be
put into 4.2?

Jan
Tim Deegan
2013-Feb-21 14:33 UTC
Re: [PATCH] x86/mm: Take the p2m lock even in shadow mode.
At 14:29 +0000 on 21 Feb (1361456940), Jan Beulich wrote:
> >>> On 21.02.13 at 15:21, Tim Deegan <tim@xen.org> wrote:
> > The reworking of p2m lookups to use get_gfn()/put_gfn() left the
> > shadow code not taking the p2m lock, even in cases where the p2m would
> > be updated (i.e. PoD).
> >
> > In many cases, shadow code doesn't need the exclusion that
> > get_gfn()/put_gfn() provides, as it has its own interlocks against p2m
> > updates, but this is taking things too far, and can lead to crashes in
> > the PoD code.
> >
> > Now that most shadow-code p2m lookups are done with explicitly
> > unlocked accessors, or with the get_page_from_gfn() accessor, which is
> > often lock-free, we can just turn this locking on.
> >
> > The remaining locked lookups are in sh_page_fault() (in a path that's
> > almost always already serializing on the paging lock), and in
> > emulate_map_dest() (which can probably be updated to use
> > get_page_from_gfn()). They're not addressed here but may be in a
> > follow-up patch.
>
> I assume once coming out of staging testing, this also needs to be
> put into 4.2?

Yes, sorry I forgot to mention it.  Definitely a candidate for 4.2.

Tim.
Andres Lagar-Cavilla
2013-Feb-21 14:58 UTC
Re: [PATCH] x86/mm: Take the p2m lock even in shadow mode.
> The reworking of p2m lookups to use get_gfn()/put_gfn() left the
> shadow code not taking the p2m lock, even in cases where the p2m would
> be updated (i.e. PoD).
>
> In many cases, shadow code doesn't need the exclusion that
> get_gfn()/put_gfn() provides, as it has its own interlocks against p2m
> updates, but this is taking things too far, and can lead to crashes in
> the PoD code.
>
> Now that most shadow-code p2m lookups are done with explicitly
> unlocked accessors, or with the get_page_from_gfn() accessor, which is
> often lock-free, we can just turn this locking on.

For the record: this includes the page table walking that both the
emulate mapping callback and the page fault handler need to do.

> The remaining locked lookups are in sh_page_fault() (in a path that's
> almost always already serializing on the paging lock), and in
> emulate_map_dest() (which can probably be updated to use
> get_page_from_gfn()).

That is absolutely the case. Here is a rough patch:

diff -r 7324955b35ad xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4861,6 +4861,7 @@ static mfn_t emulate_gva_to_mfn(struct v
                                 struct sh_emulate_ctxt *sh_ctxt)
 {
     unsigned long gfn;
+    struct page_info *page;
     mfn_t mfn;
     p2m_type_t p2mt;
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
@@ -4878,22 +4879,30 @@ static mfn_t emulate_gva_to_mfn(struct v
 
     /* Translate the GFN to an MFN */
     ASSERT(!paging_locked_by_me(v->domain));
-    mfn = get_gfn(v->domain, _gfn(gfn), &p2mt);
-
+    page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_ALLOC);
+
+    /* Sanity checking */
+    if ( page == NULL )
+    {
+        return _mfn(BAD_GFN_TO_MFN);
+    }
     if ( p2m_is_readonly(p2mt) )
     {
-        put_gfn(v->domain, gfn);
+        put_page(page);
         return _mfn(READONLY_GFN);
     }
     if ( !p2m_is_ram(p2mt) )
     {
-        put_gfn(v->domain, gfn);
+        put_page(page);
         return _mfn(BAD_GFN_TO_MFN);
     }
-
+    mfn = page_to_mfn(page);
     ASSERT(mfn_valid(mfn));
+
     v->arch.paging.last_write_was_pt = !!sh_mfn_is_a_page_table(mfn);
-    put_gfn(v->domain, gfn);
+    /* Note shadow cannot page out or unshare this mfn, so the map won't
+     * disappear. Otherwise, caller must hold onto page until done. */
+    put_page(page);
     return mfn;
 }

> They're not addressed here but may be in a
> follow-up patch.

Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
or Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>?
see http://lists.xen.org/archives/html/xen-devel/2012-04/msg00962.html

Thanks
Andres

> Signed-off-by: Tim Deegan <tim@xen.org>
> ---
>  xen/arch/x86/mm/p2m.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index de1dd82..2db73c9 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -218,8 +218,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
>          return _mfn(gfn);
>      }
>
> -    /* For now only perform locking on hap domains */
> -    if ( locked && (hap_enabled(p2m->domain)) )
> +    if ( locked )
>          /* Grab the lock here, don't release until put_gfn */
>          gfn_lock(p2m, gfn, 0);
>
> @@ -248,8 +247,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
>
>  void __put_gfn(struct p2m_domain *p2m, unsigned long gfn)
>  {
> -    if ( !p2m || !paging_mode_translate(p2m->domain)
> -         || !hap_enabled(p2m->domain) )
> +    if ( !p2m || !paging_mode_translate(p2m->domain) )
>          /* Nothing to do in this case */
>          return;
>
> --
> 1.7.10.4
Tim Deegan
2013-Feb-21 17:42 UTC
Re: [PATCH] x86/mm: Take the p2m lock even in shadow mode.
At 09:58 -0500 on 21 Feb (1361440689), Andres Lagar-Cavilla wrote:
> > The remaining locked lookups are in sh_page_fault() (in a path that's
> > almost always already serializing on the paging lock), and in
> > emulate_map_dest() (which can probably be updated to use
> > get_page_from_gfn()).
> That is absolutely the case. Here is a rough patch:

Applied, thanks.  I've taken your S-o-b below to apply to this patch
too; hope that's OK.

Tim.

> diff -r 7324955b35ad xen/arch/x86/mm/shadow/multi.c
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -4861,6 +4861,7 @@ static mfn_t emulate_gva_to_mfn(struct v
>                                  struct sh_emulate_ctxt *sh_ctxt)
>  {
>      unsigned long gfn;
> +    struct page_info *page;
>      mfn_t mfn;
>      p2m_type_t p2mt;
>      uint32_t pfec = PFEC_page_present | PFEC_write_access;
> @@ -4878,22 +4879,30 @@ static mfn_t emulate_gva_to_mfn(struct v
>
>      /* Translate the GFN to an MFN */
>      ASSERT(!paging_locked_by_me(v->domain));
> -    mfn = get_gfn(v->domain, _gfn(gfn), &p2mt);
> -
> +    page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_ALLOC);
> +
> +    /* Sanity checking */
> +    if ( page == NULL )
> +    {
> +        return _mfn(BAD_GFN_TO_MFN);
> +    }
>      if ( p2m_is_readonly(p2mt) )
>      {
> -        put_gfn(v->domain, gfn);
> +        put_page(page);
>          return _mfn(READONLY_GFN);
>      }
>      if ( !p2m_is_ram(p2mt) )
>      {
> -        put_gfn(v->domain, gfn);
> +        put_page(page);
>          return _mfn(BAD_GFN_TO_MFN);
>      }
> -
> +    mfn = page_to_mfn(page);
>      ASSERT(mfn_valid(mfn));
> +
>      v->arch.paging.last_write_was_pt = !!sh_mfn_is_a_page_table(mfn);
> -    put_gfn(v->domain, gfn);
> +    /* Note shadow cannot page out or unshare this mfn, so the map won't
> +     * disappear. Otherwise, caller must hold onto page until done. */
> +    put_page(page);
>      return mfn;
>  }
>
> > They're not addressed here but may be in a
> > follow-up patch.
>
> Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
> or Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>?
> see http://lists.xen.org/archives/html/xen-devel/2012-04/msg00962.html
Andres Lagar-Cavilla
2013-Feb-21 17:54 UTC
Re: [PATCH] x86/mm: Take the p2m lock even in shadow mode.
On Feb 21, 2013, at 12:42 PM, Tim Deegan <tim@xen.org> wrote:

> At 09:58 -0500 on 21 Feb (1361440689), Andres Lagar-Cavilla wrote:
>>> The remaining locked lookups are in sh_page_fault() (in a path that's
>>> almost always already serializing on the paging lock), and in
>>> emulate_map_dest() (which can probably be updated to use
>>> get_page_from_gfn()).
>> That is absolutely the case. Here is a rough patch:
>
> Applied, thanks.  I've taken your S-o-b below to apply to this patch
> too; hope that's OK.

Cool, thanks
Andres

>
> Tim.
>
>> diff -r 7324955b35ad xen/arch/x86/mm/shadow/multi.c
>> --- a/xen/arch/x86/mm/shadow/multi.c
>> +++ b/xen/arch/x86/mm/shadow/multi.c
>> @@ -4861,6 +4861,7 @@ static mfn_t emulate_gva_to_mfn(struct v
>>                                  struct sh_emulate_ctxt *sh_ctxt)
>>  {
>>      unsigned long gfn;
>> +    struct page_info *page;
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>>      uint32_t pfec = PFEC_page_present | PFEC_write_access;
>> @@ -4878,22 +4879,30 @@ static mfn_t emulate_gva_to_mfn(struct v
>>
>>      /* Translate the GFN to an MFN */
>>      ASSERT(!paging_locked_by_me(v->domain));
>> -    mfn = get_gfn(v->domain, _gfn(gfn), &p2mt);
>> -
>> +    page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_ALLOC);
>> +
>> +    /* Sanity checking */
>> +    if ( page == NULL )
>> +    {
>> +        return _mfn(BAD_GFN_TO_MFN);
>> +    }
>>      if ( p2m_is_readonly(p2mt) )
>>      {
>> -        put_gfn(v->domain, gfn);
>> +        put_page(page);
>>          return _mfn(READONLY_GFN);
>>      }
>>      if ( !p2m_is_ram(p2mt) )
>>      {
>> -        put_gfn(v->domain, gfn);
>> +        put_page(page);
>>          return _mfn(BAD_GFN_TO_MFN);
>>      }
>> -
>> +    mfn = page_to_mfn(page);
>>      ASSERT(mfn_valid(mfn));
>> +
>>      v->arch.paging.last_write_was_pt = !!sh_mfn_is_a_page_table(mfn);
>> -    put_gfn(v->domain, gfn);
>> +    /* Note shadow cannot page out or unshare this mfn, so the map won't
>> +     * disappear. Otherwise, caller must hold onto page until done. */
>> +    put_page(page);
>>      return mfn;
>>  }
>>
>>> They're not addressed here but may be in a
>>> follow-up patch.
>>
>> Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>> or Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>?
>> see http://lists.xen.org/archives/html/xen-devel/2012-04/msg00962.html
Tim Deegan
2013-Feb-21 18:27 UTC
Re: [PATCH] x86/mm: Take the p2m lock even in shadow mode.
At 17:42 +0000 on 21 Feb (1361468573), Tim Deegan wrote:
> At 09:58 -0500 on 21 Feb (1361440689), Andres Lagar-Cavilla wrote:
> > > The remaining locked lookups are in sh_page_fault() (in a path that's
> > > almost always already serializing on the paging lock), and in
> > > emulate_map_dest() (which can probably be updated to use
> > > get_page_from_gfn()).
> > That is absolutely the case. Here is a rough patch:
>
> Applied, thanks.  I've taken your S-o-b below to apply to this patch
> too; hope that's OK.

Jan, this second patch isn't necessary for 4.2 -- the performance
regression it fixes isn't even measurable on 4-vcpu compile-tests.

Tim.