Jean Guyader
2011-Nov-10 08:43 UTC
[Xen-devel] [PATCH 0/6] IOMMU, vtd and iotlb flush rework (v5)
In one of my previous emails I detailed a bug I was seeing when passing through an Intel GPU to a guest that has more than 4GB of RAM. Allen suggested that I go for Plan B, but after a discussion with Tim we agreed that Plan B was far too disruptive in terms of code changes. This patch series implements Plan A.

http://xen.1045712.n5.nabble.com/VTD-Intel-iommu-IOTLB-flush-really-slow-td4952866.html

Changes between v4 and v5:
 - Fix hypercall continuation for add_to_physmap in compat mode.

Changes between v3 and v4:
 - Move the loop for gmfn_range from arch_memory_op to xenmem_add_to_physmap.
 - Add a comment to explain the purpose of iommu_dont_flush_iotlb.

Changes between v2 and v3:
 - Check for the presence of the iotlb_flush_all callback before calling it.

Changes between v1 and v2:
 - Move .size in struct xen_add_to_physmap into the padding between .domid and .space.
 - Store iommu_dont_flush per cpu.
 - Change the code in hvmloader to relocate in batches of 64K; .size is now 16 bits.

Jean Guyader (6):
  vtd: Refactor iotlb flush code
  iommu: Introduce iommu_flush and iommu_flush_all.
  add_to_physmap: Move the code for XENMEM_add_to_physmap.
  mm: New XENMEM space, XENMAPSPACE_gmfn_range
  hvmloader: Change memory relocation loop when overlap with PCI hole.
  Introduce per cpu flag (iommu_dont_flush_iotlb) to avoid unnecessary iotlb flush

 tools/firmware/hvmloader/pci.c      |   20 +++++--
 xen/arch/x86/mm.c                   |   82 ++++++++++++++++++++------
 xen/arch/x86/x86_64/compat/mm.c     |   10 ++++
 xen/drivers/passthrough/iommu.c     |   25 +++++++
 xen/drivers/passthrough/vtd/iommu.c |  100 ++++++++++++++++++---------------
 xen/include/public/memory.h         |    4 ++
 xen/include/xen/iommu.h             |   17 ++++++
 7 files changed, 192 insertions(+), 66 deletions(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Factor the iotlb flush code from map_page and unmap_page out into its own function.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Acked-by: Allen M Kay <allen.m.kay@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.c |   86 +++++++++++++++++-----------------
 1 files changed, 43 insertions(+), 43 deletions(-)
Jean Guyader
2011-Nov-10 08:44 UTC
[Xen-devel] [PATCH 2/6] iommu: Introduce iommu_flush and iommu_flush_all.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Acked-by: Allen M Kay <allen.m.kay@intel.com>
---
 xen/drivers/passthrough/iommu.c     |   20 ++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.c |   12 ++++++++++++
 xen/include/xen/iommu.h             |    5 +++++
 3 files changed, 37 insertions(+), 0 deletions(-)
Jean Guyader
2011-Nov-10 08:44 UTC
[Xen-devel] [PATCH 3/6] add_to_physmap: Move the code for XENMEM_add_to_physmap.
Move the code for the XENMEM_add_to_physmap case into its own function (xenmem_add_to_physmap).

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/mm.c |  188 ++++++++++++++++++++++++++++-------------------------
 1 files changed, 99 insertions(+), 89 deletions(-)
Jean Guyader
2011-Nov-10 08:44 UTC
[Xen-devel] [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
XENMAPSPACE_gmfn_range is like XENMAPSPACE_gmfn, but it operates on a range of pages. The size of the range is passed in a new 16-bit field, .size, placed in the padding between .domid and .space in struct xen_add_to_physmap so that the interface stays compatible with older versions.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 xen/arch/x86/mm.c               |  164 ++++++++++++++++++++++-----------------
 xen/arch/x86/x86_64/compat/mm.c |   10 +++
 xen/include/public/memory.h     |    4 +
 3 files changed, 108 insertions(+), 70 deletions(-)
Jean Guyader
2011-Nov-10 08:44 UTC
[Xen-devel] [PATCH 5/6] hvmloader: Change memory relocation loop when overlap with PCI hole.
Change the way we relocate memory pages when they overlap with the PCI hole: use the new map space (XENMAPSPACE_gmfn_range) to move the loop into Xen.

This code is usually triggered when a device is passed through to a guest and the PCI hole has to be extended to have enough room to map the device BARs. The PCI hole then starts lower and may overlap with some RAM that has been allocated for the guest; that usually happens if the guest has more than 4GB of RAM. We have to relocate those pages to high memory, otherwise they won't be accessible.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 tools/firmware/hvmloader/pci.c |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)
Jean Guyader
2011-Nov-10 08:44 UTC
[Xen-devel] [PATCH 6/6] Introduce per cpu flag (iommu_dont_flush_iotlb) to avoid unnecessary iotlb flush
Add a per-CPU flag that will be checked by the low-level IOMMU code to skip IOTLB flushes; iommu_iotlb_flush must then be called explicitly.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Acked-by: Allen M Kay <allen.m.kay@intel.com>
---
 xen/arch/x86/mm.c                   |   14 ++++++++++++++
 xen/drivers/passthrough/iommu.c     |    5 +++++
 xen/drivers/passthrough/vtd/iommu.c |    6 ++++--
 xen/include/xen/iommu.h             |   12 ++++++++++++
 4 files changed, 35 insertions(+), 2 deletions(-)
Jan Beulich
2011-Nov-10 09:54 UTC
[Xen-devel] Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
>>> On 10.11.11 at 09:44, Jean Guyader <jean.guyader@eu.citrix.com> wrote:

In the native implementation I neither see the XENMAPSPACE_gmfn_range case getting actually handled in the main switch (did you mean to change xatp.space to XENMAPSPACE_gmfn in that case?), nor do I see how you communicate back how many of the pages were successfully processed in the event of an error in the middle of the processing or when a continuation is required.

But with the patch being pretty hard to read, maybe I'm simply overlooking something?

Further (I realize I should have commented on this earlier) I think that in order to allow forward progress you should not check for preemption on the very first iteration of each (re-)invocation. That would also guarantee no behavioral change to the original single-page variants.

>--- a/xen/arch/x86/x86_64/compat/mm.c
>+++ b/xen/arch/x86/x86_64/compat/mm.c
>@@ -63,6 +63,16 @@ int compat_arch_memory_op(int op, XEN_GUEST_HANDLE(void) arg)
>
>         XLAT_add_to_physmap(nat, &cmp);
>         rc = arch_memory_op(op, guest_handle_from_ptr(nat, void));
>+        if ( rc < 0 )
>+            return rc;
>+
>+        if ( rc == __HYPERVISOR_memory_op )
>+            hypercall_xlat_continuation(NULL, 0x2, nat, arg);
>+
>+        XLAT_add_to_physmap(&cmp, nat);
>+
>+        if ( copy_to_guest(arg, &cmp, 1) )
>+            return -EFAULT;

Other than in the XENMEM_[gs]et_pod_target case you (so far, subject to the above comment resulting in a behavioral change) don't have any real outputs here, and hence there's no need to always do the outbound translation - i.e. all of this could be moved into the if ()'s body.

Jan

>
>         break;
>     }
Tim Deegan
2011-Nov-10 10:15 UTC
[Xen-devel] Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
At 09:54 +0000 on 10 Nov (1320918887), Jan Beulich wrote:
> >>> On 10.11.11 at 09:44, Jean Guyader <jean.guyader@eu.citrix.com> wrote:
>
> In the native implementation I neither see the XENMAPSPACE_gmfn_range
> case getting actually handled in the main switch (did you mean to change
> xatp.space to XENMAPSPACE_gmfn in that case?), nor do I see how you
> communicate back how many of the pages were successfully processed in
> the event of an error in the middle of the processing or when a
> continuation is required.
>
> But with the patch being pretty hard to read, maybe I'm simply
> overlooking something?

The patch changes the (compat-translated) hypercall arguments in place to reflect what's been done. Agreed that it's particularly hard to read, though.

Tim.
Jean Guyader
2011-Nov-10 10:18 UTC
[Xen-devel] Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
On 10/11 09:54, Jan Beulich wrote:
> In the native implementation I neither see the XENMAPSPACE_gmfn_range
> case getting actually handled in the main switch (did you mean to change
> xatp.space to XENMAPSPACE_gmfn in that case?), nor do I see how you
> communicate back how many of the pages were successfully processed in
> the event of an error in the middle of the processing or when a
> continuation is required.

Yes, that is true.

> But with the patch being pretty hard to read, maybe I'm simply
> overlooking something?

Sorry about that; moving code around doesn't really look good in patches.

I have now changed xenmem_add_to_physmap to take a pointer to xatp so I can modify the argument directly (decrementing size and incrementing gpfn and idx). Then, if xenmem_add_to_physmap returns -EAGAIN, I copy xatp back to the guest handle and create the native continuation. Also, a case was missing for XENMAPSPACE_gmfn_range in xenmem_add_to_physmap.

> Further (I realize I should have commented on this earlier) I think that
> in order to allow forward progress you should not check for preemption
> on the very first iteration of each (re-)invocation. That would also
> guarantee no behavioral change to the original single-page variants.

Agreed; I now only check for preemption in the case of the new space (XENMAPSPACE_gmfn_range), so the original behavior should be preserved.

> Other than in the XENMEM_[gs]et_pod_target case you don't have any real
> outputs here, and hence there's no need to always do the outbound
> translation - i.e. all of this could be moved into the if ()'s body.

Done.

Thanks for the review,
Jean
Jean Guyader
2011-Nov-10 10:20 UTC
[Xen-devel] Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
On 10/11 10:15, Tim Deegan wrote:
> The patch changes the (compat-translated) hypercall arguments in place to
> reflect what's been done. Agreed that it's particularly hard to read,
> though.

Actually no, because I'm not using the pointer in xenmem_add_to_physmap. That will be part of the next series, and that will make the patch even harder to read :(...

Jean
Keir Fraser
2011-Nov-10 10:21 UTC
[Xen-devel] Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
On 10/11/2011 09:54, "Jan Beulich" <JBeulich@suse.com> wrote:
> Further (I realize I should have commented on this earlier) I think that
> in order to allow forward progress you should not check for preemption
> on the very first iteration of each (re-)invocation. That would also
> guarantee no behavioral change to the original single-page variants.

There are plenty of other examples where we check for preemption before doing any real work (e.g. do_mmuext_op, do_mmu_update). I guess checking at the end of the loop is a little bit better, maybe. I'm not very bothered either way.

 -- Keir
Jan Beulich
2011-Nov-10 10:48 UTC
[Xen-devel] Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
>>> On 10.11.11 at 11:21, Keir Fraser <keir@xen.org> wrote:
> On 10/11/2011 09:54, "Jan Beulich" <JBeulich@suse.com> wrote:
>> Further (I realize I should have commented on this earlier) I think that
>> in order to allow forward progress you should not check for preemption
>> on the very first iteration of each (re-)invocation. That would also
>> guarantee no behavioral change to the original single-page variants.
>
> There are plenty of other examples where we check for preemption before
> doing any real work (eg. do_mmuext_op, do_mmu_update).

And I never really liked that, because of the very potential to live lock.

> I guess checking at
> the end of the loop is a little bit better maybe. I'm not very bothered
> either way.

Yes, and then only if further iterations are needed.

Jan