Han, Weidong
2008-Jul-23 09:04 UTC
[Xen-devel] Dom0 hypercall for adding and removing PCI devices
Dom0 now uses hypercalls to add and remove PCI devices. pci_bus_probe_wrapper() first adds the device, then calls pci_bus_probe() to probe it, and removes the device if the probe fails. This approach has the advantage that only working devices (those for which pci_bus_probe() succeeds) are assigned to dom0, but it overlooks RMRRs. The BIOS uses RMRRs while dom0 is booting; if the RMRRs are not mapped, the system hangs. There are two options:

1) Add a check in domain_context_unmap_one() so that a device is not removed from dom0 if it has an RMRR. This check was added yesterday, but it is not clean enough: the device is not assigned to dom0, yet it is still mapped in the dom0 VT-d page table.

2) Establish a separate RMRR page table. If a device with an RMRR is removed from dom0, unmap it from the dom0 VT-d page table and map it into the separate RMRR page table instead. This solution is clean, but it introduces a new VT-d page table; currently each domain has only one VT-d page table.

What are your opinions?

Randy (Weidong)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
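The add/probe/remove flow and the option 1 check described above can be modelled roughly as follows. This is only a sketch under stated assumptions: the struct, its fields, and every `sketch_` function are hypothetical names invented for illustration, not the actual Xen or Linux code, and the probe result is a plain flag standing in for pci_bus_probe().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration only; none of these names exist
 * in Xen or Linux. */
struct sketch_dev {
    bool has_rmrr;       /* device is listed in an RMRR entry of the DMAR table */
    bool probe_ok;       /* stand-in for the result of pci_bus_probe() */
    bool assigned_dom0;  /* device is logically owned by dom0 */
    bool mapped_dom0;    /* device still translates through dom0's VT-d table */
};

/* Models the "add device" hypercall: assign to dom0 and map its context. */
void sketch_add_device(struct sketch_dev *d)
{
    assert(d != NULL);
    d->assigned_dom0 = true;
    d->mapped_dom0 = true;
}

/* Models the "remove device" hypercall with the option 1 check:
 * a device with an RMRR loses its assignment but keeps its mapping. */
void sketch_remove_device(struct sketch_dev *d)
{
    assert(d != NULL);
    d->assigned_dom0 = false;
    if (d->has_rmrr)
        return;              /* option 1: leave the RMRR reachable */
    d->mapped_dom0 = false;
}

/* Models the wrapper: add the device, probe it, remove it on failure. */
bool sketch_probe_wrapper(struct sketch_dev *d)
{
    sketch_add_device(d);
    if (d->probe_ok)
        return true;
    sketch_remove_device(d);
    return false;
}
```

With this model, a device that has an RMRR and fails the probe ends up unassigned but still mapped in dom0, which is the inconsistency option 1 is criticised for.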
Keir Fraser
2008-Jul-23 09:10 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
On 23/7/08 10:04, "Han, Weidong" <weidong.han@intel.com> wrote:

> 1) Add a check in domain_context_unmap_one(), don't remove the device
> from dom0 if it has RMRR. This check is added yesterday. But it's not
> clean enough. The device is not assigned to dom0, while it is mapped in
> dom0 VT-d page table.
>
> 2) Establish a separate RMRR page table. If the device with RMRR is
> removed from dom0, unmap it from dom0 VT-d page table, instead map it to
> the separate RMRR page table. This solution is clean, but it introduces
> a new VT-d page table. Currently each domain has only one VT-d page
> table.
>
> What's your opinions?

So this would be one extra VT-d pagetable, for the whole system, which would be the fallback location for RMRR mappings for devices which are currently not assigned to any domain? Thus allowing firmware to successfully initiate DMA operations on those devices? Sounds sensible.

 -- Keir
Han, Weidong
2008-Jul-23 09:26 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
Keir Fraser wrote:
> So this would be one extra VT-d pagetable, for the whole system,
> which would be the fallback location for RMRR mappings for devices
> which are currently not assigned to any domain? Thus allowing
> firmware to successfully initiate DMA operations on those devices?
> Sounds sensible.

Is it possible that idle_domain owns the RMRR VT-d page table?

Randy (Weidong)
Keir Fraser
2008-Jul-23 09:28 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
On 23/7/08 10:26, "Han, Weidong" <weidong.han@intel.com> wrote:

>> So this would be one extra VT-d pagetable, for the whole system,
>> which would be the fallback location for RMRR mappings for devices
>> which are currently not assigned to any domain? Thus allowing
>> firmware to successfully initiate DMA operations on those devices?
>> Sounds sensible.
>
> Is it possible that idle_domain owns the RMRR VT-d page table?

If that's a convenient place to stash it then why not? Either way, seems you're going to have it special-cased in the code as fallback owner for unassigned devices. It's possible that having it stashed in the idle domain will simply make the code more confusing. I'm not sure though.

 -- Keir
Espen Skoglund
2008-Jul-23 18:07 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
[Keir Fraser]
> On 23/7/08 10:26, "Han, Weidong" <weidong.han@intel.com> wrote:
>>> So this would be one extra VT-d pagetable, for the whole system,
>>> which would be the fallback location for RMRR mappings for devices
>>> which are currently not assigned to any domain? Thus allowing
>>> firmware to successfully initiate DMA operations on those devices?
>>> Sounds sensible.

Well, the VT-d page table for RMRR devices need not contain the whole system memory---only those regions specified in the DMAR tables.

>> Is it possible that idle_domain owns the RMRR VT-d page table?

> If that's a convenient place to stash it then why not? Either way,
> seems you're going to have it special-cased in the code as fallback
> owner for unassigned devices. It's possible that having it stashed
> in the idle domain will simply make the code more confusing. I'm not
> sure though.

Right. I don't see any particularly good reason to associate it with the idle domain. As noted above, the page table need not even cover the whole memory, and it will never change after being set up at boot time. If special-case code is needed anyway, then one might as well install a custom VT-d page table.

If supported by hardware, the extra page tables can even be skipped altogether and the device marked as having passthrough access. That would give the RMRR device complete access to system memory, though.

	eSk
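The point that the fallback table only needs to cover the regions listed in the DMAR tables can be illustrated with a small range check. This is a sketch, not Xen code: `struct sketch_rmrr` and `sketch_rmrr_table_maps()` are hypothetical names, and the model simply treats each RMRR as an inclusive address range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type; the real DMAR parsing structures are not shown
 * in this thread. */
struct sketch_rmrr {
    uint64_t base;  /* first byte of the reserved region */
    uint64_t end;   /* last byte of the reserved region (inclusive) */
};

/* Returns true if a DMA to addr would find a mapping in a table that
 * contains only the given RMRR regions and nothing else. */
bool sketch_rmrr_table_maps(const struct sketch_rmrr *regions, size_t count,
                            uint64_t addr)
{
    assert(regions != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= regions[i].base && addr <= regions[i].end)
            return true;
    }
    return false;  /* everything outside the RMRRs stays unmapped */
}
```

A passthrough entry, by contrast, would behave as if this check always returned true, which is the trade-off mentioned at the end of the message.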
Han, Weidong
2008-Jul-24 08:20 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
I have another concern: while the BIOS is performing DMA through an RMRR, can we switch the device from the dom0 VT-d page table to the RMRR VT-d page table? Would that cause DMA failures?

Randy (Weidong)

Han, Weidong wrote:
> Espen Skoglund wrote:
>> [Keir Fraser]
>>> On 23/7/08 10:26, "Han, Weidong" <weidong.han@intel.com> wrote:
>>>>> So this would be one extra VT-d pagetable, for the whole system,
>>>>> which would be the fallback location for RMRR mappings for devices
>>>>> which are currently not assigned to any domain? Thus allowing
>>>>> firmware to successfully initiate DMA operations on those devices?
>>>>> Sounds sensible.
>>
>> Well, the VT-d page table for RMRR devices need not contain the whole
>> system memory---only those regions specified in the DMAR tables.
>>
>>>> Is it possible that idle_domain owns the RMRR VT-d page table?
>>
>>> If that's a convenient place to stash it then why not? Either way,
>>> seems you're going to have it special-cased in the code as fallback
>>> owner for unassigned devices. It's possible that having it stashed
>>> in the idle domain will simply make the code more confusing. I'm not
>>> sure though.
>>
>> Right. I don't see any particular good reason to associate it with
>> the idle domain. As noted above, the page table need not even cover
>> the whole memory, and it will never change after being set up at boot
>> time. If special case code is needed anyway, then one might as well
>> install a custom VT-d page table.
>
> What does "custom VT-d page table" mean?
>
> Because the domain id is not the same as the DID field in the context
> entry, we can reserve a DID for the RMRR VT-d page table, which avoids
> associating it with the idle domain.
>
> Currently we reassign the device from dom0 to the target domain when
> assigning a device, and return the device to dom0 when the target
> domain is torn down. That is not correct, because some devices may not
> be assigned to any domain. The current device_assigned() also needs to
> be changed. Maybe it needs more changes in VT-d.
>
> I have some concerns; maybe I missed something. Why did you use the
> dom0 hypercall approach to replace the original method? What's the
> main benefit? I also wonder whether it's suitable to wrap the
> pci_bus_probe() function.
>
> Randy (Weidong)
>
>> If supported by hardware, the extra page tables can even be skipped
>> altogether and the device marked as having passthrough access. That
>> would give the RMRR device complete access to system memory, though.
>>
>> eSk
Keir Fraser
2008-Jul-24 08:23 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
If a device is assigned to a domain (in this case dom0) then that domain's VT-d pagetables will contain the RMRR mappings for that device. Hence BIOS can perform DMA in those RMRR-indicated regions without swapping to and fro between domain tables and fallback RMRR tables. The new fallback RMRR table would be just that -- a fallback table used for any device not currently assigned to any domain (and hence those devices should only have DMAs initiated by the BIOS, if at all).

 -- Keir

On 24/7/08 09:20, "Han, Weidong" <weidong.han@intel.com> wrote:

> I have another concern, when BIOS is initiating DMA operation using
> RMRR, can we use RMRR VT-d page table to replace dom0 VT-d page table?
> Does it result in some DMA failures?
>
> Randy (weidong)
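The fallback rule described in this reply (use the owning domain's table if the device is assigned, otherwise the single system-wide RMRR table) could be sketched like this. All types and names here are hypothetical and chosen for illustration; they do not correspond to existing Xen structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a VT-d page table, a domain, and a PCI
 * device; none of these are real Xen types. */
struct sketch_pgtable { const char *name; };
struct sketch_domain  { struct sketch_pgtable vtd_table; };
struct sketch_pdev    { struct sketch_domain *owner; };  /* NULL = unassigned */

/* The one extra table for the whole system, holding only RMRR mappings. */
struct sketch_pgtable sketch_rmrr_fallback = { "rmrr-fallback" };

/* Picks the table a device's context entry should point at. */
struct sketch_pgtable *sketch_table_for(struct sketch_pdev *pdev)
{
    assert(pdev != NULL);
    if (pdev->owner != NULL)
        return &pdev->owner->vtd_table;  /* assigned: domain table has its RMRRs */
    return &sketch_rmrr_fallback;        /* unassigned: fallback table */
}
```

Keeping the fallback as a standalone object, rather than hanging it off the idle domain, is one way to make the special case explicit, which is the concern raised earlier in the thread.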
Han, Weidong
2008-Jul-24 08:32 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
We found that a USB device (which has an RMRR) was first assigned to dom0, but pci_bus_probe() failed, so it was removed from dom0. The removal requires switching to the RMRR VT-d page table, and at that moment the BIOS was using the device's RMRR.

Randy (Weidong)

Keir Fraser wrote:
> If a device is assigned to a domain (in this case dom0) then that
> domain's VT-d pagetables will contain the RMRR mappings for that
> device. Hence BIOS can perform DMA in those RMRR-indicated regions
> without swapping to and fro between domain tables and fallback RMRR
> tables. The new fallback RMRR table would be just that -- a fallback
> table used for any device not currently assigned to any domain (and
> hence those devices should only have DMAs initiated by the BIOS, if
> at all).
>
> -- Keir
Keir Fraser
2008-Jul-24 08:37 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
Can the pagetable switch not be made atomic (i.e., so that the RMRR regions appear continuously available throughout)? I'd have thought that would just naturally happen.

If creating/destroying RMRR mappings is part of the switch operation, you'd have to destroy RMRR mappings in the dom0 table after the switch, and create RMRR mappings in the fallback table before the switch. Or just have RMRR mappings always mapped into all tables.

 -- Keir

On 24/7/08 09:32, "Han, Weidong" <weidong.han@intel.com> wrote:

> We found USB (has RMRR) is firstly assigned to dom0, but pci_bus_probe()
> failed, then it was removed from dom0. The removing needs switch to RMRR
> VT-d page table. At the same time, BIOS was using its RMRR.
>
> Randy (Weidong)
Tian, Kevin
2008-Jul-24 08:43 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
Isn't it enough to first switch the VT-d page table, and then flush the IOTLB? As long as the RMRRs are kept the same in the two VT-d tables, and at any time a valid entry (either in the IOTLB or by walking) can be found, the above sequence seems complete.

Thanks,
Kevin

>From: Keir Fraser
>Sent: 24 July 2008 16:38
>
>Can the pagetable switch not be made atomic (i.e., so that the RMRR
>regions appear continuously available throughout)? I'd have thought
>that would just naturally happen.
>
>If creating/destroying RMRR mappings is part of the switch operation,
>you'd have to destroy RMRR mappings in the dom0 table after the
>switch, and create RMRR mappings in the fallback table before the
>switch. Or just have RMRR mappings always mapped into all tables.
>
> -- Keir
Keir Fraser
2008-Jul-24 08:47 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
Exactly my thought.

 K.

On 24/7/08 09:43, "Tian, Kevin" <kevin.tian@intel.com> wrote:

> Isn't enough to first switch VT-d page table, and then flush IOTLB?
> As long as RMRRs are kept same in two VT-d tables, and in any
> time a valid entry (either in IOTLB or by walking) can be found,
> above sequence seems complete.
>
> Thanks,
> Kevin
Han, Weidong
2008-Jul-24 09:14 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
The VT-d spec says: Software must not modify fields other than the Present (P) field of currently present root-entries or context-entries. If modifications to other fields are required, software must first make these entries not-present (P=0), which requires serial invalidation of context-cache and IOTLB, and then transition them to present (P=1) state along with the modifications.

So your suggestion is not feasible.

Randy (Weidong)

Keir Fraser wrote:
> Exactly my thought.
>
> K.
>
> On 24/7/08 09:43, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>
>> Isn't enough to first switch VT-d page table, and then flush IOTLB? As long as RMRRs are kept same in two VT-d tables, and in any time a valid entry (either in IOTLB or by walking) can be found, above sequence seems complete.
>>
>> Thanks,
>> Kevin
>>
>>> From: Keir Fraser
>>> Sent: 24 July 2008 16:38
>>>
>>> Can the pagetable switch not be made atomic (i.e., so that the RMRR regions appear continuously available throughout)? I'd have thought that would just naturally happen.
>>>
>>> If creating/destroying RMRR mappings is part of the switch operation, you'd have to destroy RMRR mappings in the dom0 table after the switch, and create RMRR mappings in the fallback table before the switch. Or just have RMRR mappings always mapped into all tables.
>>>
>>> -- Keir
>>>
>>> On 24/7/08 09:32, "Han, Weidong" <weidong.han@intel.com> wrote:
>>>
>>>> We found USB (has RMRR) is firstly assigned to dom0, but pci_bus_probe() failed, then it was removed from dom0. The removing needs switch to RMRR VT-d page table. At the same time, BIOS was using its RMRR.
>>>>
>>>> Randy (Weidong)
>>>>
>>>> Keir Fraser wrote:
>>>>> If a device is assigned to a domain (in this case dom0) then that domain's VT-d pagetables will contain the RMRR mappings for that device. Hence BIOS can perform DMA in those RMRR-indicated regions without swapping to and fro between domain tables and fallback RMRR tables. The new fallback RMRR table would be just that -- a fallback table used for any device not currently assigned to any domain (and hence those devices should only have DMAs initiated by the BIOS, if at all).
>>>>>
>>>>> -- Keir
>>>>>
>>>>> On 24/7/08 09:20, "Han, Weidong" <weidong.han@intel.com> wrote:
>>>>>
>>>>>> I have another concern, when BIOS is initiating DMA operation using RMRR, can we use RMRR VT-d page table to replace dom0 VT-d page table? Does it result in some DMA failures?
>>>>>>
>>>>>> Randy (weidong)
>>>>>>
>>>>>> Han, Weidong wrote:
>>>>>>> Espen Skoglund wrote:
>>>>>>>> [Keir Fraser]
>>>>>>>>> On 23/7/08 10:26, "Han, Weidong" <weidong.han@intel.com> wrote:
>>>>>>>>>>> So this would be one extra VT-d pagetable, for the whole system, which would be the fallback location for RMRR mappings for devices which are currently not assigned to any domain? Thus allowing firmware to successfully initiate DMA operations on those devices? Sounds sensible.
>>>>>>>>
>>>>>>>> Well, the VT-d page table for RMRR devices need not contain the whole system memory---only those regions specified in the DMAR tables.
>>>>>>>>
>>>>>>>>>> Is it possible that idle_domain owns the RMRR VT-d page table?
>>>>>>>>
>>>>>>>>> If that's a convenient place to stash it then why not? Either way, seems you're going to have it special-cased in the code as fallback owner for unassigned devices. It's possible that having it stashed in the idle domain will simply make the code more confusing. I'm not sure though.
>>>>>>>>
>>>>>>>> Right. I don't see any particular good reason to associate it with the idle domain. As noted above, the page table need not even cover the whole memory, and it will never change after being set up at boot time. If special case code is needed anyway, then one might as well install a custom VT-d page table.
>>>>>>>
>>>>>>> What does "custom VT-d page table" mean?
>>>>>>>
>>>>>>> Due to domain id is not the same with DID field in context, we can reverse a DID for RMRR VT-d page table, it can avoid to associate with idle domain.
>>>>>>>
>>>>>>> Currently we reassign the device from dom0 to target domain when assign a device, and return the device to dom0 when target domain tears down. It's not correct due to some devices may be not assigned to any domain. Current device_assigned() also needs to be changed. Maybe it needs more changes on VT-d.
>>>>>>>
>>>>>>> I have some concerns, maybe I missed something. Why did you use dom0 hypercall approach to replace original method? What's the main benefit? I also wonder it's suitable to wrap pci_bus_probe() function.
>>>>>>>
>>>>>>> Randy (Weidong)
>>>>>>>
>>>>>>>> If supported by hardware, the extra page tables can even be skipped altogether and the device marked as having passthrough access. That would give the RMRR device complete access to system memory, though.
>>>>>>>>
>>>>>>>> eSk
Keir Fraser
2008-Jul-24 09:27 UTC
Re: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
Then what you do already (leave the device assigned to dom0) is the best you can do. That's easy.

-- Keir

On 24/7/08 10:14, "Han, Weidong" <weidong.han@intel.com> wrote:

> VT-d spec says: Software must not modify fields other than the Present (P) field of currently present root-entries or context-entries. If modifications to other fields are required, software must first make these entries not-present (P=0), which requires serial invalidation of context-cache and IOTLB, and then transition them to present (P=1) state along with the modifications.
>
> So your suggestion is not feasible.
>
> Randy (Weidong)
>
> Keir Fraser wrote:
>> Exactly my thought.
>>
>> K.
>>
>> On 24/7/08 09:43, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>>
>>> Isn't enough to first switch VT-d page table, and then flush IOTLB? As long as RMRRs are kept same in two VT-d tables, and in any time a valid entry (either in IOTLB or by walking) can be found, above sequence seems complete.
>>>
>>> Thanks,
>>> Kevin
>>>
>>>> From: Keir Fraser
>>>> Sent: 24 July 2008 16:38
>>>>
>>>> Can the pagetable switch not be made atomic (i.e., so that the RMRR regions appear continuously available throughout)? I'd have thought that would just naturally happen.
>>>>
>>>> If creating/destroying RMRR mappings is part of the switch operation, you'd have to destroy RMRR mappings in the dom0 table after the switch, and create RMRR mappings in the fallback table before the switch. Or just have RMRR mappings always mapped into all tables.
>>>>
>>>> -- Keir
>>>>
>>>> On 24/7/08 09:32, "Han, Weidong" <weidong.han@intel.com> wrote:
>>>>
>>>>> We found USB (has RMRR) is firstly assigned to dom0, but pci_bus_probe() failed, then it was removed from dom0. The removing needs switch to RMRR VT-d page table. At the same time, BIOS was using its RMRR.
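The option Keir settles on in this message (leave a device assigned to dom0 when it has an RMRR, rather than unmapping it) amounts to a guard of roughly the following shape. This is a sketch only: `struct rmrr_entry`, `device_has_rmrr()` and `may_unmap_from_dom0()` are hypothetical names, and the table is a flattened simplification with one device per reserved region. It is not the actual check in domain_context_unmap_one().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of one RMRR: a single device per region. */
struct rmrr_entry {
    uint8_t  bus;
    uint8_t  devfn;
    uint64_t base;  /* first byte of the reserved region */
    uint64_t end;   /* last byte of the reserved region */
};

/* Does any reserved region in the table belong to this device? */
static bool device_has_rmrr(const struct rmrr_entry *tbl, size_t n,
                            uint8_t bus, uint8_t devfn)
{
    assert(tbl != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].bus == bus && tbl[i].devfn == devfn)
            return true;
    }
    return false;
}

/*
 * Guard for the unmap path: a device that owns an RMRR keeps its dom0
 * context mapping, so firmware DMA into the region keeps translating.
 */
static bool may_unmap_from_dom0(const struct rmrr_entry *tbl, size_t n,
                                uint8_t bus, uint8_t devfn)
{
    return !device_has_rmrr(tbl, n, bus, devfn);
}
```

The cost noted at the start of the thread still applies: a device kept this way is mapped in the dom0 VT-d page table without being assigned to dom0.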
Espen Skoglund
2008-Jul-24 14:16 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
Except, the RMRR mappings should be the same in both the old and the new VT-d tables. The fields in the page tables would not change --- only the context entry (and the location of the VT-d page tables).

I haven't got the VT-d spec in front of me, but the quote below seems to suggest that one can not directly reassign a device to another domain. One would first have to mark it as non present in the context table before reassigning it. Can someone from Intel confirm whether this is the case or not?

eSk

[Weidong Han]
> VT-d spec says: Software must not modify fields other than the Present (P) field of currently present root-entries or context-entries. If modifications to other fields are required, software must first make these entries not-present (P=0), which requires serial invalidation of context-cache and IOTLB, and then transition them to present (P=1) state along with the modifications.
>
> So your suggestion is not feasible.
>
> Randy (Weidong)
>
> Keir Fraser wrote:
>> Exactly my thought.
>>
>> K.
>>
>> On 24/7/08 09:43, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>>
>>> Isn't enough to first switch VT-d page table, and then flush IOTLB? As long as RMRRs are kept same in two VT-d tables, and in any time a valid entry (either in IOTLB or by walking) can be found, above sequence seems complete.
Tian, Kevin
2008-Jul-24 14:47 UTC
RE: [Xen-devel] Dom0 hypercall for adding and removing PCI devices
I took a look, and it seems that the spec does say so:

----
Modifying Root and Context Entries

When DMA-remapping hardware is active:
* Software must serially invalidate the context-cache and the IOTLB when updating root-entries or context-entries. The serialization is required since hardware may utilize information from the context-caches to tag new entries inserted to the IOTLB while processing in-flight DMA requests.
* Software must not modify fields other than the Present (P) field of currently present root-entries or context-entries. If modifications to other fields are required, software must first make these entries not-present (P=0), which requires serial invalidation of context-cache and IOTLB, and then transition them to present (P=1) state along with the modifications.
----

I guess the RMRR mapping may be OK even when violating the spec, since even if a stale entry is used, it serves the same mapping. But since the spec explicitly states it, I agree with Keir that the current solution is easiest.

For normal device re-assignment, that's why some reset action is required before re-assignment, like FLR, etc., as discussed in the other thread from Dexuan.

Thanks,
Kevin

>From: Espen Skoglund
>Sent: 24 July 2008 22:17
>
>Except, the RMRR mappings should be the same in both the old and the new VT-d tables. The fields in the page tables would not change --- only the context entry (and the location of the VT-d page tables).
>
>I haven't got the VT-d spec in front of me, but the quote below seems to suggest that one can not directly reassign a device to another domain. One would first have to mark it as non present in the context table before reassigning it. Can someone from Intel confirm whether this is the case or not?
>
> eSk
>
>[Weidong Han]
>> VT-d spec says: Software must not modify fields other than the Present (P) field of currently present root-entries or context-entries. If modifications to other fields are required, software must first make these entries not-present (P=0), which requires serial invalidation of context-cache and IOTLB, and then transition them to present (P=1) state along with the modifications.
>>
>> So your suggestion is not feasible.
>>
>> Randy (Weidong)
>>
>> Keir Fraser wrote:
>>> Exactly my thought.
>>>
>>> K.
>>>
>>> On 24/7/08 09:43, "Tian, Kevin" <kevin.tian@intel.com> wrote:
>>>
>>>> Isn't enough to first switch VT-d page table, and then flush IOTLB? As long as RMRRs are kept same in two VT-d tables, and in any time a valid entry (either in IOTLB or by walking) can be found, above sequence seems complete.
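The guess above, that a stale entry is harmless because it serves the same mapping, rests on the RMRR range translating identically through both tables. A small check of that precondition might look like the sketch below. Everything here is an assumption for illustration: `struct io_mapping` is a flat list standing in for a multi-level VT-d page table, and `io_lookup()` and `rmrr_same_in_both()` are hypothetical helpers. Passing this check would not make an in-place context-entry update conform to the spec text quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat stand-in for a VT-d page table: one entry per page. */
struct io_mapping {
    uint64_t iova_pfn;  /* page frame as seen by the device */
    uint64_t host_pfn;  /* page frame it translates to */
};

/* Linear lookup; returns false when the page is not mapped. */
static bool io_lookup(const struct io_mapping *t, size_t n,
                      uint64_t iova_pfn, uint64_t *host_pfn)
{
    for (size_t i = 0; i < n; i++) {
        if (t[i].iova_pfn == iova_pfn) {
            *host_pfn = t[i].host_pfn;
            return true;
        }
    }
    return false;
}

/*
 * True when every page in [first_pfn, last_pfn] is mapped in both tables
 * and translates to the same host page, i.e. a translation cached from
 * table a gives the same result as a fresh walk of table b.
 */
static bool rmrr_same_in_both(const struct io_mapping *a, size_t na,
                              const struct io_mapping *b, size_t nb,
                              uint64_t first_pfn, uint64_t last_pfn)
{
    assert(first_pfn <= last_pfn);

    for (uint64_t pfn = first_pfn; ; pfn++) {
        uint64_t ha, hb;

        if (!io_lookup(a, na, pfn, &ha) ||
            !io_lookup(b, nb, pfn, &hb) ||
            ha != hb)
            return false;
        if (pfn == last_pfn)  /* compare before incrementing to avoid wrap */
            break;
    }
    return true;
}
```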