I've been looking through the VT-d code trying to get a better grip on
what's going on internally, and I've got some questions regarding VT-d
translation faults.

o Currently all VT-d faults are handled in the iommu_page_fault()
  handler. This is kind of a misnomer, since the fault handler must
  also be able to handle interrupt remapping faults and faults
  related to lookups for the context entry. I assume that this
  naming is just temporary?

o The fault handler doesn't actually do much right now. It just
  clears out the fault queue and prints out warnings. I can only
  suspect that some more code to handle faults more gracefully is
  somewhere in the pipeline.

The question is what the plans are for dealing with DMA translation
faults (i.e., faults due to accessing unmapped memory or writing to
read-only mappings). At the very least, the associated driver should
have the possibility to somehow be notified about failed transactions
due to translation faults. Is something like this being planned?

	eSk

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Espen Skoglund wrote:
> I've been looking through the VT-d code trying to get a better grip on
> what's going on internally, and I've got some questions regarding VT-d
> translation faults.
>
> o Currently all VT-d faults are handled in the iommu_page_fault()
>   handler. This is kind of a misnomer since the fault handler must
>   also be able to handle interrupt remapping faults and faults
>   related to lookups for the context entry. I assume that this
>   naming is just temporary?
>

I agree iommu_page_fault() is kind of a misnomer.

> o The fault handler doesn't actually do much right now. It just
>   clears out the fault queue and prints out warnings. I can only
>   suspect that some more code to handle faults more gracefully is
>   somewhere in the pipeline.
>
> The question is what the plans for dealing with DMA translation faults
> are (i.e., due to accessing unmapped memory or writing to read-only
> mappings). At the very least the associated driver should have the
> possibility to somehow be notified about failed transactions due to
> translation faults. Is something like this being planned for?
>

Please refer to section 3.5 of the VT-d spec. DMA requests that result
in remapping faults must be blocked by hardware. The exact method of
DMA blocking is implementation-specific. Faulting DMA write/read
requests may be handled in much the same way as hardware handles write
requests to non-existent memory. So I think our fault handler, which
clears the fault queue and prints out warnings, is enough.

Randy (Weidong)
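For reference, the behaviour discussed in this exchange (drain the pending fault records, print a warning for each, and nothing more) can be modelled roughly as below. This is a minimal sketch, not the Xen code: `struct fault_record`, `classify_fault()` and `drain_fault_records()` are hypothetical names, the record is a simplified software model rather than the hardware register layout, and the split at reason code 0x20 between DMA-remapping and interrupt-remapping faults is an assumption based on the fault reason encodings in the VT-d spec of this period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fault classes; the names are illustrative, not Xen's. */
enum fault_kind { FAULT_DMA_REMAP, FAULT_INTR_REMAP };

/* Simplified software model of one fault record (not the hardware layout). */
struct fault_record {
    int      pending;    /* models the per-record "fault pending" bit */
    uint16_t source_id;  /* requester ID: bus[15:8], device[7:3], function[2:0] */
    uint8_t  reason;     /* fault reason code */
    uint64_t addr;       /* faulting address */
};

/* Assumption: reason codes of 0x20 and above are interrupt-remapping faults. */
static enum fault_kind classify_fault(uint8_t reason)
{
    return reason >= 0x20 ? FAULT_INTR_REMAP : FAULT_DMA_REMAP;
}

/* Log and clear every pending record; returns the number of records handled. */
static size_t drain_fault_records(struct fault_record *recs, size_t n)
{
    size_t handled = 0;

    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!recs[i].pending)
            continue;
        printf("%s fault: source %02x:%02x.%x reason 0x%02x addr 0x%llx\n",
               classify_fault(recs[i].reason) == FAULT_INTR_REMAP
                   ? "interrupt-remapping" : "DMA-remapping",
               (unsigned)(recs[i].source_id >> 8),
               (unsigned)((recs[i].source_id >> 3) & 0x1f),
               (unsigned)(recs[i].source_id & 0x7),
               (unsigned)recs[i].reason,
               (unsigned long long)recs[i].addr);
        recs[i].pending = 0;  /* clearing the record is all that is done */
        handled++;
    }
    return handled;
}
```

Note that nothing here reports the fault back to the driver or the device's owner; that missing step is exactly what the original question is about.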
[Weidong Han]
> Espen Skoglund wrote:
>> o The fault handler doesn't actually do much right now. It just
>>   clears out the fault queue and prints out warnings. I can only
>>   suspect that some more code to handle faults more gracefully is
>>   somewhere in the pipeline.
>>
>> The question is what the plans for dealing with DMA translation
>> faults are (i.e., due to accessing unmapped memory or writing to
>> read-only mappings). At the very least the associated driver
>> should have the possibility to somehow be notified about failed
>> transactions due to translation faults. Is something like this
>> being planned for?
>>

> Please refer to section 3.5 of the VT-d spec. DMA requests that result
> in remapping faults must be blocked by hardware. The exact method of
> DMA blocking is implementation-specific. Faulting DMA write / read
> requests may be handled in much the same way as hardware handles
> write requests to non-existent memory. So I think our fault handler
> that clears fault queue and prints out warnings is enough.

Yes, I'm aware that faulting DMA remapping transactions are blocked by
the hardware. However, I'm concerned that faulting transactions can
cause malfunctions if not detected by the system. Sure, for read
requests on PCIe the device will notice the error in the completion
status and can handle it appropriately. But how about write requests?
(I don't know if a PCIe device will be notified of memory write
failures.) And how about requests from non-PCIe devices?

It may be the case that all DMA translation faults are entirely the
result of device drivers having screwed up and shot themselves in the
foot, and that they should be allowed to repeatedly do so until their
leg falls off. However, I'm not convinced that faulting DMA
translations are always caused by faulty drivers, in particular if
one starts allowing frontends to talk more directly to the device.

If DMA translation faults can always safely be ignored then I'm happy
to accept that. On the other hand, a device repeatedly raising
translation faults is surely doing something wrong, and bringing it
down might be a good idea.

Anyhow, if all DMA translation faults can be ignored, why then bother
enabling faults in the context entries in the first place?

	eSk
Given that we are mapping the entire guest memory in VT-d page tables
in Xen, it is usually not the fault of the guest drivers when VT-d
faults happen - assuming the drivers are not malicious and are well
tested on bare metal. With this in mind, VT-d fault messages are
currently used to debug the VT-d code in the Xen VMM.

However, it is possible that a guest driver might be malicious or have
bugs that got exposed in the Xen VMM environment. In these cases, it
might be appropriate to have a policy such as terminating the guest
after it generates 1000 VT-d faults.

Do you have any other suggestions on how this can best be handled?

Allen

>-----Original Message-----
>From: xen-devel-bounces@lists.xensource.com
>[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of
>Espen Skoglund
>Sent: Friday, March 07, 2008 8:20 AM
>To: Han, Weidong
>Cc: xen-devel@lists.xensource.com
>Subject: RE: [Xen-devel] Handling VT-d translation faults
>
>[Weidong Han]
>> Espen Skoglund wrote:
>>> o The fault handler doesn't actually do much right now. It just
>>>   clears out the fault queue and prints out warnings. I can only
>>>   suspect that some more code to handle faults more gracefully is
>>>   somewhere in the pipeline.
>>>
>>> The question is what the plans for dealing with DMA translation
>>> faults are (i.e., due to accessing unmapped memory or writing to
>>> read-only mappings). At the very least the associated driver
>>> should have the possibility to somehow be notified about failed
>>> transactions due to translation faults. Is something like this
>>> being planned for?
>>>
>
>> Please refer to section 3.5 of the VT-d spec. DMA requests that result
>> in remapping faults must be blocked by hardware. The exact method of
>> DMA blocking is implementation-specific. Faulting DMA write / read
>> requests may be handled in much the same way as hardware handles
>> write requests to non-existent memory. So I think our fault handler
>> that clears fault queue and prints out warnings is enough.
>
>Yes, I'm aware that faulting DMA remapping transactions are blocked by
>the hardware. However, I'm concerned that faulting transactions can
>cause malfunctions if not detected by the system. Sure, for read
>requests on PCIe the device will notice the error in the completion
>status and can handle it appropriately. But how about write requests?
>(I don't know if a PCIe device will be notified of memory write
>failures.) And how about requests from non-PCIe devices?
>
>It may be the case that all DMA translation faults are entirely the
>result of device drivers having screwed up and shot themselves in the
>foot, and that they should be allowed to repeatedly do so until their
>leg falls off. However, I'm not convinced that faulting DMA
>translations are always caused by faulty drivers, in particular if
>one starts allowing frontends to talk more directly to the device.
>
>If DMA translation faults can always safely be ignored then I'm happy
>to accept that. On the other hand, a device repeatedly raising
>translation faults is surely doing something wrong, and bringing it
>down might be a good idea.
>
>Anyhow, if all DMA translation faults can be ignored, why then bother
>enabling faults in the context entries in the first place?
>
>	eSk
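The policy floated above (terminate the guest once it has generated 1000 VT-d faults) could be sketched as a per-guest counter with a threshold. This is only an illustration under stated assumptions: `struct guest_fault_state`, `note_guest_fault()` and `FAULT_LIMIT` are hypothetical names that do not exist in Xen, and the actual teardown of the guest is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The figure mentioned in the thread; a tunable here, not a Xen constant. */
#define FAULT_LIMIT 1000u

/* Hypothetical per-guest bookkeeping for VT-d faults. */
struct guest_fault_state {
    uint32_t fault_count;  /* faults attributed to this guest so far */
    bool     terminated;   /* set once the limit has been reached */
};

/*
 * Record one fault against the guest. Returns true when the guest has
 * reached the limit and should be terminated by the caller.
 */
static bool note_guest_fault(struct guest_fault_state *g)
{
    assert(g != NULL);

    if (g->terminated)
        return true;   /* already over the limit; stop counting */

    if (++g->fault_count >= FAULT_LIMIT)
        g->terminated = true;

    return g->terminated;
}
```

A fault handler would call `note_guest_fault()` for the guest that owns the faulting device and act on a `true` return. A plain lifetime count cannot tell a slow trickle of faults from a burst, so a rate-based limit (faults per time window) may be a better fit for the "device repeatedly raising translation faults" case raised earlier in the thread.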