thr3ads.net - Xen devel - [Xen-devel] IOMMU faults [Jun 2011]

Tim Deegan

2011-Jun-16 09:25 UTC

[Xen-devel] IOMMU faults

Hi, IOMMU maintainers,

What should Xen do when an IOMMU fault happens?  As far as I can
see both the AMD and Intel code clear the error in the IOMMU and
carry on, but I suspect some more vigorous action is appropriate.
I've seen traces from an Intel machine that seemed to be livelocked on
IOMMU faults from a passed-through VGA card, until it was killed by the
watchdog.  I think I can see two things that contribute to that:

 - The Intel IOMMU fault handler prints quite a lot of info in interrupt
   context, making it easier to livelock.  Still I think the general
   problem applies on AMD too.
 - Domain destruction re-assigns passed-through cards to dom0, but the
   cards don't seem to get reset.  So there's nothing to stop a card
   battering away at DMA in the meantime.  That seems like a problem
   independent of livelock, actually.

In any case, it seems like it would be a good idea to stop a
broken/malicious/deassigned card from flooding Xen with IOMMU faults.

I was considering just writing 0 to the faulting card's PCI command
register, but I'm told that's not always enough to properly deactivate
a card, and it might be a little over-zealous to do it on the first
offence.
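
One way to avoid acting on the first offence is to count faults per device and only stop a card once it crosses a threshold. The following is a minimal C sketch under stated assumptions, not code from Xen: `note_fault()`, `FAULT_LIMIT`, `MAX_TRACKED` and the table layout are hypothetical names chosen for illustration, and a real handler would also need locking and some way to age out old counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAULT_LIMIT 8u    /* hypothetical threshold, not a value from Xen */
#define MAX_TRACKED 16u   /* hypothetical table size */

struct fault_entry {
    uint16_t bdf;     /* bus/device/function of the faulting device */
    uint32_t count;   /* faults recorded so far */
    bool     in_use;
};

static struct fault_entry fault_table[MAX_TRACKED];

/*
 * Record one fault for 'bdf'.  Returns true once the device has reached
 * FAULT_LIMIT faults, i.e. when the caller should try to stop it.
 */
bool note_fault(uint16_t bdf)
{
    struct fault_entry *slot = NULL;

    for (size_t i = 0; i < MAX_TRACKED; i++) {
        struct fault_entry *e = &fault_table[i];

        if (e->in_use && e->bdf == bdf) {
            slot = e;          /* device is already tracked */
            break;
        }
        if (!e->in_use && slot == NULL)
            slot = e;          /* remember the first free slot */
    }

    if (slot == NULL)
        return true;           /* table full: fail safe and ask for a stop */

    if (!slot->in_use) {       /* first fault seen from this device */
        slot->in_use = true;
        slot->bdf = bdf;
        slot->count = 0;
    }

    assert(slot->bdf == bdf);
    if (slot->count < UINT32_MAX)
        slot->count++;

    return slot->count >= FAULT_LIMIT;
}
```

With these values the first seven faults from a device are tolerated and the eighth asks the caller to stop it. Treating a full table as "stop the device" is a design choice of this sketch; it errs on the side of quieting an unknown device instead of letting it keep faulting.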

Ideas?

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jean Guyader

2011-Jun-16 09:47 UTC

[Xen-devel] Re: IOMMU faults

On 16/06 10:25, Tim Deegan wrote:
> Hi, IOMMU maintainers,
>
> What should Xen do when an IOMMU fault happens?  As far as I can
> see both the AMD and Intel code clear the error in the IOMMU and
> carry on, but I suspect some more vigorous action is appropriate.
> I've seen traces from an Intel machine that seemed to be livelocked on
> IOMMU faults from a passed-through VGA card, until it was killed by the
> watchdog.  I think I can see two things that contribute to that:
>
>  - The Intel IOMMU fault handler prints quite a lot of info in interrupt
>    context, making it easier to livelock.  Still I think the general
>    problem applies on AMD too.
>  - Domain destruction re-assigns passed-through cards to dom0, but the
>    cards don't seem to get reset.  So there's nothing to stop a card
>    battering away at DMA in the meantime.  That seems like a problem
>    independent of livelock, actually.
>
> In any case, it seems like it would be a good idea to stop a
> broken/malicious/deassigned card from flooding Xen with IOMMU faults.
>
> I was considering just writing 0 to the faulting card's PCI command
> register, but I'm told that's not always enough to properly deactivate
> a card, and it might be a little over-zealous to do it on the first
> offence.
>
> Ideas?
>

Hi Tim,

We have seen such behavior when we were testing GPU assignment, especially
with the Intel GPU. The problem is that domain destruction in Xen is
asynchronous, and right now the PCI device reset is done in dom0 with some
help from the toolstack.

In the Intel GPU case we need to make sure that the guest memory and the IOMMU
are still in place while we perform the reset, otherwise the device drifts
into an unstable state.

There are probably cleaner ways to do that, but we decided to move
the PCI reset code into Xen, so we are sure we perform the reset while the
device is in a known (functioning) state.

Attached is the patch we have in XenClient that moves the PCI reset into Xen.
The modifications we have made to the VT-d code should go in the generic IOMMU
section. I apologise, but this patch is based on Xen 3.4; if we think this is
the right way to do it, I can submit a proper patch against unstable and 4.1.

Regards,
Jean


Tim Deegan

2011-Jun-16 10:07 UTC

Re: [Xen-devel] Re: IOMMU faults

Hi Jean, 

At 10:47 +0100 on 16 Jun (1308221277), Jean Guyader wrote:
> We have seen such behavior when we were testing GPU assignment, especially
> with the Intel GPU. The problem is that domain destruction in Xen is
> asynchronous, and right now the PCI device reset is done in dom0 with some
> help from the toolstack.
>
> In the Intel GPU case we need to make sure that the guest memory and the
> IOMMU are still in place while we perform the reset, otherwise the device
> drifts into an unstable state.
>
> There are probably cleaner ways to do that, but we decided to move
> the PCI reset code into Xen, so we are sure we perform the reset while the
> device is in a known (functioning) state.
>
> Attached is the patch we have in XenClient that moves the PCI reset into
> Xen.

Thanks, Jean.  This sounds like a good idea to me, though I'd like to
hear Wei and Allen's opinions.

The patch is incomplete (missing the new pci_reset.[ch] files) but I get
the general idea.  A few questions:

 - Why the special handling for one graphics device on each domain? 
   (And if one, why not all?)
 - Why not reset when the target is dom0?  It seems like it can do no
   harm and should protect dom0 from assigning itself an active PCI
   card. 

Of course, even with this patch, my original question still stands:
should Xen do something more assertive in the IOMMU fault handler?

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


Jean Guyader

2011-Jun-16 10:28 UTC

Re: [Xen-devel] Re: IOMMU faults

On 16/06 11:07, Tim Deegan wrote:
> Hi Jean,
>

Replies below.

> At 10:47 +0100 on 16 Jun (1308221277), Jean Guyader wrote:
> > We have seen such behavior when we were testing GPU assignment,
> > especially with the Intel GPU. The problem is that domain destruction in
> > Xen is asynchronous, and right now the PCI device reset is done in dom0
> > with some help from the toolstack.
> >
> > In the Intel GPU case we need to make sure that the guest memory and
> > the IOMMU are still in place while we perform the reset, otherwise the
> > device drifts into an unstable state.
> >
> > There are probably cleaner ways to do that, but we decided to move
> > the PCI reset code into Xen, so we are sure we perform the reset while
> > the device is in a known (functioning) state.
> >
> > Attached is the patch we have in XenClient that moves the PCI reset
> > into Xen.
>
> Thanks, Jean.  This sounds like a good idea to me, though I'd like to
> hear Wei and Allen's opinions.
>
> The patch is incomplete (missing the new pci_reset.[ch] files) but I get
> the general idea.  A few questions:

Reattached the full patch.

> 
>  - Why the special handling for one graphics device on each domain? 
>    (And if one, why not all?)
No good reason really, just a limitation of the patch; we can trivially
get rid of the limitation.
>  - Why not reset when the target is dom0?  It seems like it can do no
>    harm and should protect dom0 from assigning itself an active PCI
>    card. 
A reset can be quite expensive (a couple of seconds sometimes).
We did that to avoid a double reset on domain reboot.
I agree that we should remove that, or extend the IOMMU API
so we can reassign from domU to domU without going through
dom0.
> 
> Of course, even with this patch, my original question still stands:
> should Xen do something more assertive in the IOMMU fault handler?
> 
What we really want to achieve here is to stop DMA on this device.
One way of doing it is to perform a proper PCI reset (FLR, secondary
bus reset, ...) when that happens.
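
As an illustration of the FLR option, here is a minimal C sketch of how a Function Level Reset is requested through the PCI Express capability. It is not the XenClient patch: it runs against a hypothetical in-memory `struct fake_dev` standing in for one function's config space, and the helper names (`cfg_read16`, `find_pcie_cap`, `request_flr`) are made up for this example. The register offsets and bit values follow the PCI Express specification, using the names Linux gives them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_STATUS              0x06
#define PCI_STATUS_CAP_LIST     0x0010      /* capability list present */
#define PCI_CAPABILITY_LIST     0x34        /* offset of first capability */
#define PCI_CAP_ID_EXP          0x10        /* PCI Express capability ID */
#define PCI_EXP_DEVCAP          0x04        /* Device Capabilities */
#define PCI_EXP_DEVCAP_FLR      0x10000000u /* function supports FLR */
#define PCI_EXP_DEVCTL          0x08        /* Device Control */
#define PCI_EXP_DEVCTL_BCR_FLR  0x8000u     /* Initiate Function Level Reset */

/* Hypothetical stand-in for one function's 256-byte config space. */
struct fake_dev { uint8_t cfg[256]; };

static uint16_t cfg_read16(const struct fake_dev *d, unsigned off)
{
    assert(off + 1 < sizeof d->cfg);
    return (uint16_t)(d->cfg[off] | (d->cfg[off + 1] << 8));
}

static uint32_t cfg_read32(const struct fake_dev *d, unsigned off)
{
    return (uint32_t)cfg_read16(d, off) |
           ((uint32_t)cfg_read16(d, off + 2) << 16);
}

static void cfg_write16(struct fake_dev *d, unsigned off, uint16_t val)
{
    assert(off + 1 < sizeof d->cfg);
    d->cfg[off] = (uint8_t)(val & 0xff);
    d->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Walk the capability list; return the PCIe capability offset, or 0. */
unsigned find_pcie_cap(const struct fake_dev *d)
{
    if (!(cfg_read16(d, PCI_STATUS) & PCI_STATUS_CAP_LIST))
        return 0;

    unsigned pos = d->cfg[PCI_CAPABILITY_LIST] & ~3u;
    int ttl = 48;                 /* guard against looping lists */

    while (pos >= 0x40 && ttl-- > 0) {
        if (d->cfg[pos] == PCI_CAP_ID_EXP)
            return pos;
        pos = d->cfg[pos + 1] & ~3u;
    }
    return 0;
}

/* Request an FLR if the function advertises it; false if it cannot. */
bool request_flr(struct fake_dev *d)
{
    unsigned cap = find_pcie_cap(d);

    if (cap == 0 || cap + PCI_EXP_DEVCTL + 2 > sizeof d->cfg)
        return false;
    if (!(cfg_read32(d, cap + PCI_EXP_DEVCAP) & PCI_EXP_DEVCAP_FLR))
        return false;

    uint16_t ctl = cfg_read16(d, cap + PCI_EXP_DEVCTL);
    cfg_write16(d, cap + PCI_EXP_DEVCTL,
                (uint16_t)(ctl | PCI_EXP_DEVCTL_BCR_FLR));
    /* Real code must now wait (the spec allows the function 100 ms)
     * before touching the device again. */
    return true;
}
```

A function that does not advertise FLR makes `request_flr()` return false, which is where a fallback such as a secondary bus reset would have to take over.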

Jean



Wei Wang2

2011-Jun-16 14:30 UTC

[Xen-devel] Re: IOMMU faults

On Thursday 16 June 2011 11:25:09 Tim Deegan wrote:
> Hi, IOMMU maintainers,
>
> What should Xen do when an IOMMU fault happens?  As far as I can
> see both the AMD and Intel code clear the error in the IOMMU and
> carry on, but I suspect some more vigorous action is appropriate.
> I've seen traces from an Intel machine that seemed to be livelocked on
> IOMMU faults from a passed-through VGA card, until it was killed by the
> watchdog.  I think I can see two things that contribute to that:
>
>  - The Intel IOMMU fault handler prints quite a lot of info in interrupt
>    context, making it easier to livelock.  Still I think the general
>    problem applies on AMD too.

This info could still be useful for debugging, but we should only enable it
for debug builds.

>  - Domain destruction re-assigns passed-through cards to dom0, but the
>    cards don't seem to get reset.  So there's nothing to stop a card
>    battering away at DMA in the meantime.  That seems like a problem
>    independent of livelock, actually.

There should be some FLR code in the tools (both xm and xl). But this might
not work well with some devices...

> In any case, it seems like it would be a good idea to stop a
> broken/malicious/deassigned card from flooding Xen with IOMMU faults.

Yes, agreed. Actually, I see a lot that could be improved in the fault
handler. When IOMMU faults come from a DMA error, we should either stop the
device from doing DMA or inject errors into the guest, if the guest driver is
able to handle I/O page faults.

> I was considering just writing 0 to the faulting card's PCI command
> register, but I'm told that's not always enough to properly deactivate
> a card, and it might be a little over-zealous to do it on the first
> offence.
> Ideas?

It seems difficult to find a generic approach to stop a device without knowing
more device-specific details...

Thanks,
Wei

> Tim.




Konrad Rzeszutek Wilk

2011-Jun-16 14:47 UTC

Re: [Xen-devel] Re: IOMMU faults

> > I was considering just writing 0 to the faulting card's PCI command
> > register, but I'm told that's not always enough to properly deactivate
> > a card, and it might be a little over-zealous to do it on the first
> > offence.
> > Ideas?
> It seems difficult to find a generic approach to stop a device without
> knowing more device-specific details...

Perhaps make something similar to the MCE fault interrupts? As in, when the
error happens, Dom0 is notified of the offending BDF and pursues whatever
action it thinks is necessary. The action would be to tell the device driver
to turn itself off. But how would it interact with the driver? Well, how does
Linux deal with this today? Is there an extension to the device driver API
(similar to the power management one) to notify the driver that it has done
bad things and to shut itself off?

Perhaps similar to the PCIe AER handling?
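
For reference, the "offending BDF" in such a notification is the 16-bit requester ID of the device: bus in bits 15:8, device in bits 7:3, function in bits 2:0. A small C sketch of packing and unpacking it follows; the names `bdf_encode`, `bdf_decode` and `bdf_format` are hypothetical helpers for this example, not an existing Xen or Linux interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct pci_bdf {
    uint8_t bus;   /* 0..255 */
    uint8_t dev;   /* 0..31  */
    uint8_t fn;    /* 0..7   */
};

/* Pack bus/device/function into a 16-bit requester ID. */
uint16_t bdf_encode(uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Split a 16-bit requester ID back into its three fields. */
struct pci_bdf bdf_decode(uint16_t sid)
{
    struct pci_bdf b;

    b.bus = (uint8_t)(sid >> 8);
    b.dev = (uint8_t)((sid >> 3) & 0x1f);
    b.fn  = (uint8_t)(sid & 0x7);
    return b;
}

/* Format as "bb:dd.f" (hex); returns what snprintf returns. */
int bdf_format(char *buf, size_t len, uint16_t sid)
{
    struct pci_bdf b = bdf_decode(sid);

    return snprintf(buf, len, "%02x:%02x.%x",
                    (unsigned)b.bus, (unsigned)b.dev, (unsigned)b.fn);
}
```

For example, requester ID 0x0010 decodes to bus 0, device 2, function 0, which is the slot where Intel integrated graphics usually appears.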


Kay, Allen M

2011-Jun-16 19:21 UTC

[Xen-devel] RE: IOMMU faults

> - The Intel IOMMU fault handler prints quite a lot of info in interrupt
>   context, making it easier to livelock.  Still I think the general
>   problem applies on AMD too.

Someone at Intel looked into implementing measured rate printing in the VT-d
fault handler.  He encountered some complications.  I remember it had to do
with measured rate printing not being enabled by default (?).  For now, I
think having it print out only in the debug case sounds simple enough.  I will
submit a patch for it.

> - Domain destruction re-assigns passed-through cards to dom0, but the
>   cards don't seem to get reset.  So there's nothing to stop a card
>   battering away at DMA in the meantime.  That seems like a problem
>   independent of livelock, actually.

From reading the code in libxl, it seems libxl__device_pci_reset() is called
by both libxl__device_pci_add() and do_pci_remove().  Isn't do_pci_remove()
called when the pass-through device is reassigned to dom0 during a domain
teardown?

Allen
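
To make the idea of rate-limited fault printing concrete, here is a minimal C sketch of a fixed-window limiter. It is an assumption-laden illustration, not the work done at Intel and not Xen's own printing facility: `struct ratelimit` and `ratelimit_ok()` are hypothetical, and the caller supplies the current time so the logic stays independent of any particular clock source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ratelimit {
    uint64_t window_ns;     /* length of one window */
    uint32_t burst;         /* messages allowed per window */
    uint64_t window_start;  /* start time of the current window */
    uint32_t printed;       /* messages let through in this window */
    uint32_t suppressed;    /* messages dropped in this window */
};

/*
 * Returns true if the caller may print a message at time 'now_ns'.
 * At most 'burst' messages pass per window; the rest are only counted.
 */
bool ratelimit_ok(struct ratelimit *rl, uint64_t now_ns)
{
    assert(rl->window_ns > 0);

    if (now_ns - rl->window_start >= rl->window_ns) {
        /* A new window begins: forget the old counts. */
        rl->window_start = now_ns;
        rl->printed = 0;
        rl->suppressed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }

    rl->suppressed++;
    return false;
}
```

A fault handler using something like this would print the detailed fault record only when `ratelimit_ok()` returns true, and could report the `suppressed` count at the start of the next window so that the drop is visible in the log.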



Tim Deegan

2011-Jun-17 08:06 UTC

Re: [Xen-devel] RE: IOMMU faults

At 12:21 -0700 on 16 Jun (1308226873), Kay, Allen M wrote:
> > - The Intel IOMMU fault handler prints quite a lot of info in interrupt
> >   context, making it easier to livelock.  Still I think the general
> >   problem applies on AMD too.
>
> Someone at Intel looked into implementing measured rate printing in the
> VT-d fault handler.  He encountered some complications.  I remember it had
> to do with measured rate printing not being enabled by default (?).  For
> now, I think having it print out only in the debug case sounds simple
> enough.  I will submit a patch for it.
>

That's great, thanks.

> > - Domain destruction re-assigns passed-through cards to dom0, but the
> >   cards don't seem to get reset.  So there's nothing to stop a card
> >   battering away at DMA in the meantime.  That seems like a problem
> >   independent of livelock, actually.
>
> From reading the code in libxl, it seems libxl__device_pci_reset() is
> called by both libxl__device_pci_add() and do_pci_remove().  Isn't
> do_pci_remove() called when the pass-through device is reassigned to dom0
> during a domain teardown?

Libxl could be too late, though.  When a domain is destroyed, its IOMMU
tables get torn down in Xen.  So if it has active devices:
 - they can start raising IOMMU faults immediately, and in some
   circumstances libxl might never get to run.
 - since deassign is implemented as "assign to dom0" they might start
   DMAing over dom0 memory.

If we can rely on the dom0 tools always completely resetting a domain's
devices before calling domctl_destroydomain, that should never happen.
That seems a bit fragile, though I guess dom0 can shoot itself in the foot in
enough other ways.  I prefer Jean's reset-in-Xen approach; it's only a few
hundred lines of code and we could reuse some of it for resetting
badly-behaved cards from the IOMMU fault handler.

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


Tim Deegan

2011-Jun-17 08:08 UTC

Re: [Xen-devel] Re: IOMMU faults

At 10:47 -0400 on 16 Jun (1308221250), Konrad Rzeszutek Wilk wrote:
> Perhaps make something similar to the MCE fault interrupts? As in, when
> the error happens, Dom0 is notified of the offending BDF and
> pursues whatever action it thinks is necessary. The action would be
> to tell the device driver to turn itself off. But how would it
> interact with the driver? Well, how does Linux deal with this today?
> Is there an extension to the device driver API (similar to the power
> management one) to notify the driver that it has done bad things and to
> shut itself off?

That sort of interface might be nice too, but I was worried more about
badly-behaved guests or devices.  In the livelock case the guest might
never get to run so can't do anything, and a malicious guest would just
ignore the message anyway.

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


Tim Deegan

2011-Jun-24 13:32 UTC

Re: [Xen-devel] Re: IOMMU faults

Hi, 

At 11:28 +0100 on 16 Jun (1308223697), Jean Guyader wrote:
> > Of course, even with this patch, my original question still stands:
> > should Xen do something more assertive in the IOMMU fault handler?
>
> What we really want to achieve here is to stop DMA on this device.
> One way of doing it is to perform a proper PCI reset (FLR, secondary
> bus reset, ...) when that happens.

I think that's more or less a consensus then, that we should try to
stop the device from the IOMMU fault handler.

Looking at your patch in a bit more detail, I see two things that worry
me.  The first is that the new pci_reset_device() function does nothing
at all if the device isn't one of the particular graphics cards it knows
about!

The second is this comment:

> +    /* Leave CMD MEMORY set otherwise the platform can crash during FLR */
> +    pci_conf_write16(bus, d, f, PCI_COMMAND, 2);

which implies that my current approach of just disabling the card might
have pretty bad consequences.  Can you expand on that?  Would it be
better just to mask out PCI_COMMAND_MASTER?  And if I do that do I need
to try and issue a reset as well (i.e. are there cards that are known to
ignore this bit?)

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


Tim Deegan

2011-Jun-30 10:08 UTC

Re: [Xen-devel] Re: IOMMU faults

At 14:32 +0100 on 24 Jun (1308925934), Tim Deegan wrote:
> At 11:28 +0100 on 16 Jun (1308223697), Jean Guyader wrote:
> > > Of course, even with this patch, my original question still stands:
> > > should Xen do something more assertive in the IOMMU fault handler?
> >
> > What we really want to achieve here is to stop DMA on this device.
> > One way of doing it is to perform a proper PCI reset (FLR, secondary
> > bus reset, ...) when that happens.
>
> I think that's more or less a consensus then, that we should try to
> stop the device from the IOMMU fault handler.
>
> Looking at your patch in a bit more detail, I see two things that worry
> me.  The first is that the new pci_reset_device() function does nothing
> at all if the device isn't one of the particular graphics cards it knows
> about!
>
> The second is this comment:
>
> > +    /* Leave CMD MEMORY set otherwise the platform can crash during FLR */
> > +    pci_conf_write16(bus, d, f, PCI_COMMAND, 2);
>
> which implies that my current approach of just disabling the card might
> have pretty bad consequences.  Can you expand on that?  Would it be
> better just to mask out PCI_COMMAND_MASTER?  And if I do that do I need
> to try and issue a reset as well (i.e. are there cards that are known to
> ignore this bit?)

Ping?

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


Jean Guyader

2011-Jun-30 10:31 UTC

Re: [Xen-devel] Re: IOMMU faults

On 30/06 11:08, Tim Deegan wrote:
> At 14:32 +0100 on 24 Jun (1308925934), Tim Deegan wrote:
> > At 11:28 +0100 on 16 Jun (1308223697), Jean Guyader wrote:
> > > > Of course, even with this patch, my original question still stands:
> > > > should Xen do something more assertive in the IOMMU fault handler?
> > >
> > > What we really want to achieve here is to stop DMA on this device.
> > > One way of doing it is to perform a proper PCI reset (FLR, secondary
> > > bus reset, ...) when that happens.
> >
> > I think that's more or less a consensus then, that we should try to
> > stop the device from the IOMMU fault handler.
> >
> > Looking at your patch in a bit more detail, I see two things that worry
> > me.  The first is that the new pci_reset_device() function does nothing
> > at all if the device isn't one of the particular graphics cards it knows
> > about!
> >

In our case the reset of other devices is done the classic Xen toolstack way,
in dom0. But the code could easily be changed to do a proper reset on all
types of devices.

> > The second is this comment:
> >
> > > +    /* Leave CMD MEMORY set otherwise the platform can crash during FLR */
> > > +    pci_conf_write16(bus, d, f, PCI_COMMAND, 2);
> >
> > which implies that my current approach of just disabling the card might
> > have pretty bad consequences.  Can you expand on that?  Would it be
> > better just to mask out PCI_COMMAND_MASTER?  And if I do that do I need
> > to try and issue a reset as well (i.e. are there cards that are known to
> > ignore this bit?)

Agreed, masking out PCI_COMMAND_MASTER should be enough; Linux only does that.
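
A minimal C sketch of what masking out PCI_COMMAND_MASTER means in practice: a read-modify-write of the command register that clears only the Bus Master Enable bit and leaves memory and I/O decoding as they were, unlike writing 0 or 2 to the whole register. The `struct fake_cfg` and the `fake_read16`/`fake_write16` accessors are hypothetical stand-ins for real config-space accessors such as the `pci_conf_write16()` call quoted above; the register offset and bit values are the standard PCI ones.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND         0x04    /* command register offset */
#define PCI_COMMAND_IO      0x0001  /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x0002  /* respond to memory space accesses */
#define PCI_COMMAND_MASTER  0x0004  /* allow the device to initiate DMA */

/* Hypothetical in-memory config space for one PCI function. */
struct fake_cfg { uint8_t bytes[256]; };

uint16_t fake_read16(const struct fake_cfg *c, unsigned off)
{
    assert(off + 1 < sizeof c->bytes);
    return (uint16_t)(c->bytes[off] | (c->bytes[off + 1] << 8));
}

void fake_write16(struct fake_cfg *c, unsigned off, uint16_t val)
{
    assert(off + 1 < sizeof c->bytes);
    c->bytes[off] = (uint8_t)(val & 0xff);
    c->bytes[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Clear only Bus Master Enable, preserving every other command bit.
 * Returns the previous command value so a caller could restore it.
 */
uint16_t disable_bus_master(struct fake_cfg *c)
{
    uint16_t cmd = fake_read16(c, PCI_COMMAND);

    fake_write16(c, PCI_COMMAND, (uint16_t)(cmd & ~PCI_COMMAND_MASTER));
    return cmd;
}
```

Starting from a command value of 0x0007 (I/O, memory and bus mastering enabled), this leaves 0x0003 in the register, so the device still decodes its BARs but should no longer issue DMA. Whether every card honours the bit is the open question raised in the thread; the sketch does not settle it.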

Jean

