Wei Wang2
2011-Apr-08  10:52 UTC
[Xen-devel] [PATCH] AMD IOMMU: Fix an interrupt remapping issue
Some device could generate bogus interrupts if an IO-APIC RTE and an iommu interrupt remapping entry are not consistent during 2 adjacent 64bits IO-APIC RTE updates. For example, if the 2nd operation updates destination bits in RTE for SATA device and unmask it, in some case, SATA device will assert ioapic pin to generate interrupt immediately using new destination but iommu could still translate it into the old destination, then dom0 would be confused. To fix that, we sync up interrupt remapping entry with IO-APIC IRE on every 32 bits operation and foward IOAPIC RTE updates after interrupt remapping table has been changed. Jan, This patch fixes SATA device issue we observed (Bug #680824), please review it. Thanks! Wei -- Advanced Micro Devices GmbH Sitz: Dornach, Gemeinde Aschheim, Landkreis München Registergericht München, HRB Nr. 43632 WEEE-Reg-Nr: DE 12919551 Geschäftsführer: Alberto Bozzo, Andrew Bowd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Apr-08  11:26 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue
>>> On 08.04.11 at 12:52, Wei Wang2 <wei.wang2@amd.com> wrote: > Some device could generate bogus interrupts if an IO-APIC RTE and an iommu > interrupt remapping entry are not consistent during 2 adjacent 64bits IO-APIC > > RTE updates. For example, if the 2nd operation updates destination bits in > RTE for SATA device and unmask it, in some case, SATA device will assert > ioapic pin to generate interrupt immediately using new destination but iommu > > could still translate it into the old destination, then dom0 would be > confused. To fix that, we sync up interrupt remapping entry with IO-APIC IRE > on every 32 bits operation and foward IOAPIC RTE updates after interrupt > remapping table has been changed. > > Jan, This patch fixes SATA device issue we observed (Bug #680824), please > review it. Thanks!Sure - once you attach the actual patch ;-) Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei Wang2
2011-Apr-08  11:35 UTC
[Xen-devel] [PATCH] AMD IOMMU: Fix an interrupt remapping issue
(sorry, Forget the patch) Some device could generate bogus interrupts if an IO-APIC RTE and an iommu interrupt remapping entry are not consistent during 2 adjacent 64bits IO-APIC RTE updates. For example, if the 2nd operation updates destination bits in RTE for SATA device and unmask it, in some case, SATA device will assert ioapic pin to generate interrupt immediately using new destination but iommu could still translate it into the old destination, then dom0 would be confused. To fix that, we sync up interrupt remapping entry with IO-APIC IRE on every 32 bits operation and foward IOAPIC RTE updates after interrupt remapping table has been changed. Jan, This patch fixes SATA device issue we observed (Bug #680824), please review it. Thanks! Wei -- Advanced Micro Devices GmbH Sitz: Dornach, Gemeinde Aschheim, Landkreis München Registergericht München, HRB Nr. 43632 WEEE-Reg-Nr: DE 12919551 Geschäftsführer: Alberto Bozzo, Andrew Bowd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Apr-08  13:43 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue
>>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@amd.com> wrote: > Some device could generate bogus interrupts if an IO-APIC RTE and an iommu > interrupt remapping entry are not consistent during 2 adjacent 64bits IO-APIC > RTE updates. For example, if the 2nd operation updates destination bits in > RTE for SATA device and unmask it, in some case, SATA device will assert > ioapic pin to generate interrupt immediately using new destination but iommu > could still translate it into the old destination, then dom0 would be > confused. To fix that, we sync up interrupt remapping entry with IO-APIC IRE > on every 32 bits operation and foward IOAPIC RTE updates after interrupt > remapping table has been changed.I don''t think this is correct: Without the patch, the filling of ioapic_rte takes into account the value already written. Now that you only write the value at the end of the function, you should overwrite the affected half with "value" immediately before calling update_intremap_entry_from_ioapic(). (Without knowing which half gets written, passing "value" to update_intremap_entry_from_ioapic() is pointless, and indeed the function doesn''t use that parameter.) Eliminating the double write if reg == rte_lo would also seem desirable (and in no case should you write back the old value after having called update_intremap_entry_from_ioapic()). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei Wang2
2011-Apr-08  14:26 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue
On Friday 08 April 2011 15:43:57 Jan Beulich wrote:> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@amd.com> wrote: > > > > Some device could generate bogus interrupts if an IO-APIC RTE and an > > iommu interrupt remapping entry are not consistent during 2 adjacent > > 64bits IO-APIC RTE updates. For example, if the 2nd operation updates > > destination bits in RTE for SATA device and unmask it, in some case, SATA > > device will assert ioapic pin to generate interrupt immediately using new > > destination but iommu could still translate it into the old destination, > > then dom0 would be confused. To fix that, we sync up interrupt remapping > > entry with IO-APIC IRE on every 32 bits operation and foward IOAPIC RTE > > updates after interrupt remapping table has been changed. > > I don''t think this is correct: Without the patch, the filling of ioapic_rte > takes into account the value already written. Now that you only write > the value at the end of the function, you should overwrite the > affected half with "value" immediately before calling > update_intremap_entry_from_ioapic().Sorry, not quite understand your point. My thought is, no matter dom0 tried to updates lower half or upper half of RTE, we always updates interrupt table from the lower half. This will keep iommu table strictly identically to RTE. The old code has an assumption that both lower half and upper of RTE should be updated together. But this might not be always true. If by incident, dom0 only updates the upper half and we don''t sync iommu with it, then the destination in RTE and iommu table will be different.> (Without knowing which half > gets written, passing "value" to update_intremap_entry_from_ioapic() > is pointless, and indeed the function doesn''t use that parameter.)True, I will remove this parameter in update_intremap_entry_from_ioapic(). ioapic_rte should have all information.> Eliminating the double write if reg == rte_lo would also seem desirable > (and in no case should you write back the old value after having called > update_intremap_entry_from_ioapic()).It not a write back, It just finishes IO-APIC RTE writes. After updating interrupt remapping table we still have to update RTE. It is just a copy of __io_apic_write (maybe I should just call it). Old code updates ioapic earlier than interrupt remapping table and sata device might generate interrupt right after this, which is not expected. Thanks, Wei> Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Apr-08  14:39 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue
>>> On 08.04.11 at 16:26, Wei Wang2 <wei.wang2@amd.com> wrote:> On Friday 08 April 2011 15:43:57 Jan Beulich wrote: >> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@amd.com> wrote: >> > >> > Some device could generate bogus interrupts if an IO-APIC RTE and an >> > iommu interrupt remapping entry are not consistent during 2 adjacent >> > 64bits IO-APIC RTE updates. For example, if the 2nd operation updates >> > destination bits in RTE for SATA device and unmask it, in some case, SATA >> > device will assert ioapic pin to generate interrupt immediately using new >> > destination but iommu could still translate it into the old destination, >> > then dom0 would be confused. To fix that, we sync up interrupt remapping >> > entry with IO-APIC IRE on every 32 bits operation and foward IOAPIC RTE >> > updates after interrupt remapping table has been changed. >> >> I don''t think this is correct: Without the patch, the filling of ioapic_rte >> takes into account the value already written. Now that you only write >> the value at the end of the function, you should overwrite the >> affected half with "value" immediately before calling >> update_intremap_entry_from_ioapic(). > Sorry, not quite understand your point. My thought is, no matter dom0 tried > to > updates lower half or upper half of RTE, we always updates interrupt table > from the lower half. This will keep iommu table strictly identically to RTE. > The old code has an assumption that both lower half and upper of RTE should > be updated together. But this might not be always true. If by incident, dom0 > only updates the upper half and we don''t sync iommu with it, then the > destination in RTE and iommu table will be different.No, that''s not my point. The problem I''m seeing is that you pass the old value (as read from the IO-APIC) to update_intremap_entry_from_ioapic(), but the function certainly should use the to-be-written one. Previously this was implicit because the IO-APIC register write happened first.>> Eliminating the double write if reg == rte_lo would also seem desirable >> (and in no case should you write back the old value after having called >> update_intremap_entry_from_ioapic()). > > It not a write back, It just finishes IO-APIC RTE writes. After updating > interrupt remapping table we still have to update RTE. It is just a copy of > __io_apic_write (maybe I should just call it). Old code updates ioapic > earlier than interrupt remapping table and sata device might generate > interrupt right after this, which is not expected.No. If reg == ret_lo, you write that IO-APIC register twice, which is pointless. With the other problem unaddressed, you actually first write back the old value (with the mask bit restored), which gets IO-APIC and remapping tables out of sync for a brief period of time (which is a problem by itself), then write the new value. With the other problem addressed, you would simply write the new value twice, which is wasteful given that these writes are uncached. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei Wang2
2011-Apr-08  15:06 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue
On Friday 08 April 2011 16:39:13 Jan Beulich wrote:> >>> On 08.04.11 at 16:26, Wei Wang2 <wei.wang2@amd.com> wrote: > > > > On Friday 08 April 2011 15:43:57 Jan Beulich wrote: > >> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@amd.com> wrote: > >> > > >> > Some device could generate bogus interrupts if an IO-APIC RTE and an > >> > iommu interrupt remapping entry are not consistent during 2 adjacent > >> > 64bits IO-APIC RTE updates. For example, if the 2nd operation updates > >> > destination bits in RTE for SATA device and unmask it, in some case, > >> > SATA device will assert ioapic pin to generate interrupt immediately > >> > using new destination but iommu could still translate it into the old > >> > destination, then dom0 would be confused. To fix that, we sync up > >> > interrupt remapping entry with IO-APIC IRE on every 32 bits operation > >> > and foward IOAPIC RTE updates after interrupt remapping table has been > >> > changed. > >> > >> I don''t think this is correct: Without the patch, the filling of > >> ioapic_rte takes into account the value already written. Now that you > >> only write the value at the end of the function, you should overwrite > >> the > >> affected half with "value" immediately before calling > >> update_intremap_entry_from_ioapic(). > > > > Sorry, not quite understand your point. My thought is, no matter dom0 > > tried to > > updates lower half or upper half of RTE, we always updates interrupt > > table from the lower half. This will keep iommu table strictly > > identically to RTE. The old code has an assumption that both lower half > > and upper of RTE should be updated together. But this might not be always > > true. If by incident, dom0 only updates the upper half and we don''t sync > > iommu with it, then the destination in RTE and iommu table will be > > different. > > No, that''s not my point. The problem I''m seeing is that you pass the > old value (as read from the IO-APIC) to > update_intremap_entry_from_ioapic(), but the function certainly > should use the to-be-written one. Previously this was implicit because > the IO-APIC register write happened first.OK, got it. That is definitely problematic. will fix it.> >> Eliminating the double write if reg == rte_lo would also seem desirable > >> (and in no case should you write back the old value after having called > >> update_intremap_entry_from_ioapic()). > > > > It not a write back, It just finishes IO-APIC RTE writes. After updating > > interrupt remapping table we still have to update RTE. It is just a copy > > of __io_apic_write (maybe I should just call it). Old code updates ioapic > > earlier than interrupt remapping table and sata device might generate > > interrupt right after this, which is not expected. > > No. If reg == ret_lo, you write that IO-APIC register twice, which is > pointless. With the other problem unaddressed, you actually first write > back the old value (with the mask bit restored), which gets IO-APIC > and remapping tables out of sync for a brief period of time (which is > a problem by itself), then write the new value. With the other problem > addressed, you would simply write the new value twice, which is > wasteful given that these writes are uncached.True. I will rework the patch try to eliminate this. Thanks Wei> Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei Wang2
2011-Apr-08  16:52 UTC
[Xen-devel] [PATCH] AMD IOMMU: Fix an interrupt remapping issue (v2)
Jan, How dose this one look like to you? Thanks, Wei Signed-off-by Wei Wang <wei.wang2@amd.com> -- Advanced Micro Devices GmbH Sitz: Dornach, Gemeinde Aschheim, Landkreis München Registergericht München, HRB Nr. 43632 WEEE-Reg-Nr: DE 12919551 Geschäftsführer: Alberto Bozzo, Andrew Bowd On Friday 08 April 2011 17:06:16 Wei Wang2 wrote:> On Friday 08 April 2011 16:39:13 Jan Beulich wrote: > > >>> On 08.04.11 at 16:26, Wei Wang2 <wei.wang2@amd.com> wrote: > > > > > > On Friday 08 April 2011 15:43:57 Jan Beulich wrote: > > >> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@amd.com> wrote: > > >> > > > >> > Some device could generate bogus interrupts if an IO-APIC RTE and an > > >> > iommu interrupt remapping entry are not consistent during 2 adjacent > > >> > 64bits IO-APIC RTE updates. For example, if the 2nd operation > > >> > updates destination bits in RTE for SATA device and unmask it, in > > >> > some case, SATA device will assert ioapic pin to generate interrupt > > >> > immediately using new destination but iommu could still translate it > > >> > into the old destination, then dom0 would be confused. To fix that, > > >> > we sync up interrupt remapping entry with IO-APIC IRE on every 32 > > >> > bits operation and foward IOAPIC RTE updates after interrupt > > >> > remapping table has been changed. > > >> > > >> I don''t think this is correct: Without the patch, the filling of > > >> ioapic_rte takes into account the value already written. Now that you > > >> only write the value at the end of the function, you should overwrite > > >> the > > >> affected half with "value" immediately before calling > > >> update_intremap_entry_from_ioapic(). > > > > > > Sorry, not quite understand your point. My thought is, no matter dom0 > > > tried to > > > updates lower half or upper half of RTE, we always updates interrupt > > > table from the lower half. This will keep iommu table strictly > > > identically to RTE. The old code has an assumption that both lower half > > > and upper of RTE should be updated together. But this might not be > > > always true. If by incident, dom0 only updates the upper half and we > > > don''t sync iommu with it, then the destination in RTE and iommu table > > > will be different. > > > > No, that''s not my point. The problem I''m seeing is that you pass the > > old value (as read from the IO-APIC) to > > update_intremap_entry_from_ioapic(), but the function certainly > > should use the to-be-written one. Previously this was implicit because > > the IO-APIC register write happened first. > > OK, got it. That is definitely problematic. will fix it. > > > >> Eliminating the double write if reg == rte_lo would also seem > > >> desirable (and in no case should you write back the old value after > > >> having called update_intremap_entry_from_ioapic()). > > > > > > It not a write back, It just finishes IO-APIC RTE writes. After > > > updating interrupt remapping table we still have to update RTE. It is > > > just a copy of __io_apic_write (maybe I should just call it). Old code > > > updates ioapic earlier than interrupt remapping table and sata device > > > might generate interrupt right after this, which is not expected. > > > > No. If reg == ret_lo, you write that IO-APIC register twice, which is > > pointless. With the other problem unaddressed, you actually first write > > back the old value (with the mask bit restored), which gets IO-APIC > > and remapping tables out of sync for a brief period of time (which is > > a problem by itself), then write the new value. With the other problem > > addressed, you would simply write the new value twice, which is > > wasteful given that these writes are uncached. > > True. I will rework the patch try to eliminate this. > Thanks > Wei > > > Jan > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Apr-11  07:23 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue (v2)
>>> On 08.04.11 at 18:52, Wei Wang2 <wei.wang2@amd.com> wrote: > Jan, How dose this one look like to you?Much better, but still not quite there: The unmasking must happen *after* the writing of the upper half (if that''s what is being modified). You could also skip the unmasking altogether if saved_mask == 1. And if you start using __io_apic_write() (which I find very desirable) is there a reason not to use it (and __io_apic_read()) in all of the other places you touch anyway, too? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei Wang2
2011-Apr-11  07:39 UTC
[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue (v2)
On Monday 11 April 2011 09:23:38 Jan Beulich wrote:> >>> On 08.04.11 at 18:52, Wei Wang2 <wei.wang2@amd.com> wrote: > > > > Jan, How dose this one look like to you? > > Much better, but still not quite there: The unmasking must happen > *after* the writing of the upper half (if that''s what is being modified). > > You could also skip the unmasking altogether if saved_mask == 1. > > And if you start using __io_apic_write() (which I find very desirable) > is there a reason not to use it (and __io_apic_read()) in all of the > other places you touch anyway, too? > > JanGood points. I will send a new version addressing the remaining issues. Thanks, Wei _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei Wang2
2011-Apr-11  10:31 UTC
[Xen-devel] [PATCH V3] AMD IOMMU: Fix an interrupt remapping issue
Jan, This patch is the latest. Should have addressed all of your concerns. Please take a look. Thanks, Wei Signed-off-by: Wei Wang <wei.wang2@amd.com> On Monday 11 April 2011 09:23:38 Jan Beulich wrote:> >>> On 08.04.11 at 18:52, Wei Wang2 <wei.wang2@amd.com> wrote: > > > > Jan, How dose this one look like to you? > > Much better, but still not quite there: The unmasking must happen > *after* the writing of the upper half (if that''s what is being modified). > > You could also skip the unmasking altogether if saved_mask == 1. > > And if you start using __io_apic_write() (which I find very desirable) > is there a reason not to use it (and __io_apic_read()) in all of the > other places you touch anyway, too? > > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Apr-11  11:35 UTC
[Xen-devel] Re: [PATCH V3] AMD IOMMU: Fix an interrupt remapping issue
>>> On 11.04.11 at 12:31, Wei Wang2 <wei.wang2@amd.com> wrote: > This patch is the latest. Should have addressed all of your concerns. Please > take a look.Looks good to me now. Assuming it also still addresses the problem at hand: Acked-by: Jan Beulich <jbeulich@novell.com> Thanks, Jan> Thanks, > Wei > > Signed-off-by: Wei Wang <wei.wang2@amd.com> > > On Monday 11 April 2011 09:23:38 Jan Beulich wrote: >> >>> On 08.04.11 at 18:52, Wei Wang2 <wei.wang2@amd.com> wrote: >> > >> > Jan, How dose this one look like to you? >> >> Much better, but still not quite there: The unmasking must happen >> *after* the writing of the upper half (if that''s what is being modified). >> >> You could also skip the unmasking altogether if saved_mask == 1. >> >> And if you start using __io_apic_write() (which I find very desirable) >> is there a reason not to use it (and __io_apic_read()) in all of the >> other places you touch anyway, too? >> >> Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jul-19  09:37 UTC
Re: [Xen-devel] Re: [PATCH V3] AMD IOMMU: Fix an interrupt remapping issue
Is this a candidate for backporting to 4.1? -George On Mon, Apr 11, 2011 at 12:35 PM, Jan Beulich <JBeulich@novell.com> wrote:>>>> On 11.04.11 at 12:31, Wei Wang2 <wei.wang2@amd.com> wrote: >> This patch is the latest. Should have addressed all of your concerns. Please >> take a look. > > Looks good to me now. Assuming it also still addresses the problem > at hand: > Acked-by: Jan Beulich <jbeulich@novell.com> > > Thanks, Jan > >> Thanks, >> Wei >> >> Signed-off-by: Wei Wang <wei.wang2@amd.com> >> >> On Monday 11 April 2011 09:23:38 Jan Beulich wrote: >>> >>> On 08.04.11 at 18:52, Wei Wang2 <wei.wang2@amd.com> wrote: >>> > >>> > Jan, How dose this one look like to you? >>> >>> Much better, but still not quite there: The unmasking must happen >>> *after* the writing of the upper half (if that''s what is being modified). >>> >>> You could also skip the unmasking altogether if saved_mask == 1. >>> >>> And if you start using __io_apic_write() (which I find very desirable) >>> is there a reason not to use it (and __io_apic_read()) in all of the >>> other places you touch anyway, too? >>> >>> Jan > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jul-19  09:59 UTC
Re: [Xen-devel] Re: [PATCH V3] AMD IOMMU: Fix an interrupt remapping issue
Oops, n/m -- already in. :-) -G On Tue, Jul 19, 2011 at 10:37 AM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:> Is this a candidate for backporting to 4.1? > > -George > > On Mon, Apr 11, 2011 at 12:35 PM, Jan Beulich <JBeulich@novell.com> wrote: >>>>> On 11.04.11 at 12:31, Wei Wang2 <wei.wang2@amd.com> wrote: >>> This patch is the latest. Should have addressed all of your concerns. Please >>> take a look. >> >> Looks good to me now. Assuming it also still addresses the problem >> at hand: >> Acked-by: Jan Beulich <jbeulich@novell.com> >> >> Thanks, Jan >> >>> Thanks, >>> Wei >>> >>> Signed-off-by: Wei Wang <wei.wang2@amd.com> >>> >>> On Monday 11 April 2011 09:23:38 Jan Beulich wrote: >>>> >>> On 08.04.11 at 18:52, Wei Wang2 <wei.wang2@amd.com> wrote: >>>> > >>>> > Jan, How dose this one look like to you? >>>> >>>> Much better, but still not quite there: The unmasking must happen >>>> *after* the writing of the upper half (if that''s what is being modified). >>>> >>>> You could also skip the unmasking altogether if saved_mask == 1. >>>> >>>> And if you start using __io_apic_write() (which I find very desirable) >>>> is there a reason not to use it (and __io_apic_read()) in all of the >>>> other places you touch anyway, too? >>>> >>>> Jan >> >> >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >> >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel