Miroslav Rezanina
2009-Oct-19 11:18 UTC
Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
----- "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:> From: "Keir Fraser" <keir.fraser@eu.citrix.com> > To: "Miroslav Rezanina" <mrezanin@redhat.com>, "Dexuan Cui" <dexuan.cui@intel.com> > Cc: xen-devel@lists.xensource.com > Sent: Monday, October 19, 2009 11:22:48 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna > Subject: Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off > > On 19/10/2009 09:34, "Miroslav Rezanina" <mrezanin@redhat.com> wrote: > > >> Thanks, > >> -- Dexuan > >> > > > > Hi Dexuan, > > you''re right. We should print warning. In your patch, I do not > understand > > why you put comment only in setup_dom0_devices function. There is > more > > calling of domain_context_mapping and we check NULL also in > > domain_context_unmap > > and reassign_device_ownership. We should put warning in there too, > shouldn''t > > we? > > The warnings are silly, if we believe find_matched_drhd_unit() should > not > return NULL in those cases. Since obviously we wouldn''t know what to > do in > that case: bailing and doing nothing, while convenient and requiring > little > thought to implement, probably causes other subtler problems later on > since > those remap functions are supposed to actually do something! Crashing > immediately is the nice thing to do here: nice for the poor developer > who > may have to debug this case sometime in the future, in the hopefully > unlikely event our belief turns out to be false. > > I''ll be applying Dexuan''s original replacement patch. > > -- Keir >I retest version with removed NULL checks and in this case, my system crashes. So there has to be checks for NULL or something else is wrong. Regards, Mirek> > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel-- Miroslav Rezanina Software Engineer - Virtualization Team - XEN kernel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Oct-19 11:31 UTC
RE: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
Miroslav Rezanina wrote:>>> >>> Hi Dexuan, >>> you're right. We should print warning. In your patch, I do not >>> understand why you put comment only in setup_dom0_devices function. >>> There is more calling of domain_context_mapping and we check NULL >>> also in domain_context_unmap and reassign_device_ownership. We >>> should put warning in there too, shouldn't we? >> >> The warnings are silly, if we believe find_matched_drhd_unit() should >> not >> return NULL in those cases. Since obviously we wouldn't know what to >> do in >> that case: bailing and doing nothing, while convenient and requiring >> little >> thought to implement, probably causes other subtler problems later on >> since >> those remap functions are supposed to actually do something! Crashing >> immediately is the nice thing to do here: nice for the poor developer >> who >> may have to debug this case sometime in the future, in the hopefully >> unlikely event our belief turns out to be false. >> >> I'll be applying Dexuan's original replacement patch. >> >> -- Keir >> > > I retest version with removed NULL checks and in this case, my system > crashes. So there has to be checks for NULL or something else is > wrong. > > Regards, > MirekHi Mirek, Do you mean: with changeset 20338: 5f28661bb2bb and "acpi=off iommu=1", xen crashes in your host? Can you post the entire serial log? Thanks, -- Dexuan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-19 11:43 UTC
Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
On 19/10/2009 12:18, "Miroslav Rezanina" <mrezanin@redhat.com> wrote:>> I''ll be applying Dexuan''s original replacement patch. >> >> -- Keir >> > > I retest version with removed NULL checks and in this case, my system crashes. > So there has to be checks for NULL or something else is wrong.Got a backtrace and Xen boot params? If you pass acpi=off, then disable_acpi() is invoked, and this sets acpi_disabled. If acpi_disabled=1, then iommu_setup() sets iommu_enabled=0. If iommu_enabled=0 then I think all the update_ire_from_* and similar hooks get disabled in the callers. So something unexpected must be happening. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-19 11:51 UTC
Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
On 19/10/2009 12:43, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> I retest version with removed NULL checks and in this case, my system >> crashes. >> So there has to be checks for NULL or something else is wrong. > > Got a backtrace and Xen boot params? If you pass acpi=off, then > disable_acpi() is invoked, and this sets acpi_disabled. If acpi_disabled=1, > then iommu_setup() sets iommu_enabled=0. If iommu_enabled=0 then I think all > the update_ire_from_* and similar hooks get disabled in the callers. So > something unexpected must be happening.Hmmm, perhaps this could be an early boot-time crash before iommu_setup() is even called? Perhaps moving the if(acpi_disabled) iommu_enabled=0 somewhere early-ish in setup.c:__start_xen(), with a warning message, would be the right thing to do in that case (e.g., immediately after the call to cmdline_parse()). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-19 12:02 UTC
Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
On 19/10/2009 12:51, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> Got a backtrace and Xen boot params? If you pass acpi=off, then >> disable_acpi() is invoked, and this sets acpi_disabled. If acpi_disabled=1, >> then iommu_setup() sets iommu_enabled=0. If iommu_enabled=0 then I think all >> the update_ire_from_* and similar hooks get disabled in the callers. So >> something unexpected must be happening. > > Hmmm, perhaps this could be an early boot-time crash before iommu_setup() is > even called? Perhaps moving the if(acpi_disabled) iommu_enabled=0 somewhere > early-ish in setup.c:__start_xen(), with a warning message, would be the > right thing to do in that case (e.g., immediately after the call to > cmdline_parse()).After some more hmmm''ing I have to admit your original patch was actually the best way to go. :-) All other callers of acpi_find_matched_drhd_unit() that you didn''t patch already check for NULL return, so it''s presumably expected as a possibility. And it is okay for msi_msg_{read,write}_remap_rte() to bail silently if they cannot determine there''s any work to do, since they operate as potential modifiers to operations which are primarily implemented in their callers. So sorry about that. I''ll revert Dexuan''s patch. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Oct-19 14:18 UTC
RE: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
Keir Fraser wrote:> On 19/10/2009 12:51, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote: > >>> Got a backtrace and Xen boot params? If you pass acpi=off, then >>> disable_acpi() is invoked, and this sets acpi_disabled. If >>> acpi_disabled=1, then iommu_setup() sets iommu_enabled=0. If >>> iommu_enabled=0 then I think all the update_ire_from_* and similar >>> hooks get disabled in the callers. So something unexpected must be >>> happening. >> >> Hmmm, perhaps this could be an early boot-time crash before >> iommu_setup() is even called? Perhaps moving the if(acpi_disabled) >> iommu_enabled=0 somewhere early-ish in setup.c:__start_xen(), with a >> warning message, would be the right thing to do in that case (e.g., >> immediately after the call to cmdline_parse()). > > After some more hmmm''ing I have to admit your original patch was > actually the best way to go. :-) > > All other callers of acpi_find_matched_drhd_unit() that you didn''t > patch already check for NULL return, so it''s presumably expected as a > possibility. And it is okay for msi_msg_{read,write}_remap_rte() to > bail silently if they cannot determine there''s any work to do, since > they operate as potential modifiers to operations which are primarily > implemented in their callers. > > So sorry about that. I''ll revert Dexuan''s patch. > > -- KeirBut, can''t you reproduce the crash I mentioned before? Please see the attached crash log -- I''m using c/s 20341:ea34183c5c11 and with "iommu=1 acpi=off" and I use a DQ35 host. Actually what I care is the " if ( acpi_disabled ) iommu_enabled = 0". Thanks, -- Dexuan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Oct-19 14:46 UTC
RE: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
Cui, Dexuan wrote:> Keir Fraser wrote: >> On 19/10/2009 12:51, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote: >> >>>> Got a backtrace and Xen boot params? If you pass acpi=off, then >>>> disable_acpi() is invoked, and this sets acpi_disabled. If >>>> acpi_disabled=1, then iommu_setup() sets iommu_enabled=0. If >>>> iommu_enabled=0 then I think all the update_ire_from_* and similar >>>> hooks get disabled in the callers. So something unexpected must be >>>> happening. >>> >>> Hmmm, perhaps this could be an early boot-time crash before >>> iommu_setup() is even called? Perhaps moving the if(acpi_disabled) >>> iommu_enabled=0 somewhere early-ish in setup.c:__start_xen(), with a >>> warning message, would be the right thing to do in that case (e.g., >>> immediately after the call to cmdline_parse()). >> >> After some more hmmm''ing I have to admit your original patch was >> actually the best way to go. :-) >> >> All other callers of acpi_find_matched_drhd_unit() that you didn''t >> patch already check for NULL return, so it''s presumably expected as a >> possibility. And it is okay for msi_msg_{read,write}_remap_rte() to >> bail silently if they cannot determine there''s any work to do, since >> they operate as potential modifiers to operations which are >> primarily implemented in their callers. >> >> So sorry about that. I''ll revert Dexuan''s patch. >> >> -- Keir > > But, can''t you reproduce the crash I mentioned before? > Please see the attached crash log -- I''m using c/s 20341:ea34183c5c11 > and with "iommu=1 acpi=off" and I use a DQ35 host. > > Actually what I care is the " if ( acpi_disabled ) iommu_enabled = 0".BTW: from my crash log, you can see the bogus info -- actually the host doesn''t support SC, QI and IR. (XEN) Intel VT-d Snoop Control supported. (XEN) Intel VT-d DMA Passthrough not supported. (XEN) Intel VT-d Queued Invalidation supported. (XEN) Intel VT-d Interrupt Remapping supported. (XEN) I/O virtualisation enabled Thanks, -- Dexuan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-19 15:34 UTC
Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
On 19/10/2009 15:18, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:>> So sorry about that. I''ll revert Dexuan''s patch. >> >> -- Keir > > But, can''t you reproduce the crash I mentioned before? > Please see the attached crash log -- I''m using c/s 20341:ea34183c5c11 and with > "iommu=1 acpi=off" and I use a DQ35 host. > > Actually what I care is the " if ( acpi_disabled ) iommu_enabled = 0".Does this really save you though? What if you specify just "iommu=1" on an ACPI system that doesn''t have VT-d (and therefore declares no DHRD units)? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Oct-19 15:53 UTC
RE: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
Keir Fraser wrote:> On 19/10/2009 15:18, "Cui, Dexuan" <dexuan.cui@intel.com> wrote: > >>> So sorry about that. I''ll revert Dexuan''s patch. >>> >>> -- Keir >> >> But, can''t you reproduce the crash I mentioned before? >> Please see the attached crash log -- I''m using c/s >> 20341:ea34183c5c11 and with "iommu=1 acpi=off" and I use a DQ35 host. >> >> Actually what I care is the " if ( acpi_disabled ) iommu_enabled >> 0". > > Does this really save you though?Do you mean we can leave the crash mentioned in my crash.log as it it? Looks not a good idea.> What if you specify just "iommu=1" > on an ACPI system that doesn''t have VT-d (and therefore declares no > DHRD units)?In such a case, acpi_dmar_init() would return -ENODEV and clear iommu_enabled to 0; later iommu_setup() would actually do nothing to try to enable VT-d, and would print "I/O virtualisation disabled". This is the expected correct behavior. Thanks, -- Dexuan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-19 15:57 UTC
Re: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
On 19/10/2009 15:46, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:>> But, can''t you reproduce the crash I mentioned before? >> Please see the attached crash log -- I''m using c/s 20341:ea34183c5c11 >> and with "iommu=1 acpi=off" and I use a DQ35 host. >> >> Actually what I care is the " if ( acpi_disabled ) iommu_enabled = 0". > > BTW: from my crash log, you can see the bogus info -- actually the host > doesn''t support SC, QI and IR. > (XEN) Intel VT-d Snoop Control supported. > (XEN) Intel VT-d DMA Passthrough not supported. > (XEN) Intel VT-d Queued Invalidation supported. > (XEN) Intel VT-d Interrupt Remapping supported. > (XEN) I/O virtualisation enabledAh, hm, well maybe you need that too. Actually I checked in a slightly broader check as c/s 20342, which checks that !list_empty(acpi_drhd_units). If you have no such units then initialising IOMMU support is rather pointless. And if you did not do ACPI bootstrap then you cannot have parsed any units. So the check is at least as strong as checking !acpi_disabled, I think. Hopefully everyone will be happy that this unlikely corner case, requiring the user to have actually shot themselves in the foot by manually specifying two contradictory boot parameters, is now solved to their satisfaction. ;-) -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Oct-19 16:09 UTC
RE: [Xen-devel] RE: [Xen-changelog] [xen-unstable]vt-d: Fixpanic in msi_msg_read_remap_rte with acpi=off
Keir Fraser wrote:> On 19/10/2009 15:46, "Cui, Dexuan" <dexuan.cui@intel.com> wrote: > >>> But, can''t you reproduce the crash I mentioned before? >>> Please see the attached crash log -- I''m using c/s >>> 20341:ea34183c5c11 and with "iommu=1 acpi=off" and I use a DQ35 >>> host. >>> >>> Actually what I care is the " if ( acpi_disabled ) iommu_enabled >>> 0". >> >> BTW: from my crash log, you can see the bogus info -- actually the >> host doesn''t support SC, QI and IR. >> (XEN) Intel VT-d Snoop Control supported. >> (XEN) Intel VT-d DMA Passthrough not supported. >> (XEN) Intel VT-d Queued Invalidation supported. >> (XEN) Intel VT-d Interrupt Remapping supported. >> (XEN) I/O virtualisation enabled > > Ah, hm, well maybe you need that too. Actually I checked in a slightly > broader check as c/s 20342, which checks that > !list_empty(acpi_drhd_units). If you have no such units then > initialising IOMMU support is rather pointless. And if you did not do > ACPI bootstrap then you cannot have parsed any units. So the check is > at least as strong as checking !acpi_disabled, I think. > > Hopefully everyone will be happy that this unlikely corner case, > requiring the user to have actually shot themselves in the foot by > manually specifying two contradictory boot parameters, is now solved > to their satisfaction. ;-)Yes, I think 20342 should fix the issue. :-) Thanks, -- Dexuan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel