thr3ads.net - Xen devel - Xen 4.2.1 boot failure with IOMMU enabled [Feb 2013]

povder

2013-Feb-11 17:31 UTC

Xen 4.2.1 boot failure with IOMMU enabled

Hi all

I already posted about this problem on xen-users some time ago
(http://markmail.org/message/sbgtyjqh6bzmqx4s), but I couldn't
resolve it with the help I got there, so I'm posting here.

I have a problem with enabling IOMMU on Xen 4.2.1. When I enable it in BIOS
and in grub.conf using the iommu=1 option, my machine cannot boot.
I get the following error on the serial console:

(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at pci_amd_iommu.c:35
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Is it a bug in Xen or maybe a bug in the BIOS?

I run CentOS 6.3 and use dom0 kernel 3.7.1 and Xen from
http://au1.mirror.crc.id.au/repo/el6/x86_64/ repository,
but I also tried with other 3.x and 2.6.32 kernels and different Xen
builds with no luck.

GRUB entry:
 title CentOS Xen kernel IOMMU serial console (3.7.1-3.el6xen.x86_64)
        root (hd0,0)
        kernel /xen.gz dom0_mem=1G,max:1G dom0_max_vcpus=1 dom0_vcpus_pin iommu=verbose loglvl=all guest_loglvl=all iommu=1 com1=38400,8n1 console=com1
        module /vmlinuz-3.7.1-3.el6xen.x86_64 ro root=/dev/mapper/vg_titan_raid5-lv_titan_root rd_LVM_LV=vg_titan_raid5/lv_titan_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_titan_raid5/lv_titan_swap  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM console=hvc0 earlyprintk=xen nomodeset
        module /initramfs-3.7.1-3.el6xen.x86_64.img

Hardware info:
 Motherboard: ASUS M4A89TD PRO USB3 (AMD 890FX chipset, reported to
work with IOMMU on Xen wiki)
 CPU: AMD Phenom II X6 1045T

Software info:
 OS: CentOS 6.3 64bit
 Xen: 4.2.1
 BIOS version: 3029 (up to date, also tried with older versions)

Detailed information:
 Full serial output: http://pastebin.com/raw.php?i=K1DuhDcj
 xl info (when booting with iommu=0): http://pastebin.com/raw.php?i=jU7bEFrN
 lspci -vvv: http://pastebin.com/raw.php?i=3wpKPQT9
 dmidecode: http://pastebin.com/raw.php?i=7wEcTXzr
 kernel config: http://pastebin.com/raw.php?i=zYgGZ84f

Please help
povder

Boris Ostrovsky

2013-Feb-12 03:20 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

----- povder@gmail.com wrote:
> 
> GRUB entry:
>  title CentOS Xen kernel IOMMU serial console (3.7.1-3.el6xen.x86_64)
>         root (hd0,0)
>         kernel /xen.gz dom0_mem=1G,max:1G dom0_max_vcpus=1
> dom0_vcpus_pin iommu=verbose loglvl=all guest_loglvl=all iommu=1
> com1=38400,8n1 console=com1
Try adding the "iommu=debug" option --- it will print more information,
including a dump of the ACPI table that describes the IOMMU.

-boris

povder

2013-Feb-12 06:26 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/12 Boris Ostrovsky <boris.ostrovsky@oracle.com>:
>
> Try adding "iommu=debug" option --- it will print more information including
> dump of the ACPI table that describes IOMMU.
>
Thanks for a quick reply.
Here is the output with iommu=debug: http://pastebin.com/raw.php?i=1wwLw82c

Jan Beulich

2013-Feb-12 10:06 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 12.02.13 at 07:26, povder <povder@gmail.com> wrote:
> 2013/2/12 Boris Ostrovsky <boris.ostrovsky@oracle.com>:
>>
>> Try adding "iommu=debug" option --- it will print more information including
>> dump of the ACPI table that describes IOMMU.
>>
> 
> Thanks for a quick reply.
> Here is the output with iommu=debug: http://pastebin.com/raw.php?i=1wwLw82c
There's no boot failure in that log, so please clarify what this was
generated from.

Also, sadly, the debug output isn't really helpful, which is why
the patch that supposedly fixes your boot problem also adjusts
what gets printed here (for the unlikely case that your problem
persists with that patch:
http://lists.xen.org/archives/html/xen-devel/2013-02/msg00408.html).

Jan

povder

2013-Feb-12 10:55 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/12 Jan Beulich <JBeulich@suse.com>:
>>>> On 12.02.13 at 07:26, povder <povder@gmail.com> wrote:
>> Here is the output with iommu=debug: http://pastebin.com/raw.php?i=1wwLw82c
>
> There's no boot failure in that log, so please clarify what this was
> generated from.
>
I'm not sure I understood you correctly, but to me this log does contain a
boot failure. Maybe I misunderstand the term "boot failure", but as a
non-native English speaker I take it to mean that a machine cannot start.
This log is the full serial output of my machine's startup, which ends
in a failure, so in my opinion it contains a boot failure. Correct me
if I'm wrong.
> Also, sadly, the debug output isn't really helpful, which is why
> the patch that supposedly fixes your boot problem also adjusts
> what gets printed here (for the unlikely case that your problem
> persists with that patch:
> http://lists.xen.org/archives/html/xen-devel/2013-02/msg00408.html).
>
So, to make sure I understand: I should apply the patch from
http://lists.xen.org/archives/html/xen-devel/2013-02/msg00408.html
and that should resolve my problem? Or is there a planned release
of the next Xen version that will already contain this patch? If a release
is planned for the near future, I would prefer to wait for it rather than
apply the patch myself and compile Xen from source.

Jan Beulich

2013-Feb-12 11:03 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 12.02.13 at 11:55, povder <povder@gmail.com> wrote:
> 2013/2/12 Jan Beulich <JBeulich@suse.com>:
>>> On 12.02.13 at 07:26, povder <povder@gmail.com> wrote:
>>> Here is the output with iommu=debug: http://pastebin.com/raw.php?i=1wwLw82c
>>
>> There's no boot failure in that log, so please clarify what this was
>> generated from.
>>
>
> I don't know if I understood you well but for me this log contains a
> boot failure.
> Maybe I misunderstand the term "boot failure" but for me (non-native
> english speaker)
> when a machine cannot start I understand it as a "boot failure".
> This log is a full serial output of start process of my machine
> which ends up with a failure so in my opinion it contains a boot
> failure - correct me
> if I'm wrong.
Oh, I'm sorry, I didn't look through all the way to the end of the
log, as I expected the crash to happen before Dom0 even starts.
>> Also, sadly, the debug output isn't really helpful, which is why
>> the patch that supposedly fixes your boot problem also adjusts
>> what gets printed here (for the unlikely case that your problem
>> persists with that patch:
>> http://lists.xen.org/archives/html/xen-devel/2013-02/msg00408.html).
>>
> 
> So to ensure myself: I should apply a path from
> http://lists.xen.org/archives/html/xen-devel/2013-02/msg00408.html 
> and that should resolve my problem? Or maybe is there is a planned release
> of next Xen version that will already contain this patch? If a release
> is planned
> to be in near future I would prefer to wait for it rather than apply the 
> patch
> myself and compile Xen from sources.
With the above, the patch is unlikely to address your problem,
but will likely provide better debugging output. So please
nevertheless try building with that patch included, assuming
the problem first started after you built Xen from a recent
4.2-testing tree (as opposed to this being plain 4.2.1, in which
case the problem is obviously unrelated to the recent changes
I'm thinking of).

Jan

povder

2013-Feb-12 11:15 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/12 Jan Beulich <JBeulich@suse.com>:
> With the above, the patch is unlikely to address your problem,
> but will likely provide better debugging output. So please
> nevertheless try building with that patch included, assuming
> the problem first started after you built Xen from a recent
> 4.2-testing tree (as opposed to this being plain 4.2.1, in which
> case the problem is obviously unrelated to the recent changes
> I'm thinking of).
>
I haven't built Xen myself; I use binaries from the
http://au1.mirror.crc.id.au/repo/el6/x86_64/ repository, and I guess
that the builds in this repository are from plain 4.2.1.
xl info (when I boot with iommu disabled) shows:
xen_major              : 4
xen_minor              : 2
xen_extra              : .1

I only started using Xen after 4.2.1 had already been released, so I have
had this problem from the beginning. I can try 4.2-testing, though.

Jan Beulich

2013-Feb-12 11:22 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 12.02.13 at 12:15, povder <povder@gmail.com> wrote:
> 2013/2/12 Jan Beulich <JBeulich@suse.com>:
>> With the above, the patch is unlikely to address your problem,
>> but will likely provide better debugging output. So please
>> nevertheless try building with that patch included, assuming
>> the problem first started after you built Xen from a recent
>> 4.2-testing tree (as opposed to this being plain 4.2.1, in which
>> case the problem is obviously unrelated to the recent changes
>> I'm thinking of).
>>
> 
> I haven't built Xen myself, I use binaries from
> http://au1.mirror.crc.id.au/repo/el6/x86_64/ repository and I guess
> that builds in this repository are from plain 4.2.1.
> xl info (when I boot with iommu disabled) shows:
> xen_major              : 4
> xen_minor              : 2
> xen_extra              : .1
> 
> I just started using Xen when 4.2.1 already was released so this
> problem appeared to me from the beginning. I can try with 4.2-testing
> though.
No, there's no point I'm afraid. We really need to analyze the
debugging output to first understand what's missing.

Jan

Jan Beulich

2013-Feb-12 11:29 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 12.02.13 at 12:22, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 12.02.13 at 12:15, povder <povder@gmail.com> wrote:
>> 2013/2/12 Jan Beulich <JBeulich@suse.com>:
>>> With the above, the patch is unlikely to address your problem,
>>> but will likely provide better debugging output. So please
>>> nevertheless try building with that patch included, assuming
>>> the problem first started after you built Xen from a recent
>>> 4.2-testing tree (as opposed to this being plain 4.2.1, in which
>>> case the problem is obviously unrelated to the recent changes
>>> I'm thinking of).
>>>
>> 
>> I haven't built Xen myself, I use binaries from
>> http://au1.mirror.crc.id.au/repo/el6/x86_64/ repository and I guess
>> that builds in this repository are from plain 4.2.1.
>> xl info (when I boot with iommu disabled) shows:
>> xen_major              : 4
>> xen_minor              : 2
>> xen_extra              : .1
>> 
>> I just started using Xen when 4.2.1 already was released so this
>> problem appeared to me from the beginning. I can try with 4.2-testing
>> though.
> 
> No, there's no point I'm afraid. We really need to analyze the
> debugging output to first understand what's missing.
All there is for bus 7 is

(XEN) AMD-Vi: IVHD Device Entry:
(XEN) AMD-Vi:  Type 0x2
(XEN) AMD-Vi:  Dev_Id 0x700
(XEN) AMD-Vi:  Flags 0x0

i.e. a single device at 07:00.0, yet from the register dump at the
crash it's fairly clear that we're talking about 07:00.1 here. I'm
afraid only a firmware update can help you here (or passing
"iommu=off" to Xen); in particular I can't see how we could work
around that problem in software.

Jan
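
Editorial note for readers following along: the device IDs in these dumps use the standard PCI packing of bus, device and function into 16 bits, so Dev_Id 0x700 is 07:00.0, while a value of 0x701 (as seen in rax/rbx in the register dumps posted in this thread) is 07:00.1. Below is a minimal sketch of that decoding. The helper names are purely illustrative and are not taken from Xen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard PCI packing: bus in bits 15..8, device in bits 7..3,
 * function in bits 2..0. Helper names are illustrative only. */
unsigned bdf_bus(uint16_t bdf)  { return bdf >> 8; }
unsigned bdf_dev(uint16_t bdf)  { return (bdf >> 3) & 0x1f; }
unsigned bdf_func(uint16_t bdf) { return bdf & 0x7; }

/* Write the usual "bb:dd.f" notation into buf; needs room for 8 bytes. */
void bdf_format(uint16_t bdf, char *buf, size_t len)
{
    assert(buf != NULL && len >= 8);
    snprintf(buf, len, "%02x:%02x.%x",
             bdf_bus(bdf), bdf_dev(bdf), bdf_func(bdf));
}
```

Calling bdf_format(0x701, buf, sizeof buf) fills buf with "07:00.1". Note that 00:07.0 packs to 0x0038, a different device from 07:00.0 (0x0700).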

Boris Ostrovsky

2013-Feb-12 15:20 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

On 02/12/2013 06:29 AM, Jan Beulich wrote:
>>>> On 12.02.13 at 12:22, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>>> On 12.02.13 at 12:15, povder <povder@gmail.com> wrote:
>>> 2013/2/12 Jan Beulich <JBeulich@suse.com>:
>>>> With the above, the patch is unlikely to address your problem,
>>>> but will likely provide better debugging output. So please
>>>> nevertheless try building with that patch included, assuming
>>>> the problem first started after you built Xen from a recent
>>>> 4.2-testing tree (as opposed to this being plain 4.2.1, in which
>>>> case the problem is obviously unrelated to the recent changes
>>>> I'm thinking of).
>>>>
>>> I haven't built Xen myself, I use binaries from
>>> http://au1.mirror.crc.id.au/repo/el6/x86_64/ repository and I guess
>>> that builds in this repository are from plain 4.2.1.
>>> xl info (when I boot with iommu disabled) shows:
>>> xen_major              : 4
>>> xen_minor              : 2
>>> xen_extra              : .1
>>>
>>> I just started using Xen when 4.2.1 already was released so this
>>> problem appeared to me from the beginning. I can try with 4.2-testing
>>> though.
>> No, there's no point I'm afraid. We really need to analyze the
>> debugging output to first understand what's missing.
> All there is for bus 7 is
>
> (XEN) AMD-Vi: IVHD Device Entry:
> (XEN) AMD-Vi:  Type 0x2
> (XEN) AMD-Vi:  Dev_Id 0x700
> (XEN) AMD-Vi:  Flags 0x0
>
> i.e. a single device at 07:00.0, yet from the register dump at the
> crash it's fairly clear that we're talking about 07:00.1 here. I'm
> afraid only a firmware update can help you here (or passing
> "iommu=off" to Xen); in particular I can't see how we could work
> around that problem in software.
I don't see any devices on bus 7 in lspci output
(http://pastebin.com/raw.php?i=3wpKPQT9 from original report).

However the log shows

pci 0000:07:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
..
(XEN) PCI add device 0000:07:00.0



-boris

povder

2013-Feb-12 15:50 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/12 Boris Ostrovsky <boris.ostrovsky@oracle.com>:
>
> I don't see any devices on bus 7 in lspci output
> (http://pastebin.com/raw.php?i=3wpKPQT9 from original report).
>
> However the log shows
>
> pci 0000:07:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it
> with 'pcie_aspm=force'
> ..
> (XEN) PCI add device 0000:07:00.0
>
There is device 00:07.0 in lspci output from original report:
00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to
PCI bridge (PCI express gpp port G) (prog-if 00 [Normal decode])

Jan Beulich

2013-Feb-12 16:04 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 12.02.13 at 16:50, povder <povder@gmail.com> wrote:
> 2013/2/12 Boris Ostrovsky <boris.ostrovsky@oracle.com>:
>>
>> I don't see any devices on bus 7 in lspci output
>> (http://pastebin.com/raw.php?i=3wpKPQT9 from original report).
>>
>> However the log shows
>>
>> pci 0000:07:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it
>> with 'pcie_aspm=force'
>> ..
>> (XEN) PCI add device 0000:07:00.0
>>
> 
> There is device 00:07.0 in lspci output from original report:
> 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to
> PCI bridge (PCI express gpp port G) (prog-if 00 [Normal decode])
But we're seeing a device reported as 07:00.0; we don't care
about the one at 00:07.0.

You ought to explain where this device comes from, or why your
lspci output doesn't show it. Perhaps handing us a native kernel
boot log (at maximum log level) might already help...

Jan

povder

2013-Feb-12 17:44 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/12 Jan Beulich <JBeulich@suse.com>:
>>>> On 12.02.13 at 16:50, povder <povder@gmail.com> wrote:
>> 2013/2/12 Boris Ostrovsky <boris.ostrovsky@oracle.com>:
>>>
>>> I don't see any devices on bus 7 in lspci output
>>> (http://pastebin.com/raw.php?i=3wpKPQT9 from original report).
>>>
>>> However the log shows
>>>
>>> pci 0000:07:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it
>>> with 'pcie_aspm=force'
>>> ..
>>> (XEN) PCI add device 0000:07:00.0
>>>
>>
>> There is device 00:07.0 in lspci output from original report:
>> 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to
>> PCI bridge (PCI express gpp port G) (prog-if 00 [Normal decode])
>
> But we're seeing a device reported as 07:00.0; we don't care
> about the one at 00:07.0.
>
> You ought to explain where this device comes from, or why your
> lspci output doesn't show it. Perhaps handing us a native kernel
> boot log (at maximum log level) might already help...
>
> Jan
>
Sorry, I mistook 00:07.0 for 07:00.0. I'll post some more info soon.

povder

2013-Feb-12 18:23 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/12 Jan Beulich <JBeulich@suse.com>:
> You ought to explain where this device comes from, or why your
> lspci output doesn't show it. Perhaps handing us a native kernel
> boot log (at maximum log level) might already help...
>
The original lspci -vvv output I posted was from a time when I had
Firewire disabled in the BIOS; I guess the settings were reset since then
because I was trying different BIOS versions. With Firewire enabled,
06:00.0 is the Firewire controller instead of the SATA controller,
07:00.0 is the SATA controller and 07:00.1 is the IDE interface.

So I guess boot always fails on the 07:00.1 (or 06:00.1) IDE interface:
JMicron Technology Corp. JMB361 AHCI/IDE (rev 02) (prog-if 85 [Master
SecO PriO]).

If I disable Firewire, boot fails with:
(XEN) PCI add device 0000:00:18.4
(XEN) PCI add device 0000:06:00.0
(XEN) Xen BUG at pci_amd_iommu.c:35
(XEN) ----[ Xen-4.2.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c48014afd2>] find_iommu_for_device+0x32/0x40
(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
(XEN) rax: 0000000000000601   rbx: 0000000000000601   rcx: ffff83042c980010

If I enable Firewire, boot fails with:
(XEN) PCI add device 0000:00:18.4
(XEN) PCI add device 0000:07:00.0
(XEN) Xen BUG at pci_amd_iommu.c:35
(XEN) ----[ Xen-4.2.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c48014afd2>] find_iommu_for_device+0x32/0x40
(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
(XEN) rax: 0000000000000701   rbx: 0000000000000701   rcx: ffff83042c980010

full lspci -vvv with firewire enabled (with IDE interface as 07:00.1):
http://pastebin.com/raw.php?i=V7YqxNYD
boot log with firewire enabled (posted earlier):
http://pastebin.com/raw.php?i=1wwLw82c

full lspci -vvv with firewire disabled (posted earlier, with IDE
interface as 06:00.1): http://pastebin.com/raw.php?i=3wpKPQT9
boot log with firewire disabled: http://pastebin.com/raw.php?i=LhaN4XeK

povder

2013-Feb-12 18:40 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

I disabled in the BIOS the IDE interface that was causing problems, and the
system boots fine. Thanks for your help!

I just wonder if it's a bug in the BIOS or in Xen. If it's an ASUS bug, I
would like to report it to them.

Boris Ostrovsky

2013-Feb-13 01:33 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

----- povder@gmail.com wrote:
> I disabled id BIOS IDE interface that was causing problems and system
> boots fine. Thanks for your help!
> 
> I just wonder if it's a bug in BIOS or in Xen. If it's ASUS bug I
> would like to report bug to them.

This looks like a BIOS bug -- there is no entry for the IDE interface in the IVRS
table (which is used by the IOMMU driver to discover devices).

I am wondering whether such cases (undeclared devices in IVRS) should cause a
panic or disabling of the IOMMU. This may be a more generic case of Jan's earlier
patch for dealing with a missing IOAPIC. Not sure whether it would be possible
to "unwind" the IOMMU at this point though.

(For the record, I asked povder to run with a xen-unstable build that I provided
to him because for some reason I thought this might be a combined mode problem.
Obviously this had nothing to do with combined mode.)

-boris

Jan Beulich

2013-Feb-13 08:05 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 12.02.13 at 19:40, povder <povder@gmail.com> wrote:
> I disabled id BIOS IDE interface that was causing problems and system
> boots fine. Thanks for your help!
> 
> I just wonder if it's a bug in BIOS or in Xen. If it's ASUS bug I
> would like to report bug to them.
Quite obviously it's a BIOS bug, failing to cover all devices in the
IVRS table. Whether we can do any better than crashing in that
case is a different question - I wonder how native Linux with
IOMMU enabled does in that situation...

Jan

Jan Beulich

2013-Feb-13 08:24 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 13.02.13 at 02:33, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> I am wondering whether such cases (undeclared devices in IVRS) should cause
> a panic or disabling of IOMMU. This may be a more generic case of Jan's earlier
> patch for dealing with missing IOAPIC. Not sure whether it would be possible
> to "unwind" IOMMU at this point though.
I doubt it - this would likely cause further problems down the road.

Instead, with us doing a bus scan anyway (as of 4.1), we could
detect the problem much earlier (and only bug on devices that we
don't find but Dom0 does - in particular, I'm wondering how SR-IOV
VFs would get dealt with here).

Furthermore I wonder whether in the single IOMMU case we
couldn't deal with this on the same basis as is being done (sort of
unintentionally) for the IO-APIC case: The single IOMMU in the
system _must_ be the one responsible for any device not reported
by the firmware. That would deal with povder's case, and
experience tells us that more complex (read: expensive) systems
tend to have fewer ACPI table flaws (i.e. wouldn't suffer from not
being covered by such a workaround).

Jan

povder

2013-Feb-13 08:28 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

2013/2/13 Jan Beulich <JBeulich@suse.com>:
> I wonder how native Linux with
> IOMMU enabled does in that situation...
>
I can try it today if you want. What kernel option should I use to
enable the IOMMU? "iommu=force"?

Jan Beulich

2013-Feb-13 08:37 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 13.02.13 at 09:28, povder <povder@gmail.com> wrote:
> 2013/2/13 Jan Beulich <JBeulich@suse.com>:
>> I wonder how native Linux with
>> IOMMU enabled does in that situation...
>>
> 
> I can try it today if you want. What kernel option should i use to
> enable iommu? "iommu=force"?
Looks like, unlike Intel's, AMD's IOMMU gets turned on by
default independent of any configuration settings. So providing
a native kernel boot log (at maximum log level) ought to suffice
(assuming of course you don't have any command line options
in place to _disable_ the IOMMU).

Jan

povder

2013-Feb-13 18:21 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>> 2013/2/13 Jan Beulich <JBeulich@suse.com>:
>>> I wonder how native Linux with
>>> IOMMU enabled does in that situation...
>>>
Here is the full boot log of the latest CentOS stable kernel:
http://pastebin.com/raw.php?i=RnrMFXqf
I had to set amd_iommu=on and amd_iommu_dump (to dump the ACPI table),
both undocumented kernel options.

Interesting part in my opinion:
calling  pci_iommu_init+0x0/0x21 @ 1
AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
AMD-Vi:        mmio-addr: 00000000f6000000
AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END		 devid: 00:00.2
AMD-Vi:   DEV_SELECT			 devid: 00:04.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 06:00.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:06.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 05:00.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:07.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 04:00.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:0b.0 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 03:00.0 flags: 00
AMD-Vi:   DEV_RANGE_END		 devid: 03:00.1
AMD-Vi:   DEV_SELECT			 devid: 00:0d.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 02:00.0 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:11.0 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:12.0 flags: 00
AMD-Vi:   DEV_RANGE_END		 devid: 00:12.2
AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:13.0 flags: 00
AMD-Vi:   DEV_RANGE_END		 devid: 00:13.2
AMD-Vi:   DEV_SELECT			 devid: 00:14.0 flags: d7
AMD-Vi:   DEV_SELECT			 devid: 00:14.1 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:14.2 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:14.3 flags: 00
AMD-Vi:   DEV_SELECT			 devid: 00:14.4 flags: 00
AMD-Vi:   DEV_ALIAS_RANGE		 devid: 01:00.0 flags: 00 devid_to: 00:14.4
AMD-Vi:   DEV_RANGE_END		 devid: 01:1f.7
AMD-Vi:   DEV_SELECT			 devid: 00:14.5 flags: 00
AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:16.0 flags: 00
AMD-Vi:   DEV_RANGE_END		 devid: 00:16.2
  alloc irq_desc for 55 on node 0
  alloc kstat_irqs on node 0
IOAPIC[1]: Set routing entry (7-31 -> 0x79 -> IRQ 55 Mode:1 Active:1)
pci 0000:00:00.2: PCI INT A -> GSI 55 (level, low) -> IRQ 55
  alloc irq_desc for 56 on node 0
  alloc kstat_irqs on node 0
pci 0000:00:00.2: irq 56 for MSI/MSI-X
AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
AMD-Vi: Initialized for Passthrough Mode
AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
initcall pci_iommu_init+0x0/0x21 returned 0 after 569797 usecs

I don't see the 06:00.1 device, on which Xen crashes, anywhere in the
IOMMU enabling process.

lspci output: http://pastebin.com/raw.php?i=3wpKPQT9
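
Editorial note: the dump above can be checked mechanically. The following is a rough sketch, not kernel or Xen code; the names BDF, ONE, ivrs_ranges, ivrs_covers and ivrs_highest_bdf are made up for illustration. It transcribes the listed entries as inclusive ranges and tests whether a given device ID is covered. Single DEV_SELECT lines are treated as one-element ranges, and the DEV_ALIAS_RANGE line is treated as an ordinary range.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack bus/device/function into the 16-bit device ID. */
#define BDF(b, d, f) ((uint16_t)(((b) << 8) | ((d) << 3) | (f)))
/* A single DEV_SELECT entry, expressed as a one-element range. */
#define ONE(b, d, f) { BDF(b, d, f), BDF(b, d, f) }

struct bdf_range { uint16_t first, last; };   /* both ends inclusive */

/* Transcribed by hand from the AMD-Vi dump in this post. */
const struct bdf_range ivrs_ranges[] = {
    { BDF(0x00, 0x00, 0), BDF(0x00, 0x00, 2) },
    ONE(0x00, 0x04, 0), ONE(0x06, 0x00, 0),
    ONE(0x00, 0x06, 0), ONE(0x05, 0x00, 0),
    ONE(0x00, 0x07, 0), ONE(0x04, 0x00, 0),
    ONE(0x00, 0x0b, 0),
    { BDF(0x03, 0x00, 0), BDF(0x03, 0x00, 1) },
    ONE(0x00, 0x0d, 0), ONE(0x02, 0x00, 0),
    ONE(0x00, 0x11, 0),
    { BDF(0x00, 0x12, 0), BDF(0x00, 0x12, 2) },
    { BDF(0x00, 0x13, 0), BDF(0x00, 0x13, 2) },
    ONE(0x00, 0x14, 0), ONE(0x00, 0x14, 1), ONE(0x00, 0x14, 2),
    ONE(0x00, 0x14, 3), ONE(0x00, 0x14, 4),
    { BDF(0x01, 0x00, 0), BDF(0x01, 0x1f, 7) },
    ONE(0x00, 0x14, 5),
    { BDF(0x00, 0x16, 0), BDF(0x00, 0x16, 2) },
};
const size_t ivrs_range_count = sizeof(ivrs_ranges) / sizeof(ivrs_ranges[0]);

/* Return 1 if some listed entry covers bdf, 0 otherwise. */
int ivrs_covers(uint16_t bdf)
{
    for (size_t i = 0; i < ivrs_range_count; i++)
        if (bdf >= ivrs_ranges[i].first && bdf <= ivrs_ranges[i].last)
            return 1;
    return 0;
}

/* Highest device ID mentioned by any entry. */
uint16_t ivrs_highest_bdf(void)
{
    uint16_t max = 0;
    for (size_t i = 0; i < ivrs_range_count; i++)
        if (ivrs_ranges[i].last > max)
            max = ivrs_ranges[i].last;
    return max;
}
```

With this table, 06:00.0 is covered but ivrs_covers(BDF(0x06, 0x00, 1)) is false, and the highest listed ID is 06:00.0 (0x600), so 0x601 lies just past the end of what the firmware describes.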

Jan Beulich

2013-Feb-14 11:03 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 13.02.13 at 19:21, povder <povder@gmail.com> wrote:
>> > 2013/2/13 Jan Beulich <JBeulich@suse.com>:
>>>> I wonder how native Linux with
>>>> IOMMU enabled does in that situation...
>>>>
> 
> Here is full boot log of latest centos stable kernel:
> http://pastebin.com/raw.php?i=RnrMFXqf 
> I had to set amd_iommu=on and amd_iommu_dump (to dump ACPI table) -
> undocumented kernel options.
> 
> Interesting part in my opinion:
> calling  pci_iommu_init+0x0/0x21 @ 1
> AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> AMD-Vi:        mmio-addr: 00000000f6000000
> AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:00.0 flags: 00
> AMD-Vi:   DEV_RANGE_END		 devid: 00:00.2
> AMD-Vi:   DEV_SELECT			 devid: 00:04.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 06:00.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:06.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 05:00.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:07.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 04:00.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:0b.0 flags: 00
> AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 03:00.0 flags: 00
> AMD-Vi:   DEV_RANGE_END		 devid: 03:00.1
> AMD-Vi:   DEV_SELECT			 devid: 00:0d.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 02:00.0 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:11.0 flags: 00
> AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:12.0 flags: 00
> AMD-Vi:   DEV_RANGE_END		 devid: 00:12.2
> AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:13.0 flags: 00
> AMD-Vi:   DEV_RANGE_END		 devid: 00:13.2
> AMD-Vi:   DEV_SELECT			 devid: 00:14.0 flags: d7
> AMD-Vi:   DEV_SELECT			 devid: 00:14.1 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:14.2 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:14.3 flags: 00
> AMD-Vi:   DEV_SELECT			 devid: 00:14.4 flags: 00
> AMD-Vi:   DEV_ALIAS_RANGE		 devid: 01:00.0 flags: 00 devid_to: 00:14.4
> AMD-Vi:   DEV_RANGE_END		 devid: 01:1f.7
> AMD-Vi:   DEV_SELECT			 devid: 00:14.5 flags: 00
> AMD-Vi:   DEV_SELECT_RANGE_START	 devid: 00:16.0 flags: 00
> AMD-Vi:   DEV_RANGE_END		 devid: 00:16.2
>   alloc irq_desc for 55 on node 0
>   alloc kstat_irqs on node 0
> IOAPIC[1]: Set routing entry (7-31 -> 0x79 -> IRQ 55 Mode:1 Active:1)
> pci 0000:00:00.2: PCI INT A -> GSI 55 (level, low) -> IRQ 55
>   alloc irq_desc for 56 on node 0
>   alloc kstat_irqs on node 0
> pci 0000:00:00.2: irq 56 for MSI/MSI-X
> AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
> AMD-Vi: Initialized for Passthrough Mode
> AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
> initcall pci_iommu_init+0x0/0x21 returned 0 after 569797 usecs
> 
> I don't see 06:00.1 device in IOMMU enabling process on which Xen crashes.
So the problem appears to be that this device has BDF higher than
any known one. Could you therefore try whether the patch below
allows the system to come up?

Jan

--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -32,8 +32,8 @@ struct amd_iommu *find_iommu_for_device(
 {
     struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(seg);
 
-    BUG_ON ( bdf >= ivrs_bdf_entries );
-    return ivrs_mappings ? ivrs_mappings[bdf].iommu : NULL;
+    return ivrs_mappings && bdf < ivrs_bdf_entries ? ivrs_mappings[bdf].iommu
+                                                   : NULL;
 }
 
 /*
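
Editorial note: to make the effect of the change concrete, here is a small standalone model of the lookup. It is only a sketch under assumptions: struct iommu_stub, struct mapping_stub and lookup_iommu are hypothetical stand-ins rather than Xen's real types, and the table size is passed in explicitly instead of coming from a global such as ivrs_bdf_entries. The point it illustrates is that an out-of-range device ID yields NULL rather than tripping a BUG_ON.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hypervisor's structures. */
struct iommu_stub   { int id; };
struct mapping_stub { struct iommu_stub *iommu; };

/*
 * Return the IOMMU recorded for bdf, or NULL when there is no table
 * or when bdf lies beyond the last entry the table describes.
 */
struct iommu_stub *lookup_iommu(const struct mapping_stub *map,
                                size_t entries, uint16_t bdf)
{
    if (map == NULL || bdf >= entries)
        return NULL;
    return map[bdf].iommu;
}
```

For a table with 0x601 entries, a lookup of 0x600 returns whatever is stored there, while 0x601 returns NULL. What a caller should then do with a NULL result is a separate question that this sketch does not address.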

Jan Beulich

2013-Feb-14 11:29 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 13.02.13 at 19:21, povder <povder@gmail.com> wrote:
> I don't see 06:00.1 device in IOMMU enabling process on which Xen crashes.
> 
> lspci output: http://pastebin.com/raw.php?i=3wpKPQT9 
This is really odd: The "iommu=debug" output you made available
shows that while there are further devices that have no associated
IOMMU, the bus scan done in the hypervisor didn't even find a
device at 06:00.1. Which I see possible only in two ways: Either
the device becomes visible on the bus only when the driver for
06:00.0 loads (and is otherwise detectable only by other means,
e.g. ACPI), or 06:00.0 doesn't have the multi function device flag
properly set. That latter aspect could be checked by looking at
the raw (hex) config space dump of 06:00.0.

Boris, one other thought I had in this context: Is it really possible
for functions on the same (non-bridge) device to be serviced
by different IOMMUs? If not, find_iommu_for_device() could simply
look for function 0 if nothing is known about the passed in function.

Jan
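
Editorial note: the multi function device flag mentioned in this message is bit 7 of the Header Type register, which sits at offset 0x0e of PCI configuration space. Given a raw config-space dump (for example the bytes printed by lspci -x), the check could look roughly like this. The macro and function names are illustrative.

```c
#include <stdint.h>

#define CFG_HEADER_TYPE      0x0e   /* offset of the Header Type register */
#define CFG_HEADER_TYPE_MFD  0x80   /* bit 7: multi-function device */

/*
 * cfg points at the start of a device's configuration space
 * (at least the first 16 bytes must be present).
 * Returns 1 if function 0 advertises further functions, else 0.
 */
int cfg_is_multifunction(const uint8_t *cfg)
{
    return (cfg[CFG_HEADER_TYPE] & CFG_HEADER_TYPE_MFD) != 0;
}
```

A bus scan that honours this bit probes functions 1 to 7 only when it is set on function 0, which is why a device with the flag clear can hide its other functions from such a scan.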

Boris Ostrovsky

2013-Feb-14 14:55 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

On 02/14/2013 06:29 AM, Jan Beulich wrote:
>>>> On 13.02.13 at 19:21, povder <povder@gmail.com> wrote:
>> I don't see 06:00.1 device in IOMMU enabling process on which Xen crashes.
>>
>> lspci output: http://pastebin.com/raw.php?i=3wpKPQT9
> This is really odd: The "iommu=debug" output you made available
> shows that while there are further devices that have no associated
> IOMMU, the bus scan done in the hypervisor didn't even find a
> device at 06:00.1. Which I see possible only in two ways: Either
> the device becomes visible on the bus only when the driver for
> 06:00.0 loads (and is otherwise detectable only by other means,
> e.g. ACPI), or 06:00.0 doesn't have the multi function device flag
> properly set. That latter aspect could be checked by looking at
> the raw (hex) config space dump of 06:00.0.
If I read this correctly, Linux enables multi-functionness (?):

http://lxr.linux.no/#linux+v3.7.7/drivers/pci/quirks.c#L1494

So you are probably right. BIOS does not enumerate 06:00.1 in IVRS because
it doesn't see it enabled yet.
>
> Boris, one other thought I had in this context: Is it really possible
> for functions on the same (non-bridge) device to be serviced
> by different IOMMUs?
I can't see how this may be possible: the IOMMU is part of the PCIe root
complex and any downstream device can only send transactions through its
root. (I hope I am using the right terminology.)
> If not, find_iommu_for_device() could simply
> look for function 0 if nothing is known about the passed in function.
Yes, this could work. But with a warning in the log.

-boris
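
Editorial note: the fallback discussed in this exchange could look something like the following. This is a hedged sketch with hypothetical names (iommu_stub, mapping_stub, lookup_iommu_fn0), not the patch that was eventually written. When a non-zero function has no entry of its own, it reuses the entry for function 0 of the same device and logs a warning.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the hypervisor's structures. */
struct iommu_stub   { int id; };
struct mapping_stub { struct iommu_stub *iommu; };

struct iommu_stub *lookup_iommu_fn0(const struct mapping_stub *map,
                                    size_t entries, uint16_t bdf)
{
    /* Function 0 of the same bus/device: clear the low three bits. */
    uint16_t fn0 = (uint16_t)(bdf & 0xfff8);

    if (map == NULL)
        return NULL;

    /* Normal case: the firmware described this exact function. */
    if (bdf < entries && map[bdf].iommu != NULL)
        return map[bdf].iommu;

    /* Fallback: borrow function 0's IOMMU, but say so in the log. */
    if (fn0 != bdf && fn0 < entries && map[fn0].iommu != NULL) {
        fprintf(stderr,
                "warning: no entry for %02x:%02x.%x, using function 0\n",
                (unsigned)(bdf >> 8), (unsigned)((bdf >> 3) & 0x1f),
                (unsigned)(bdf & 0x7));
        return map[fn0].iommu;
    }

    return NULL;
}
```

In the case from this thread, a table that ends at 06:00.0 (0x600) would resolve a lookup of 06:00.1 (0x601) to the IOMMU recorded for 06:00.0, with a warning, instead of failing.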

Jan Beulich

2013-Feb-15 08:21 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

>>> On 14.02.13 at 15:55, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> On 02/14/2013 06:29 AM, Jan Beulich wrote:
>>>>> On 13.02.13 at 19:21, povder <povder@gmail.com> wrote:
>>> I don't see 06:00.1 device in IOMMU enabling process on which Xen crashes.
>>>
>>> lspci output: http://pastebin.com/raw.php?i=3wpKPQT9
>> This is really odd: The "iommu=debug" output you made available
>> shows that while there are further devices that have no associated
>> IOMMU, the bus scan done in the hypervisor didn't even find a
>> device at 06:00.1. Which I see possible only in two ways: Either
>> the device becomes visible on the bus only when the driver for
>> 06:00.0 loads (and is otherwise detectable only by other means,
>> e.g. ACPI), or 06:00.0 doesn't have the multi function device flag
>> properly set. That latter aspect could be checked by looking at
>> the raw (hex) config space dump of 06:00.0.
> 
> If I read this correctly, Linux enables multi-functionness (?):
> 
> http://lxr.linux.no/#linux+v3.7.7/drivers/pci/quirks.c#L1494 
> 
> So you are probably right. BIOS does not enumerate 06:00.1 in IVRS because
> it doesn't see it enabled yet.
Indeed, and it has been that way since 2.6.18 (i.e. virtually
forever; commit 15e0c694367332d7e7114c7c73044bc5fed9ee48).

I've got a patch mostly ready to deal with non-zero functions
when we at least know something about function zero.

But I don't think we can easily deal with the single IOMMU
case, making that IOMMU cover all devices, as we would
still need to figure out the requestor ID for each device. That
requires looking at the PCI bus topology iiuc, and while we
have the necessary logic for VT-d, it seems not really
straightforward (mainly because risky) to make use of this in the
AMD-Vi code too.

Jan

Boris Ostrovsky

2013-Feb-15 15:21 UTC

Re: Xen 4.2.1 boot failure with IOMMU enabled

On 02/15/2013 03:21 AM, Jan Beulich wrote:
>
> But I don't think we can easily deal with the single IOMMU
> case, making that IOMMU cover all devices, as we would
> still need to figure out the requestor ID for each device. That
> requires looking at the PCI bus topology iiuc, and while we
> have the necessary logic for VT-d, it seems not really strait
> forward (mainly because risky) to make use of this in the AMD Vi
> code too.
Scanning PCI for devices would effectively mean that we are ignoring 
IVHD (device portion of IVRS). That would be somewhat unfortunate (but 
maybe unavoidable, given the state of BIOSes).


-boris
