Hi people,

I ran into trouble when trying to run Xen 4.1.x with a pv_ops kernel on an IBM eServer x3400. Without the noacpi command line option, the dom0 kernel crashes during ACPI initialization. Kernels 2.6.32 (with Konrad's Xen patches), 3.1 and 3.3, as well as Xen 4.1.2, behave the same way. Without the hypervisor all kernels run without any problems.

There are several patches (available at http://git.altlinux.org/people/silicium/packages/?p=kernel-image.git;a=shortlog;h=refs/heads/kernel-image-xen-dom0, commits f1f91babc9cc9402ccee8938b888240f5bae1574, 72db7b5e069002765ced64f030c1117d72352331, adc4c567a08d4d2377060179639cbfdc3d9d8e0e and 9beeac5589ce5c415e4513016c78e96374b8a895) that allow kernel 2.6.32 to boot on this hardware. These patches were written by the kernel package maintainer of the ALT Linux distribution. A discussion of this issue is available on the Sysadmins ALT Linux mailing list (http://lists.altlinux.org/pipermail/sysadmins/2011-April/034471.html and follow-ups), in Russian.

I attached the crash messages from Xen 4.1.0 with kernel 2.6.32. If you need these messages for more recent versions of Xen and the kernel, or more information about the hardware, please let me know.

Ideally, I would like to run current versions of Xen and Linux kernels on this hardware. Can you help me please?

--
WBR, Alex Moskalenko

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On Thu, Jun 07, 2012 at 01:29:33PM +0400, Alex Moskalenko wrote:
> Hi people,
>
> I ran into a trouble when trying to run Xen 4.1.x with pv_ops kernel
> on IBM eServer x3400. Without noacpi command line option dom0 kernel

Well yes! We don't do 'noacpi'. Why do you supply 'noacpi'?

> crashes on ACPI initialization. Kernels 2.6.32 (with konrad xen
> patches), 3.1, 3.3, Xen 4.1.2 behave the same way. Without
> hypervisor all kernels run without any problems.
>
> There are several patches (available on http://git.altlinux.org/people/silicium/packages/?p=kernel-image.git;a=shortlog;h=refs/heads/kernel-image-xen-dom0,
> commits f1f91babc9cc9402ccee8938b888240f5bae1574,
> 72db7b5e069002765ced64f030c1117d72352331,
> adc4c567a08d4d2377060179639cbfdc3d9d8e0e and
> 9beeac5589ce5c415e4513016c78e96374b8a895) that allow kernel 2.6.32
> to boot on this hardware. This patches written by kernel package
> maintainer of ALT Linux distribution. Discussion of this issue is
> available on Sysadmins ALTLinux mailing list
> (http://lists.altlinux.org/pipermail/sysadmins/2011-April/034471.html
> and follows) in Russian.
>
> I attached Xen 4.1.0 with kernel 2.6.32 crash messages. If you need
> this messages for more recent versions of Xen and kernel, or more
> information about hardware, please let me know.

Please provide the crash using the v3.4 kernel.

>
> Ideally, I would like to run current versions of Xen and Linux
> kernels on this hardware. Can you help me please?

Well sure. But please explain to me:
 - why you are using 'noacpi'
 - why you are not using the v3.4 kernel
07.06.2012 21:24, Konrad Rzeszutek Wilk wrote:
> On Thu, Jun 07, 2012 at 01:29:33PM +0400, Alex Moskalenko wrote:
>> Hi people,
>>
>> I ran into a trouble when trying to run Xen 4.1.x with pv_ops kernel
>> on IBM eServer x3400. Without noacpi command line option dom0 kernel
> Well yes! We don't do 'noacpi' Why do you supply 'noacpi'?

If I don't supply the 'noacpi' option, the dom0 kernel crashes with the messages mentioned in my previous letter. 'noacpi' allows the kernel to boot, but with some IRQ issues (only the RAID and network adapters work properly; the other onboard PCI devices do not work at all).

>> crashes on ACPI initialization. Kernels 2.6.32 (with konrad xen
>> patches), 3.1, 3.3, Xen 4.1.2 behave the same way. Without
>> hypervisor all kernels run without any problems.
>>
>> There are several patches (available on http://git.altlinux.org/people/silicium/packages/?p=kernel-image.git;a=shortlog;h=refs/heads/kernel-image-xen-dom0,
>> commits f1f91babc9cc9402ccee8938b888240f5bae1574,
>> 72db7b5e069002765ced64f030c1117d72352331,
>> adc4c567a08d4d2377060179639cbfdc3d9d8e0e and
>> 9beeac5589ce5c415e4513016c78e96374b8a895) that allow kernel 2.6.32
>> to boot on this hardware. This patches written by kernel package
>> maintainer of ALT Linux distribution. Discussion of this issue is
>> available on Sysadmins ALTLinux mailing list
>> (http://lists.altlinux.org/pipermail/sysadmins/2011-April/034471.html
>> and follows) in Russian.
>>
>> I attached Xen 4.1.0 with kernel 2.6.32 crash messages. If you need
>> this messages for more recent versions of Xen and kernel, or more
>> information about hardware, please let me know.
> Please provide the crash using the v3.4 kernel.

I will provide it as soon as possible.

>> Ideally, I would like to run current versions of Xen and Linux
>> kernels on this hardware. Can you help me please?
> Well sure. But pls explain to me why:
> - you are using 'noacpi'
> - why are not using the v3.4 kernel?

- Only with 'noacpi' does the kernel not crash. Without it the kernel crashes with the logs I previously attached.
- I first dealt with this issue in April 2011, when the 3.x kernel was not yet ready to work as a dom0 kernel. Periodically I test new kernel versions (3.1, 3.2, 3.3 including 3.3.8), but all of them also crash during ACPI initialization. I will try the 3.4 kernel as soon as possible.

Thank you for the reply!
On Thu, Jun 07, 2012 at 11:00:00PM +0400, Alex Moskalenko wrote:
> 07.06.2012 21:24, Konrad Rzeszutek Wilk wrote:
> >On Thu, Jun 07, 2012 at 01:29:33PM +0400, Alex Moskalenko wrote:
> >>Hi people,
> >>
> >>I ran into a trouble when trying to run Xen 4.1.x with pv_ops kernel
> >>on IBM eServer x3400. Without noacpi command line option dom0 kernel
> >Well yes! We don't do 'noacpi' Why do you supply 'noacpi'?
> If I don't supply 'noacpi' option, dom0 kernel crashes with messages
> mentioned in my previous letter. 'noacpi' allows kernel to boot with
> some IRQ issues (only RAID and network adapters work properly, any
> other onboard PCI device does not work at all).
>
> >>crashes on ACPI initialization. Kernels 2.6.32 (with konrad xen
> >>patches), 3.1, 3.3, Xen 4.1.2 behave the same way. Without
> >>hypervisor all kernels run without any problems.
> >>
> >>There are several patches (available on http://git.altlinux.org/people/silicium/packages/?p=kernel-image.git;a=shortlog;h=refs/heads/kernel-image-xen-dom0,
> >>commits f1f91babc9cc9402ccee8938b888240f5bae1574,
> >>72db7b5e069002765ced64f030c1117d72352331,
> >>adc4c567a08d4d2377060179639cbfdc3d9d8e0e and
> >>9beeac5589ce5c415e4513016c78e96374b8a895) that allow kernel 2.6.32
> >>to boot on this hardware. This patches written by kernel package
> >>maintainer of ALT Linux distribution. Discussion of this issue is
> >>available on Sysadmins ALTLinux mailing list
> >>(http://lists.altlinux.org/pipermail/sysadmins/2011-April/034471.html
> >>and follows) in Russian.
> >>
> >>I attached Xen 4.1.0 with kernel 2.6.32 crash messages. If you need
> >>this messages for more recent versions of Xen and kernel, or more
> >>information about hardware, please let me know.
> >Please provide the crash using the v3.4 kernel.
> I will provide it as soon as possible.
>

When posting the log with Linux 3.4.x, please don't compress it with bzip2; that makes it difficult to read and comment on it on the mailing list. It isn't that big really.

> >>Ideally, I would like to run current versions of Xen and Linux
> >>kernels on this hardware. Can you help me please?
> >Well sure. But pls explain to me why:
> > - you are using 'noacpi'
> > - why are not using the v3.4 kernel?
> - Only with 'noacpi' kernel does not crash. Without it kernel
> crashes with logs I previously attached.
>

I didn't even notice it earlier because it was bzip2'd :)

-- Pasi
>>> On 07.06.12 at 11:29, Alex Moskalenko <mav@elserv.ru> wrote:
> I ran into a trouble when trying to run Xen 4.1.x with pv_ops kernel on
> IBM eServer x3400. Without noacpi command line option dom0 kernel
> crashes on ACPI initialization. Kernels 2.6.32 (with konrad xen
> patches), 3.1, 3.3, Xen 4.1.2 behave the same way. Without hypervisor
> all kernels run without any problems.

This is an issue that was reported and discussed previously. Fundamentally, it is a firmware problem from my pov: ACPI has _nothing_ to do with the MMIO space used for the IO-APICs of the system once they are under control of the OS. It shouldn't even be reading from them (which iirc was the case in earlier reports), but in your case it looks like it's even writing to them. I had been considering allowing Dom0 read access to those pages, but obviously this wouldn't help in your case.

Could you extract and supply the ACPI tables of that system, so we can make an attempt at checking whether there is some reason for the firmware writing to the IO-APIC that we didn't think of so far?

> There are several patches (available on
> http://git.altlinux.org/people/silicium/packages/?p=kernel-image.git;a=shortlog;h=refs/heads/kernel-image-xen-dom0,
> commits f1f91babc9cc9402ccee8938b888240f5bae1574,
> 72db7b5e069002765ced64f030c1117d72352331,
> adc4c567a08d4d2377060179639cbfdc3d9d8e0e and
> 9beeac5589ce5c415e4513016c78e96374b8a895) that allow kernel 2.6.32 to
> boot on this hardware. This patches written by kernel package maintainer

While they could make an attempt at upstreaming them, I don't think the way this is being dealt with (altering the set_pte() and set_pte_at() return types) would be well received. (The Xen-specific adjustments look bogus altogether, btw.)

> of ALT Linux distribution. Discussion of this issue is available on
> Sysadmins ALTLinux mailing list
> (http://lists.altlinux.org/pipermail/sysadmins/2011-April/034471.html
> and follows) in Russian.

But there's no real analysis of the underlying problem there. Perhaps the author of the patches should have turned here?

(As a side note - non-pvops Xen wouldn't have this problem, as the hypercalls underlying ioremap() get properly error-checked there, and result in the ioremap() failing rather than a #PF getting raised.)

Jan
>>> On 08.06.12 at 12:19, Alex Moskalenko <mav@elserv.ru> wrote:
> 08.06.2012 11:43, Jan Beulich wrote:
>>>>> On 07.06.12 at 11:29, Alex Moskalenko<mav@elserv.ru> wrote:
>>> I ran into a trouble when trying to run Xen 4.1.x with pv_ops kernel on
>>> IBM eServer x3400. Without noacpi command line option dom0 kernel
>>> crashes on ACPI initialization. Kernels 2.6.32 (with konrad xen
>>> patches), 3.1, 3.3, Xen 4.1.2 behave the same way. Without hypervisor
>>> all kernels run without any problems.
>> This is an issue that was reported and discussed previously.
>> Fundamentally, it is a firmware problem from my pov: ACPI has
>> _nothing_ to do with the MMIO space used for the IO-APICs of
>> the system once they are under control of the OS. It shouldn't
>> even be reading from them (which iirc was the case in earlier
>> reports), but in your case it looks like it's even writing them. I
>> had been considering to allow Dom0 read access to those pages,
>> but obviously this wouldn't help in your case.
>>
>> Could you extract and supply the ACPI tables of that system, so
>> we can make an attempt at checking whether there is some reason
>> for the firmware writing to the IO-APIC that we didn't think of so
>> far?
> Please see attached archive. Tables are grabbed with acpidump -b under
> Xen 4.1.2 and patched kernel 2.6.32.

_SB.PCI0._CRS has

    Store (0x2E, IDX)
    And (0xFFFEFFFF, WND, WND)

with

    OperationRegion (Z00D, SystemMemory, 0xFEC80000, 0x0100)
    Field (Z00D, DWordAcc, Lock, Preserve)
    {
        IDX, 32,
        Offset (0x10),
        WND, 32
    }

so what the BIOS tries to do is unmask pin 15 of the second IO-APIC. To me this makes no sense at all (and it is definitely impossible to sync this properly with any OS's accesses to the IO-APIC registers), but could you nevertheless boot a native kernel with "apic=debug" and post the full set of boot messages (the dumps of the IO-APICs being what I'm really after)? Alternatively, "apic_verbosity=debug" passed to Xen or sending the 'z' debug key would produce similar information.

Jan
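For readers who want to check the arithmetic behind "unmask pin 15", here is a minimal sketch. It assumes the standard IO-APIC register layout (index register at offset 0x00 and data window at offset 0x10, which matches the IDX and WND fields above; redirection table entry n at register indices 0x10 + 2n and 0x11 + 2n; mask bit in bit 16 of the low dword). The helper names are made up for illustration and do not come from the thread.

```python
# Illustrative decoding of the two ASL statements; helper names are hypothetical.
# Assumption: standard IO-APIC layout, where redirection table entry (RTE) n
# occupies register indices 0x10 + 2*n (low dword) and 0x11 + 2*n (high dword),
# and bit 16 of the low dword is the interrupt mask bit.

RTE_BASE_INDEX = 0x10
MASK_BIT = 16


def decode_index(index):
    """Return (pin, half) for a redirection-table register index."""
    if index < RTE_BASE_INDEX:
        raise ValueError("index 0x%02x is not a redirection table register" % index)
    offset = index - RTE_BASE_INDEX
    return offset // 2, "high" if offset % 2 else "low"


def cleared_bits(and_mask, width=32):
    """List the bit positions that an AND with and_mask forces to zero."""
    return [bit for bit in range(width) if not (and_mask >> bit) & 1]


pin, half = decode_index(0x2E)      # Store (0x2E, IDX)
bits = cleared_bits(0xFFFEFFFF)     # And (0xFFFEFFFF, WND, WND)

print("pin %d, %s dword, cleared bits %s" % (pin, half, bits))
print("unmasks the pin:", half == "low" and MASK_BIT in bits)
```

Under those assumptions the index 0x2E selects the low dword of entry 15, and the AND clears only bit 16, which is consistent with the reading given in the message.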
>>> On 08.06.12 at 23:11, Alex Moskalenko <mav@elserv.ru> wrote:
> I attached full dmesg output of kernel 3.3.8.

So this

    [ 1.752498] IO APIC #9......
    ...
    [ 1.752558] 0f 00 0 0 0 0 0 0 0 00

clearly is a firmware bug - it's plain unacceptable for the firmware to unmask a completely uninitialized RTE. I'm afraid you'll need to get IBM to address this (assuming you're running the latest firmware, and hence they haven't already).

> Also I removed the 'Store' and 'And' statements from DSDT and compiled
> it. When altered DSDT is loaded with grub, kernel 3.3.8 boots
> successfully (good!). But with hypervisor boot process hangs at loading
> aacraid module (last message is about setting up IRQ). Aacraid module
> loads without any problem on bare hardware. Kernel 2.6.32 boots
> successfully and loads all modules. I think this issue is not related to
> patched DSDT, but I'm not sure.

Given the above plus the fact that the driver uses PCI-MSI, it is indeed unlikely to be related. But you ought to post kernel _and_ hypervisor logs for us to be able to at least try to help.

Jan
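As a rough aid for reading such a dump line, the sketch below splits one entry into named fields. It assumes the column order of the kernel's "apic=debug" dump header is NR, Dst, Mask, Trig, IRR, Pol, Stat, Dmod, Deli, Vect; that header is not part of the excerpt above, so treat the mapping as an assumption. The function names are hypothetical, and the "uninitialized" check is only a heuristic (mask clear while the vector is still zero).

```python
# Hypothetical helper for reading one RTE line from an "apic=debug" dump.
# Assumption: columns follow the header
# "NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect" (not shown in the excerpt).

COLUMNS = ("nr", "dst", "mask", "trig", "irr", "pol", "stat", "dmod", "deli", "vect")
HEX_COLUMNS = {"nr", "dst", "vect"}  # these fields are printed in hexadecimal


def parse_rte_line(line):
    """Parse one dump line into a dict of integer fields."""
    # Drop the "[ timestamp]" prefix if present.
    if "]" in line:
        line = line.split("]", 1)[1]
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(fields)))
    return {
        name: int(value, 16 if name in HEX_COLUMNS else 10)
        for name, value in zip(COLUMNS, fields)
    }


def is_unmasked_but_uninitialized(rte):
    """Heuristic: the mask bit is clear although no vector was ever programmed."""
    return rte["mask"] == 0 and rte["vect"] == 0


rte = parse_rte_line("[ 1.752558] 0f 00 0 0 0 0 0 0 0 00")
print("pin %d: unmasked but uninitialized = %s"
      % (rte["nr"], is_unmasked_but_uninitialized(rte)))
```

With the quoted line as input, the entry number parses as pin 15 with the mask clear and the vector zero, which is the combination the message calls an unmasked, uninitialized RTE.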