Imre Szőllősi
2021-May-13 19:13 UTC
[Pkg-xen-devel] Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
Package: src:xen
Version: 4.14.1+11-gb0b734a8b3-1
Severity: critical
Justification: causes serious data loss
X-Debbugs-Cc: debianbts at virtualzone.hu

Dear Maintainer,

after a clean install of bullseye/testing, the Xen dmesg shows the following message:

(XEN) AMD-Vi: IO_PAGE_FAULT: 0000:01:00.1 d0 addr fffffffdf8000000 flags 0x8

This is the SATA device:

01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)

or, on another motherboard:

01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43eb

During write operations (e.g. dbench or a Windows guest) there are a lot of these messages. Sometimes the filesystem goes read-only and the Windows guest crashes with a BSOD.

Tested on 3 hardware setups:

1. asus prime b450m-a, ryzen 5 2600x, md raid1, 2x samsung 1TB 860evo, lvm: problem does appear
2. asus prime b550m-k, ryzen 5 5600x, md raid1, 2x samsung 1TB 870evo, lvm: problem does appear
3. asus prime b550m-k, ryzen 5 5600x, 1x samsung 1TB 850evo, lvm: problem does not appear
3. asus prime b550m-k, ryzen 5 5600x, 1x samsung 128GB 840pro, lvm: problem does not appear
3. asus prime b550m-k, ryzen 5 5600x, samsung 1TB 850evo + samsung 128GB 840pro, lvm, dbench on 2 ssds in parallel: problem does appear

As far as I can see, the problem appears when data is written to 2 SSDs in parallel.

Thanks!

-- System Information:
Debian Release: bullseye/sid
APT prefers testing-security
APT policy: (500, 'testing-security'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-6-amd64 (SMP w/12 CPU threads)
Locale: LANG=hu_HU.UTF-8, LC_CTYPE=hu_HU.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

xen-hypervisor-4.14-amd64 depends on no packages.

Versions of packages xen-hypervisor-4.14-amd64 recommends:
ii xen-hypervisor-common 4.14.1+11-gb0b734a8b3-1
ii xen-utils-4.14 4.14.1+11-gb0b734a8b3-1

xen-hypervisor-4.14-amd64 suggests no packages.
-- no debconf information
Imre Szőllősi
2021-Jun-13 13:58 UTC
[Pkg-xen-devel] Bug#988477: Acknowledgement (xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device)
I tested on a 4th hardware setup:

4. asus m4n78 pro, phenom ii x4 905e, md raid1, 2x samsung 1TB 860evo, lvm: problem does not appear

As far as I can see, not every motherboard/chipset/SATA PCIe device is affected.

Thanks!
Hans van Kranenburg
2021-Aug-05 20:46 UTC
[Pkg-xen-devel] Bug#988477: Bug#988477: Acknowledgement (xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device)
severity 988477 normal
tags 988477 + moreinfo + upstream - bullseye-ignore
thanks

Hi!

On 6/13/21 3:58 PM, Imre Szőllősi wrote:
> i tested on 4th hw
>
> 4. asus m4n78 pro, phenom ii x4 905e, md raid1, 2x samsung 1TB 860evo,
> lvm: problem does not appear
>
> as i see, not all mb/chipset/sata pcie device affected

Thanks for your report, and for trying out different combinations of hardware.

After a short internet search about the problems you're seeing with AMD Ryzen, SATA, NVMe and IOMMU, I suspect this problem does not have a lot to do with Xen specifically, but more with the hardware and its firmware. This also means that it's not a Debian packaging problem, and it cannot be fixed by me (or the Debian Xen team).

If you want to research this problem more, I can maybe be of some help by providing suggestions. Still, you will have to do all of the actual work, since I do not have your hardware here.

The first thing I would suggest is to try to reproduce the problem when booting just Linux without Xen, and then running the dbench test.

If you don't actually need to pass hardware through directly to a Xen guest, you can also try disabling the IOMMU, or researching other IOMMU options that can serve as a workaround.

In any case, further reports will need to have more detailed information. For example, instead of "there are a lot of messages", provide a text attachment with a piece of logging that shows these messages.

I'm tagging this bug 'moreinfo' now, since its progress will depend on your availability and ability to work on it.

Have fun,
Hans van Kranenburg
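[Editorial sketch] The kind of log summary requested above could be produced with a few lines of Python. This is only an illustration: the regular expression is modelled on the single IO_PAGE_FAULT line quoted in the original report, and other Xen versions may format the message differently. The names `FAULT_RE` and `summarize_faults` are made up for this example.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in the original report.
# Assumption: other fault lines follow the same layout.
FAULT_RE = re.compile(
    r"\(XEN\) AMD-Vi: IO_PAGE_FAULT: (?P<device>\S+) d(?P<domain>\d+) "
    r"addr (?P<addr>[0-9a-fA-F]+) flags (?P<flags>0x[0-9a-fA-F]+)"
)


def summarize_faults(log_text):
    """Count IO_PAGE_FAULT messages per PCI device in captured log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[match.group("device")] += 1
    return counts
```

One way to use it is to save the hypervisor log to a file (for example with `xl dmesg > xen.log`), pass the file contents to `summarize_faults()`, and attach both the raw log and the per-device counts to the report.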
Imre Szőllősi
2021-Aug-08 13:34 UTC
[Pkg-xen-devel] Bug#988477: Bug#988477: Acknowledgement (xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device)
An HTML attachment was scrubbed... URL: <http://alioth-lists.debian.net/pipermail/pkg-xen-devel/attachments/20210808/f65cb55f/attachment.htm>
tags 988477 - moreinfo
found 988477 4.17.2+76-ge1f9cb16e2-1~deb12u1
affects 988477 src:linux
severity 988477 critical
quit

I am also observing #988477 occur. This machine has an AMD Zen 4 processor. The first observation was when the motherboard/processor was swapped out; the older motherboard/processor was several generations old.

The pattern which is emerging is Linux MD RAID1 plus a recent AMD processor which has full IOMMU functionality. The older machine was believed to have an IOMMU, but the BIOS wasn't creating appropriate ACPI tables (IVRS) and thus Xen was unable to utilize it.

This seems to be occurring with a small percentage of write operations. Subsequent read operations appear to be fine.

I am not convinced this is a Xen bug. I suspect this is instead a bug in the Linux MD subsystem. In particular, the DMA interface may have been designed assuming only a single device would ever access any page, while the MD RAID1 driver reuses the same page for both devices.

IOMMU page release could be handled by marking the page unused in a device data structure and later removing it by sweeping a table. In that case, if the MD RAID1 driver were to redirect the page to another device between these two steps, the entry for a subsequent device could be wiped out when trying to invalidate an entry for a prior device.

Anyway, I'm also observing bug #988477. This could also be a kernel bug. So far no crashes/confirmed data loss have occurred, but sweeping the mirror does turn up small numbers of inconsistencies.

-- 
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | ehem+sigmsg at m5p.com PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
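[Editorial sketch] The message does not say how the mirror was swept. Assuming it was a Linux MD `check` pass (started by writing `check` to `/sys/block/mdX/md/sync_action`), the kernel records the result in the array's `mismatch_cnt` file. The helper below is hypothetical and only reads that counter for every array; `sysfs_root` is a parameter so the function can be tried against a scratch directory.

```python
from pathlib import Path


def read_mismatch_counts(sysfs_root="/sys/block"):
    """Return {array_name: mismatch_cnt} for every md array under sysfs_root."""
    counts = {}
    for counter in sorted(Path(sysfs_root).glob("md*/md/mismatch_cnt")):
        # .../md0/md/mismatch_cnt -> "md0"
        array_name = counter.parent.parent.name
        counts[array_name] = int(counter.read_text().strip())
    return counts
```

A non-zero value after a completed check means the two halves of a mirror disagreed somewhere. On its own it does not say which copy is correct, so it is evidence of inconsistency rather than proof of where the corruption happened.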
Elliott Mitchell
2024-Jul-10 19:25 UTC
[Pkg-xen-devel] Bug#988477: Potential Mitigation for #988477
It was suggested as a debugging step, but adding the option "iommu=no-intremap" to Xen's command-line may work as a short-term mitigation for #988477.

-- 
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | ehem+sigmsg at m5p.com PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
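[Editorial sketch] After changing the hypervisor command line and rebooting, it can be worth confirming that the option really reached Xen. `xl info` reports the command line in its `xen_commandline` field; the two helper names below are invented for this example, and the match is deliberately simple.

```python
def xen_commandline(xl_info_text):
    """Extract the hypervisor command line from `xl info` output, or None."""
    for line in xl_info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "xen_commandline":
            return value.strip()
    return None


def has_option(xl_info_text, option="iommu=no-intremap"):
    """True if `option` appears as its own token on the Xen command line.

    Limitation: a combined form such as iommu=verbose,no-intremap is
    not recognised by this exact-token comparison.
    """
    cmdline = xen_commandline(xl_info_text)
    return cmdline is not None and option in cmdline.split()
```

Feeding it the text captured from `xl info` gives a quick yes/no before starting a long write test.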
Maximilian Engelhardt
2024-Aug-25 21:41 UTC
[Pkg-xen-devel] Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
Control: severity -1 normal

Hi Elliott,

I am changing the severity back to normal as the xen package works fine for many people without any serious issues. From your last message it also seems you found a workaround for your problem. Please don't change the bug severity without at least giving an explanation why you think the new severity is justified.

From the few log lines in this bug report this seems to be an upstream issue with xen or the linux kernel. Please report your observations upstream. The Debian xen team does not have the resources and knowledge to debug or fix such problems. Once the issue has been identified and fixed upstream we can see if we can backport a fix to our Debian packages, but this is only possible once an upstream fix has landed.

Thanks,
Maxi
Debian Bug Tracking System
2024-Aug-25 21:54 UTC
[Pkg-xen-devel] Processed: Re: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
Processing control commands:

> severity -1 normal
Bug #988477 [src:xen] xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
Severity set to 'normal' from 'critical'

-- 
988477: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988477
Debian Bug Tracking System
Contact owner at bugs.debian.org with problems
Elliott Mitchell
2024-Aug-25 22:58 UTC
[Pkg-xen-devel] Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
On Sun, Aug 25, 2024 at 11:41:44PM +0200, Maximilian Engelhardt wrote:
> I am changing the severity back to normal as the xen package works fine for
> many people without any serious issues. From your last message it also seems

Yet for some lucky people data is corrupted/lost. There could be other people who reproduce this, but don't send e-mail saying "me too" to this bug report.

Presently the main reason there aren't very many reproductions is that few people bother to use RAID with flash. The initial reports are that SSDs have a lower failure rate than disks, but the failure rate isn't even close to zero. Whereas the data loss/corruption easily reproduces.

While both cases in #988477 were on systems with AMD hardware, I am presently doubtful that is a requirement. The most similar known bug was found to be more severe on AMD hardware, but also occurs on Intel hardware. I suspect this issue may be similar, simply no one has noticed the problem yet...

> you found a workaround for your problem. Please don't change the bug severity

Something was found which seems to have made another issue more prominent. It may reduce the rate at which data corruption occurs, but I've since confirmed data loss/corruption continues to occur.

> without at least giving an explanation why you think the new severity is
> justified.

I had thought the original reporter's justification was sufficient. This appears to have some specific requirements to meet, but if you meet them you may be in trouble before alerts trigger.

So far both reports are with AMD machines with IOMMUv2 functionality (I tried on a machine with IOMMUv1/GART and it didn't reproduce). Both reports feature Samsung SATA devices. An NVMe device from another manufacturer also showed the issue (I'm almost certain Samsung NVMe devices will also show the issue).

I suspect Intel machines may also be affected by this issue, but it may not manifest as severely. I suspect this is a case of people with AMD machines being a bit more wary of hardware failure (thus actually bothering to use RAID1 even with flash devices).

> From the few log lines in this bug report this seems to be an upstream issue
> with xen or the linux kernel. Please report your observations upstream. The
> Debian xen team does not have the resources and knowledge to debug or fix such
> problems. Once the issue has been identified and fixed upstream we can see if
> we can backport a fix to our Debian packages, but this is only possible once
> an upstream fix has landed.

Perhaps it has become easier to report things upstream, but the original procedure was that reporters were supposed to report to bugs.debian.org and NOT forward upstream.

The other problem is I've run into a chasm with upstream and have no way to build a bridge across. I do have one more thing to try, but don't yet have a time-frame for when I'll check that.

-- 
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | ehem+sigmsg at m5p.com PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Elliott Mitchell
2024-Sep-03 21:58 UTC
[Pkg-xen-devel] Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
found 988477 4.17.3+10-g091466ba55-1~deb12u1
severity 988477 critical
quit

Justification is the same as the original: data loss. I'm unsure where the border between "data loss" and "serious data loss" is, but the original reporter declared it so and I don't disagree.

On Sun, Aug 25, 2024 at 11:41:44PM +0200, Maximilian Engelhardt wrote:
> I am changing the severity back to normal as the xen package works fine for
> many people without any serious issues. From your last message it also seems

critical
    makes unrelated software on the system (or the whole system) break, or causes serious data loss, or introduces a security hole on systems where you install the package.

grave
    makes the package in question unusable or mostly so, or causes data loss, or introduces a security hole allowing access to the accounts of users who use the package.

Both of those are lists of conditions. Since the conditions are "causes serious data loss" and "causes data loss", those have been met, as there is no mention of "and cannot work acceptably for anyone".

> you found a workaround for your problem. Please don't change the bug severity
> without at least giving an explanation why you think the new severity is
> justified.

The key word was "may". I was being cautious when testing due to the severity of the issue. As stated in the previous message, it was found to merely mildly change the messages and not fix the issue.

> From the few log lines in this bug report this seems to be an upstream issue
> with xen or the linux kernel. Please report your observations upstream. The
> Debian xen team does not have the resources and knowledge to debug or fix such
> problems. Once the issue has been identified and fixed upstream we can see if
> we can backport a fix to our Debian packages, but this is only possible once
> an upstream fix has landed.

My understanding is that being an upstream issue has no effect on severity. It allows tagging as "upstream", but does not allow reducing severity. The severity is meant as an alert to others that there is a *severe* problem lurking.

I've tried interacting with upstream, yet there has been a demand to release `xl dmesg` to a public area. While I cannot state that any information in `xl dmesg` can be used to compromise systems, nor point to hardware serial numbers or other private data which leak in, it still triggers the TMI detector. As such I'm uncomfortable with that being public and I don't know any way to bridge that chasm.

If I was an installation of 10K nodes I wouldn't be too bothered with details of a single test machine leaking, alas I'm not in that category.

I could also send someone a pair of SATA devices known to manifest the issue, but that has failed to generate interest. As such I'm stuck.

Question for the original submitter, Imre Szőllősi: what was your situation prior to seeing #988477 manifest? Were you installing Xen 4.14 for the first time on Debian 11/bullseye? Had you previously used Xen 4.11 with Debian 10/buster or earlier?

Knowing whether the bug was introduced between Xen 4.11 and Xen 4.14 would be valuable knowledge if you have it. I had been using an older processor with 4.14, so I hadn't observed it until 4.17.

-- 
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | ehem+sigmsg at m5p.com PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Maximilian Engelhardt
2025-Mar-14 21:42 UTC
[Pkg-xen-devel] Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
A fix [1] for the IO_PAGE_FAULT went into xen 4.20, which is now available in testing and unstable.

The 4.20.0-1 Debian source package can also be compiled for bookworm if you have a bookworm system running and want to test there. Please note that qemu also needs to be recompiled for this xen version if you are using qemu.

Can anyone affected by this bug confirm whether their issue is fixed in xen 4.20 or is still there?

[1] https://salsa.debian.org/xen-team/debian-xen/-/commit/b953a99da98d63a7c827248abc450d4e8e015ab6
Philipp Kern
2025-Apr-13 11:22 UTC
[Pkg-xen-devel] Bug#988477: syslinux NMU 3:6.04~git20190206.bf6db5b4+dfsg1-3.1
user debian-release at lists.debian.org
usertag 1091027 + bsp-2025-04-at-vienna
usertag 1057462 + bsp-2025-04-at-vienna
usertag 994274 + bsp-2025-04-at-vienna
tag 1091027 + pending
tag 1057462 + pending
tag 994274 + pending
thanks

Uploaded an NMU to DELAYED/0-day:

> syslinux (3:6.04~git20190206.bf6db5b4+dfsg1-3.1) unstable; urgency=medium
>
>   * Non-maintainer upload.
>   * Add GCC 14 compatibility patch. Thanks to Marek Benc.
>     (Closes: #1091027, #1057462)
>   * Add wchar_t definition for gnu-efi >= 3.0.16 compatibility.
>     (Closes: #994274)
>   * Update build dependency on e2fslibs-dev => libext2fs-dev.
>   * Update Lintian overrides to match again.
>
>  -- Philipp Kern <pkern at debian.org>  Sun, 13 Apr 2025 11:31:54 +0200

Kind regards
Philipp Kern

-------------- next part --------------
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/changelog syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/changelog
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/changelog 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/changelog 2025-04-13 11:31:54.000000000 +0200
@@ -1,3 +1,15 @@
+syslinux (3:6.04~git20190206.bf6db5b4+dfsg1-3.1) unstable; urgency=medium
+
+  * Non-maintainer upload.
+  * Add GCC 14 compatibility patch. Thanks to Marek Benc.
+    (Closes: #1091027, #1057462)
+  * Add wchar_t definition for gnu-efi >= 3.0.16 compatibility.
+    (Closes: #994274)
+  * Update build dependency on e2fslibs-dev => libext2fs-dev.
+  * Update Lintian overrides to match again.
+
+ -- Philipp Kern <pkern at debian.org>  Sun, 13 Apr 2025 11:31:54 +0200
+
 syslinux (3:6.04~git20190206.bf6db5b4+dfsg1-3) unstable; urgency=medium
 
   * Add patch for compatibility with gcc-10 (Closes: 957858)
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/control syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/control
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/control 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/control 2025-04-13 11:31:54.000000000 +0200
@@ -5,7 +5,7 @@
 Uploaders: Lukas Schwaighofer <lukas at schwaighofer.name>
 Build-Depends:
  debhelper-compat (= 12),
- e2fslibs-dev,
+ libext2fs-dev,
  gcc-multilib [amd64 x32],
  libc6-dev-i386 [amd64 x32],
  nasm,
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/extlinux.lintian-overrides syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/extlinux.lintian-overrides
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/extlinux.lintian-overrides 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/extlinux.lintian-overrides 2025-04-13 11:31:54.000000000 +0200
@@ -1,2 +1,2 @@
 # extlinux contains the VBR (volume boot record) which needs zlib embedded
-extlinux: embedded-library usr/bin/extlinux: zlib
+extlinux: embedded-library zlib [usr/bin/extlinux]
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/isolinux.lintian-overrides syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/isolinux.lintian-overrides
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/isolinux.lintian-overrides 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/isolinux.lintian-overrides 2025-04-13 11:31:54.000000000 +0200
@@ -1,2 +1,2 @@
 # specific documentation for the contents of this directory
-isolinux: package-contains-documentation-outside-usr-share-doc usr/lib/ISOLINUX/extra/README
+isolinux: package-contains-documentation-outside-usr-share-doc [usr/lib/ISOLINUX/extra/README]
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0020-gcc-14-compatibility.patch syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0020-gcc-14-compatibility.patch
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0020-gcc-14-compatibility.patch 1970-01-01 01:00:00.000000000 +0100
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0020-gcc-14-compatibility.patch 2025-04-13 11:31:49.000000000 +0200
@@ -0,0 +1,71 @@
+Description: GCC-14 compatibility patch
+
+* Disable FCF protection on i386, where it's not supported.
+* Add missing include to resolve implicit printf() function declaration.
+* Add missing header file for long jumps in efi/main.c, fix invocations.
+* Type-cast addr_t pointer to size_t, assumes that their size is the same.
+
+Author: Marek Benc <benc.marek.elektro98 at proton.me>
+Bug-Debian: https://bugs.debian.org/1091027
+Last-Update: 2025-04-11
+
+--- syslinux-6.04~git20190206.bf6db5b4+dfsg1.orig/mk/embedded.mk
++++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/mk/embedded.mk
+@@ -24,6 +24,7 @@ GCCOPT :=
+ ifeq ($(ARCH),i386)
+   GCCOPT := $(call gcc_ok,-m32)
+   GCCOPT += $(call gcc_ok,-march=i386)
++  GCCOPT += $(call gcc_ok,-fcf-protection=none,)
+   GCCOPT += $(call gcc_ok,-mpreferred-stack-boundary=2,)
+   GCCOPT += $(call gcc_ok,-mincoming-stack-boundary=2,)
+ endif
+
+--- syslinux-6.04~git20190206.bf6db5b4+dfsg1.orig/com32/lib/syslinux/debug.c
++++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/com32/lib/syslinux/debug.c
+@@ -1,4 +1,5 @@
+ #include <linux/list.h>
++#include <stdio.h>
+ #include <string.h>
+ #include <stdbool.h>
+
+--- syslinux-6.04~git20190206.bf6db5b4+dfsg1.orig/efi/main.c
++++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/efi/main.c
+@@ -6,6 +6,7 @@
+ #include <core.h>
+ #include <fs.h>
+ #include <com32.h>
++#include <setjmp.h>
+ #include <syslinux/memscan.h>
+ #include <syslinux/firmware.h>
+ #include <syslinux/linux.h>
+@@ -184,7 +185,7 @@
+   * Inform the firmware that we failed to execute correctly, which
+   * will trigger the next entry in the EFI Boot Manager list.
+   */
+-    longjmp(&load_error_buf, 1);
++    longjmp(load_error_buf, 1);
+ }
+ 
+ void bios_timer_cleanup(void)
+@@ -1382,7 +1383,7 @@
+         status = uefi_call_wrapper(in->ReadKeyStroke, 2, in, &key);
+     } while (status == EFI_SUCCESS);
+ 
+-    if (!setjmp(&load_error_buf))
++    if (!setjmp(load_error_buf))
+         load_env32(NULL);
+ 
+     /* load_env32() failed.. cancel timer and bailout */
+
+--- syslinux-6.04~git20190206.bf6db5b4+dfsg1.orig/com32/chain/chain.c
++++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/com32/chain/chain.c
+@@ -514,7 +514,7 @@
+     if (opt.file) {
+         fdat.base = (opt.fseg << 4) + opt.foff;
+ 
+-        if (loadfile(opt.file, &fdat.data, &fdat.size)) {
++        if (loadfile(opt.file, &fdat.data, (size_t *)&fdat.size)) {
+             error("Couldn't read the boot file.");
+             goto bail;
+         }
+
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0021-add_wchar_t-type-definition.patch syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0021-add_wchar_t-type-definition.patch
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0021-add_wchar_t-type-definition.patch 1970-01-01 01:00:00.000000000 +0100
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/0021-add_wchar_t-type-definition.patch 2025-04-13 11:31:54.000000000 +0200
@@ -0,0 +1,58 @@
+From 063dac55c45d0264671c3463e824ab659e5cbb87 Mon Sep 17 00:00:00 2001
+From: Julien Olivain <ju.o at free.fr>
+Date: Tue, 27 Feb 2024 21:09:15 +0100
+Subject: [PATCH] stddef.h: add wchar_t type definition
+
+Syslinux fail to build with gnu-efi >= 3.0.16 with error:
+
+    In file included from /host/i686-buildroot-linux-gnu/sysroot/usr/include/efi/efi.h:44,
+                     from /build/syslinux-6.03/efi/efi.h:23,
+                     from /build/syslinux-6.03/efi/adv.h:4,
+                     from /build/syslinux-6.03/efi/adv.c:29:
+    /host/i686-buildroot-linux-gnu/sysroot/usr/include/efi/ia32/efibind.h:90:9: error: unknown type name 'wchar_t'
+     typedef wchar_t CHAR16;
+             ^~~~~~~
+
+This is because gnu-efi started to use the "wchar_t" type from the
+toolchain's <stddef.h> header, in commit [1]. Before this commit,
+gnu-efi was defining the type as "short".
+
+Syslinux is including its own minimal stddef.h file, which masks the
+one provided by the toolchain. See [2]. This file does not have a type
+definition for "wchar_t".
+
+Finally, the POSIX <stddef.h> header is supposed to provide this
+"wchar_t" type definition. See [3].
+
+This commit fixes the issue by adding the "wchar_t" type definition in
+the com32/include/stddef.h header. Since Syslinux has "-fshort-wchar"
+in its CFLAGS (see [4]), "wchar_t" is simply defined as "short". This
+also follow the previous gnu-efi < 3.0.16 behavior.
+
+This issue was seen in Buildroot Linux, in [5].
+
+[1] https://sourceforge.net/p/gnu-efi/code/ci/189200d0b0f6fff473d302880d9569f45d4d8c4d
+[2] https://repo.or.cz/syslinux.git/blob/refs/tags/syslinux-6.03:/com32/include/stddef.h
+[3] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/stddef.h.html
+[4] https://repo.or.cz/syslinux.git/blob/refs/tags/syslinux-6.03:/mk/efi.mk#l27
+[5] https://lists.buildroot.org/pipermail/buildroot/2024-February/685971.html
+
+Upstream: Proposed: https://www.syslinux.org/archives/2024-February/026903.html
+Signed-off-by: Julien Olivain <ju.o at free.fr>
+---
+ com32/include/stddef.h | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/com32/include/stddef.h b/com32/include/stddef.h
+index f52d62f3..437b11f2 100644
+--- a/com32/include/stddef.h
++++ b/com32/include/stddef.h
+@@ -29,4 +29,6 @@
+  */
+ #define container_of(p, c, m) ((c *)((char *)(p) - offsetof(c,m)))
+ 
++typedef short wchar_t;
++
+ #endif /* _STDDEF_H */
+-- 
+2.44.0
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/series syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/series
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/series 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/patches/series 2025-04-13 11:31:54.000000000 +0200
@@ -6,3 +6,5 @@
 0017-single-load-segment.patch
 0018-prevent-pow-optimization.patch
 0019-gcc-10-compatibility.patch
+0020-gcc-14-compatibility.patch
+0021-add_wchar_t-type-definition.patch
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-common.lintian-overrides syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-common.lintian-overrides
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-common.lintian-overrides 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-common.lintian-overrides 2025-04-13 11:31:54.000000000 +0200
@@ -1,12 +1,13 @@
 # bootloader modules are useful on all architectures
-syslinux-common: arch-independent-package-contains-binary-or-object usr/lib/syslinux/modules/*
+syslinux-common: arch-independent-package-contains-binary-or-object [usr/lib/syslinux/modules/*]
 # bootloader modules are intentionally not linked against libc
-syslinux-common: library-not-linked-against-libc usr/lib/syslinux/modules/*
+syslinux-common: library-not-linked-against-libc [usr/lib/syslinux/modules/*]
 # bootloder modules need zlib and libpng
-syslinux-common: embedded-library usr/lib/syslinux/modules/*
+syslinux-common: embedded-library libpng [usr/lib/syslinux/modules/*]
+syslinux-common: embedded-library zlib [usr/lib/syslinux/modules/*]
 # bootloader modules are not loaded by the runtime linker; missing depends is a
 # false positive and the executable stack is intentional
 syslinux-common: missing-depends-line
-syslinux-common: shlib-with-executable-stack usr/lib/syslinux/modules/*
+syslinux-common: shlib-with-executable-stack [usr/lib/syslinux/modules/*]
 # specific documentation for the contents of these directories
-syslinux-common: package-contains-documentation-outside-usr-share-doc usr/lib/syslinux/mbr/*/README
+syslinux-common: package-contains-documentation-outside-usr-share-doc [usr/lib/syslinux/mbr/*/README]
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-efi.lintian-overrides syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-efi.lintian-overrides
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-efi.lintian-overrides 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-efi.lintian-overrides 1970-01-01 01:00:00.000000000 +0100
@@ -1,5 +0,0 @@
-# The syslinux efi binaries are not Microsoft Windows Portable Executable (PE)
-# files.
-syslinux-efi: portable-executable-missing-security-features usr/lib/SYSLINUX.EFI/efi32/syslinux.efi ASLR DEP/NX
-syslinux-efi: portable-executable-missing-security-features usr/lib/SYSLINUX.EFI/efi64/syslinux.efi ASLR DEP/NX
-
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux.lintian-overrides syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux.lintian-overrides
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux.lintian-overrides 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux.lintian-overrides 2025-04-13 11:31:54.000000000 +0200
@@ -1,2 +1,2 @@
 # syslinux contains the VBR (volume boot record) which needs zlib embedded
-syslinux: embedded-library usr/bin/syslinux: zlib
+syslinux: embedded-library zlib [usr/bin/syslinux]
diff -Nru syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-utils.lintian-overrides syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-utils.lintian-overrides
--- syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-utils.lintian-overrides 2020-08-16 15:46:13.000000000 +0200
+++ syslinux-6.04~git20190206.bf6db5b4+dfsg1/debian/syslinux-utils.lintian-overrides 2025-04-13 11:31:54.000000000 +0200
@@ -1,2 +1,2 @@
 # retained for compatibility with existing scripts
-syslinux-utils: script-with-language-extension usr/bin/isohybrid.pl
+syslinux-utils: script-with-language-extension [usr/bin/isohybrid.pl]
Elliott Mitchell
2025-Apr-13 22:22 UTC
[Pkg-xen-devel] Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
On Fri, Mar 14, 2025 at 10:42:24PM +0100, Maximilian Engelhardt wrote:
> A fix [1] for the IO_PAGE_FAULT went into xen 4.20 which is now available in
> testing and unstable.
> The 4.20.0-1 Debian source package can also be compiled for bookworm if you
> have a bookworm system running and want to test there. Please note that qemu
> also needs to be recompiled for this xen version if you are using qemu.
>
> Can anyone affected by this bug confirm if their issue is fixed in xen 4.20 or
> is still there?
>
> [1] https://salsa.debian.org/xen-team/debian-xen/-/commit/b953a99da98d63a7c827248abc450d4e8e015ab6

The analysis is that the "(XEN) AMD-Vi: IO_PAGE_FAULT" message and the software RAID data loss are distinct bugs. That patch/commit likely makes the correlated message disappear, but almost certainly leaves the software RAID data loss behind.

Do any of the Debian maintainers have an AMD machine set up for debugging? I'm not very well set up for debugging this particular issue. If you've got an AMD machine with a pair of available SATA ports (including SATA power!), I could send a pair of SATA devices known to readily reproduce the issue.

-- 
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
\BS ( | ehem+sigmsg at m5p.com PGP 87145445 | ) /
\_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Maximilian Engelhardt
2025-May-18 12:10 UTC
[Pkg-xen-devel] Bug#988477: Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
On Monday, 14 April 2025 00:22:01 CEST Elliott Mitchell wrote:
> The analysis is the "(XEN) AMD-Vi: IO_PAGE_FAULT" message, and the
> software RAID data loss are distinct bugs. That patch/commit likely
> makes the correlated message disappear, but almost certainly leaves the
> software RAID data loss behind.
>
> Do any of the Debian maintainers have an AMD machine setup for debugging?
> I'm not very well setup for debugging this particular issue. If you've
> got an AMD machine with a pair of available SATA ports (including SATA
> power!), I could send a pair of SATA devices known to readily reproduce
> the issue.

I'm not aware of anybody in our team having hardware where they can reproduce this issue, else I'm sure they would have already provided feedback here. There are also not many reports here of people running into this problem. Thus I assume it needs a special (and probably rare) hardware combination to trigger this.

One thing I can add is that I have been running software RAID1 with Xen on two SATA SSDs on an Intel CPU for many years without seeing any data corruption.

As the Debian package versions of xen, linux, etc. have changed a bit since the last time this issue was reported as reproduced in this bug, it would be good to get confirmation that the problem is still there in Debian unstable or testing.
Elliott Mitchell
2025-May-29 00:20 UTC
[Pkg-xen-devel] Bug#988477: Bug#988477: xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device
On Sun, May 18, 2025 at 02:10:25PM +0200, Maximilian Engelhardt wrote:
> On Monday, 14 April 2025 00:22:01 CEST, Elliott Mitchell wrote:
> >
> > Do any of the Debian maintainers have an AMD machine setup for debugging?
> > I'm not very well setup for debugging this particular issue. If you've
> > got an AMD machine with a pair of available SATA ports (including SATA
> > power!), I could send a pair of SATA devices known to readily reproduce
> > the issue.
>
> I'm not aware of anybody in our team having hardware where they can reproduce
> this issue, else I'm sure they would have already provided feedback here.
> There are also not many reports here of people running into this problem. Thus
> I assume it needs a special (and probably rare) hardware combination to
> trigger this.
> One thing I can add is that I have been running software raid1 with Xen on two
> SATA SSDs on an Intel CPU since many years without seeing any data corruption.

I'm skeptical of it being rare, but it is certainly uncommon. You've got some similarity to the reproductions, but there are differences.

First question, what brand/model are the SSDs? Samsung SSDs are known to be affected (severely affected for some models), while Crucial/Micron SSDs are unaffected (some models might be mildly affected).

Second question, where are the SATA ports? Are they on-motherboard? On an add-on card? The reproductions were with on-motherboard ports.

What generation is your processor? Are you sure it has an IOMMU and Xen is driving the IOMMU? I had suspected Intel systems would be affected, but you may have disproven this.

> As Debian packages versions of xen, linux, etc. have changed a bit since the
> last time this issue was reported as reproduced in this bug, it would be good
> to get confirmation the problem is still there in Debian unstable or testing.

This is possible.
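For anyone wanting to check whether the message still shows up on a newer Xen, the following is a minimal sketch (not anything from the Xen or Debian tooling) that counts IO_PAGE_FAULT lines per PCI device in saved hypervisor log text, for example the output of `xl dmesg`. The regular expression is modelled only on the single line quoted in the original report; the function name `count_io_page_faults` is hypothetical, and other Xen versions may print the message in a different format, in which case the pattern would need adjusting.

```python
import re
from collections import Counter

# Pattern modelled on the line quoted in the original bug report:
# (XEN) AMD-Vi: IO_PAGE_FAULT: 0000:01:00.1 d0 addr fffffffdf8000000 flags 0x8
# Assumption: other Xen versions may format this message differently.
FAULT_RE = re.compile(
    r"\(XEN\) AMD-Vi: IO_PAGE_FAULT: "
    r"(?P<device>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"d(?P<domain>\d+) addr (?P<addr>[0-9a-fA-F]+) "
    r"flags (?P<flags>0x[0-9a-fA-F]+)"
)


def count_io_page_faults(log_text):
    """Return a Counter mapping PCI device address to number of fault lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[match.group("device")] += 1
    return counts


# Small demonstration using the line from the original report.
sample = (
    "(XEN) AMD-Vi: IO_PAGE_FAULT: 0000:01:00.1 d0 "
    "addr fffffffdf8000000 flags 0x8\n"
    "(XEN) some unrelated hypervisor message\n"
)
print(count_io_page_faults(sample))
```

An empty result only shows that the message did not appear in the captured log; as noted above, it says nothing about whether the software RAID data loss is gone.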