Philipp Hahn
2011-Nov-29 14:30 UTC
RFH: Corruption with blktap2 on Debian 2.6.32-39 + xen-4.1.2
Hello,

I have observed several strange blktap2(?) corruption problems using Xen-4.1.2 on several 2.6.32-39 based amd64 Linux systems. I run an installation of a domain which uses 3 blktap2 devices: 2× 20 GiB hard disk image files and 1× 1.1 GiB DVD ISO file. During the installation, processes start to SEGV, which aborts the installation.

* Not all processes in the domU segfault in each run: "date", "bash", "grep", "wc", ...
* md5sum finds multiple files where the MD5 sum doesn't match the expected value after installation.
* It doesn't matter whether the domUs are PV or HVM.
* If I only use one disk, I've not observed the problem.
* The more blktap2 processes are running, the more likely the problem is to manifest.
* If I use loopback or blkback to LVM instead of blktap2, I've not observed the corruption.
* In one try I even had SEGVs in dom0.
* In one try the host rebooted.
* In one try the domU switched to read-only after file-system corruption.
* I tested three HW systems: on 2 of them the problem always manifests, on the 3rd the corruption hasn't been observed yet. All three have Intel i7 CPUs and 8 GiB RAM.

This makes me wonder if this is some kind of memory corruption problem caused by unsynchronized parallelism.

Both dom0 and domU use our UCS-2.6.32 kernel, which is equivalent to Debian 2.6.32-5-amd64 plus some more patches for an e1000 backport and some KVM fixes. I also tested the Debian kernel and observed the same problem, so the problem should not be specific to our kernel, but should affect Debian's as well.

Has anyone observed or does anyone know of any similar problems?
Is there anything more I can do to narrow down the problem?
Sincerely
Philipp Hahn
-- 
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Philipp Hahn
2011-Dec-02 12:51 UTC
Re: RFH: Corruption with blktap2 on Debian 2.6.32-39 + xen-4.1.2
Hello,

On Tuesday 29 November 2011 15:30:11 Philipp Hahn wrote:
> I have observed several strange blktap2(?) corruption problems using
> Xen-4.1.2 on several 2.6.32-39 based amd64 Linux Systems. I run an
> installation of a domain, which use 3 blktap2 devices: 2× 20 GiB hard disk
> image files and 1× 1.1 GiB DVD iso file. During installation processes
> start to SEGV, which aborts the installation.
...
> * In one try the domU switched to read-only after file-system corruption.
> * I tested three HW systems: On 2 of them the problem always manifests, on
> a 3rd the corruption wasn't observed yet. All three have Intel i7 CPUs and
> 8 GiB RAM.

It seems to depend on some CPU feature flags:

lynx1: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf pni pclmulqdq est ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm ida arat

xen14: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf pni est ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm ida

On 'lynx1' I can reproduce the crash, but not on 'xen14'. With linux-3.1.x the problem is also gone, but not with 2.6.32.

Any ideas how to narrow this down further?

Sincerely
Philipp Hahn
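
The two flag lists above are long enough that the difference is hard to spot by eye. A minimal sketch of a set comparison follows, using the flag strings copied verbatim from the message; the names `LYNX1`, `XEN14`, and `flag_diff` are made up for illustration. It only shows which flags differ between the two hosts and says nothing about which of them, if any, is related to the corruption.

```python
# Flag lists copied verbatim from the message above.
LYNX1 = (
    "fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx "
    "fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc "
    "aperfmperf pni pclmulqdq est ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes "
    "hypervisor lahf_lm ida arat"
)
XEN14 = (
    "fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx "
    "fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc "
    "aperfmperf pni est ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm ida"
)


def flag_diff(a, b):
    """Return (flags only in a, flags only in b), each sorted alphabetically."""
    set_a, set_b = set(a.split()), set(b.split())
    return sorted(set_a - set_b), sorted(set_b - set_a)


only_lynx1, only_xen14 = flag_diff(LYNX1, XEN14)
print("only on lynx1:", " ".join(only_lynx1) or "(none)")
print("only on xen14:", " ".join(only_xen14) or "(none)")
# only on lynx1: aes arat pclmulqdq x2apic
# only on xen14: (none)
```

On a live system the same function could be fed the `flags` line of `/proc/cpuinfo` from each host instead of the hard-coded strings.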