Konrad Rzeszutek Wilk
2012-Dec-06 03:14 UTC
LVM Checksum error when using persistent grants (#linux-next + stable/for-jens-3.8)
Hey Roger,

I am seeing this weird behavior when using the #linux-next + stable/for-jens-3.8 tree.

Basically I can do 'pvscan' on an xvd* disk and quite often I get checksum errors:

# pvscan /dev/xvdf
  PV /dev/xvdf2   VG VolGroup00        lvm2 [18.88 GiB / 0    free]
  PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
  PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
  PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
  PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
  Total: 5 [962.38 GiB] / in use: 5 [962.38 GiB] / in no VG: 0 [0   ]
# pvscan /dev/xvdf
  /dev/xvdf2: Checksum error
  Couldn't read volume group metadata.
  /dev/xvdf2: Checksum error
  Couldn't read volume group metadata.
  PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
  PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
  PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
  PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
  Total: 4 [943.50 GiB] / in use: 4 [943.50 GiB] / in no VG: 0 [0   ]

This is with an i386 dom0, a 64-bit Xen 4.1.3 hypervisor, and either a
64-bit or 32-bit PV or PVHVM guest.

Have you seen something like this?

Note, the other LV disks are over iSCSI and are working fine.
Roger Pau Monné
2012-Dec-07 10:08 UTC
Re: LVM Checksum error when using persistent grants (#linux-next + stable/for-jens-3.8)
On 06/12/12 04:14, Konrad Rzeszutek Wilk wrote:
> Hey Roger,
>
> I am seeing this weird behavior when using the #linux-next + stable/for-jens-3.8 tree.
>
> Basically I can do 'pvscan' on an xvd* disk and quite often I get checksum errors:
>
> # pvscan /dev/xvdf
>   PV /dev/xvdf2   VG VolGroup00        lvm2 [18.88 GiB / 0    free]
>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>   Total: 5 [962.38 GiB] / in use: 5 [962.38 GiB] / in no VG: 0 [0   ]
> # pvscan /dev/xvdf
>   /dev/xvdf2: Checksum error
>   Couldn't read volume group metadata.
>   /dev/xvdf2: Checksum error
>   Couldn't read volume group metadata.
>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>   Total: 4 [943.50 GiB] / in use: 4 [943.50 GiB] / in no VG: 0 [0   ]
>
> This is with an i386 dom0, a 64-bit Xen 4.1.3 hypervisor, and either a
> 64-bit or 32-bit PV or PVHVM guest.
>
> Have you seen something like this?
>
> Note, the other LV disks are over iSCSI and are working fine.

Thanks for the report, Konrad. I'm able to reproduce this:

root@debian:~# pvscan -d -v /dev/xvdb2
    Wiping cache of LVM-capable devices
    Wiping internal VG cache
    Walking through all physical volumes
  PV /dev/xvdb2   lvm2 [4.99 GiB]
  Total: 1 [4.99 GiB] / in use: 0 [0   ] / in no VG: 1 [4.99 GiB]
root@debian:~# pvscan -d -v /dev/xvdb2
    Wiping cache of LVM-capable devices
    Wiping internal VG cache
    Walking through all physical volumes
  No matching physical volumes found

What I find strange is that this only happens when using partitions as
LVM PVs; if I use the full disk (/dev/xvdb) as a PV I'm not able to
reproduce it. I will investigate further.
Konrad Rzeszutek Wilk
2012-Dec-07 14:22 UTC
Re: LVM Checksum error when using persistent grants (#linux-next + stable/for-jens-3.8)
On Wed, Dec 05, 2012 at 10:14:55PM -0500, Konrad Rzeszutek Wilk wrote:
> Hey Roger,
>
> I am seeing this weird behavior when using the #linux-next + stable/for-jens-3.8 tree.

To make it easier I just used v3.7-rc8 and merged the stable/for-jens-3.8
tree.

> Basically I can do 'pvscan' on an xvd* disk and quite often I get checksum errors:
>
> # pvscan /dev/xvdf
>   PV /dev/xvdf2   VG VolGroup00        lvm2 [18.88 GiB / 0    free]
>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>   Total: 5 [962.38 GiB] / in use: 5 [962.38 GiB] / in no VG: 0 [0   ]
> # pvscan /dev/xvdf
>   /dev/xvdf2: Checksum error
>   Couldn't read volume group metadata.
>   /dev/xvdf2: Checksum error
>   Couldn't read volume group metadata.
>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>   Total: 4 [943.50 GiB] / in use: 4 [943.50 GiB] / in no VG: 0 [0   ]
>
> This is with an i386 dom0, a 64-bit Xen 4.1.3 hypervisor, and either a
> 64-bit or 32-bit PV or PVHVM guest.

And it does not matter if dom0 is 64-bit.

> Have you seen something like this?

More interestingly, the failure is in the frontend. I ran the "new"
guests that do persistent grants against the old backends (so virgin
v3.7-rc8) and still got the same failure.

> Note, the other LV disks are over iSCSI and are working fine.
Roger Pau Monné
2012-Dec-07 17:05 UTC
Re: LVM Checksum error when using persistent grants (#linux-next + stable/for-jens-3.8)
On 07/12/12 15:22, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 05, 2012 at 10:14:55PM -0500, Konrad Rzeszutek Wilk wrote:
>> Hey Roger,
>>
>> I am seeing this weird behavior when using the #linux-next + stable/for-jens-3.8 tree.
>
> To make it easier I just used v3.7-rc8 and merged the stable/for-jens-3.8
> tree.
>
>> Basically I can do 'pvscan' on an xvd* disk and quite often I get checksum errors:
>>
>> # pvscan /dev/xvdf
>>   PV /dev/xvdf2   VG VolGroup00        lvm2 [18.88 GiB / 0    free]
>>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>>   Total: 5 [962.38 GiB] / in use: 5 [962.38 GiB] / in no VG: 0 [0   ]
>> # pvscan /dev/xvdf
>>   /dev/xvdf2: Checksum error
>>   Couldn't read volume group metadata.
>>   /dev/xvdf2: Checksum error
>>   Couldn't read volume group metadata.
>>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>>   Total: 4 [943.50 GiB] / in use: 4 [943.50 GiB] / in no VG: 0 [0   ]
>>
>> This is with an i386 dom0, a 64-bit Xen 4.1.3 hypervisor, and either a
>> 64-bit or 32-bit PV or PVHVM guest.
>
> And it does not matter if dom0 is 64-bit.
>
>> Have you seen something like this?
>
> More interestingly, the failure is in the frontend. I ran the "new"
> guests that do persistent grants against the old backends (so virgin
> v3.7-rc8) and still got the same failure.
>
>> Note, the other LV disks are over iSCSI and are working fine.

I've found the problem; this happens when only a part of the shared data
is copied in blkif_completion. Here is an example of the problem:

1st loop in rq_for_each_segment:
 * bv_offset: 3584
 * bv_len: 512
 * offset += bv_len
 * i: 0

2nd loop:
 * bv_offset: 0
 * bv_len: 512
 * i: 0

As you can see, in the second loop i is still 0 (because offset is only
512, so 512 >> PAGE_SHIFT is 0) when it should be 1.

This problem made me realize another corner case, which I don't know if
it can happen; AFAIK I've never seen it:

1st loop in rq_for_each_segment:
 * bv_offset: 1024
 * bv_len: 512
 * offset += bv_len
 * i: 0

2nd loop:
 * bv_offset: 0
 * bv_len: 512
 * i: 0

In this second case, should i be 1? Can this really happen? I can't see
any way to get a "global offset" or something similar that's not
relative to the bvec being handled right now.

For the problem that you described a quick fix follows, but it doesn't
cover the second case exposed above:

---
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index df21b05..6e155d0 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -869,7 +871,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 			       bvec->bv_len);
 			bvec_kunmap_irq(bvec_data, &flags);
 			kunmap_atomic(shared_data);
-			offset += bvec->bv_len;
+			offset = (i * PAGE_SIZE) + (bvec->bv_offset + bvec->bv_len);
 		}
 	}
 	/* Add the persistent grant into the list of free grants */
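[Editor's note: to make the indexing arithmetic above concrete, here is a
minimal user-space C sketch. It is illustrative only, not code from either
tree; the two 512-byte segments are the assumed values from Roger's first
example. It replays the loop and prints the shared-page index i as computed
by the buggy "offset += bv_len" accounting next to the index computed by the
fixed "offset = (i * PAGE_SIZE) + bv_offset + bv_len" accounting.]

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* A stripped-down stand-in for struct bio_vec. */
struct seg { unsigned long bv_offset, bv_len; };

int main(void)
{
	/* Assumed layout: the first bvec ends exactly at the page
	 * boundary (3584 + 512 = 4096), the second starts a new page. */
	struct seg segs[] = { { 3584, 512 }, { 0, 512 } };
	unsigned long buggy_off = 0, fixed_off = 0;

	for (int n = 0; n < 2; n++) {
		unsigned long buggy_i = buggy_off >> PAGE_SHIFT;
		unsigned long fixed_i = fixed_off >> PAGE_SHIFT;

		printf("segment %d: buggy i = %lu, fixed i = %lu\n",
		       n, buggy_i, fixed_i);

		/* Buggy accounting: only accumulates copied lengths,
		 * so it never reaches PAGE_SIZE here. */
		buggy_off += segs[n].bv_len;
		/* Fixed accounting: advances to the end of this bvec
		 * within the current page, so crossing a page boundary
		 * bumps the index on the next iteration. */
		fixed_off = (fixed_i * PAGE_SIZE) +
			    segs[n].bv_offset + segs[n].bv_len;
	}
	return 0;
}

Built with any C compiler (e.g. cc sketch.c && ./a.out), it prints i = 0
for both segments under the buggy accounting, but i = 1 for the second
segment under the fix, matching the trace in the mail: the second bvec's
data lives in the next shared page, so indexing page 0 again copies stale
bytes and corrupts the LVM metadata checksum.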