Krutika Dhananjay
2015-Nov-12 05:46 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
OK. What do the client logs say? Could you share the exact steps to recreate this, and I will try it locally on my setup? Also, I want to see the output of 'gluster volume info'.

-Krutika

----- Original Message -----
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Thursday, November 12, 2015 11:04:51 AM
> Subject: Re: [Gluster-users] File Corruption with shards - 100% reproducible
>
> On 5 November 2015 at 21:55, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> > Although I do not have experience with VM live migration, IIUC, it has got
> > to do with a different server (and as a result a new glusterfs client
> > process) taking over the operations and mgmt of the VM.
> >
> > If this is a correct assumption, then I think this could be the result of
> > the same caching bug that I talked about some time back in 3.7.5, which is
> > fixed in 3.7.6.
> >
> > The issue could cause the new client to not see the correct size and block
> > count of the file, leading to errors in reads (perhaps triggered by the
> > restart of the VM) and writes on the image.
>
> Unfortunately this problem is still occurring with 3.7.6, 100% of the time.
> Tried with shards disabled and there was no problem.
>
> --
> Lindsay
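[Editor's note: for readers following along, a minimal sketch of where the GlusterFS client logs live, assuming default log paths. The FUSE client log file is named after the mount point; the mount path below is hypothetical (it follows the usual Proxmox layout and does not appear in the thread):

    # FUSE client log: mount path with '/' replaced by '-', under /var/log/glusterfs.
    # E.g. a mount at /mnt/pve/datastore1 (hypothetical) logs to:
    less /var/log/glusterfs/mnt-pve-datastore1.log

    # Brick and server-side daemon logs live on each storage node:
    ls /var/log/glusterfs/bricks/

Note that qemu's gfapi client does not use the FUSE mount, so its client-side messages may need to be located separately.]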
Lindsay Mathieson
2015-Nov-13 06:27 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
On 12 November 2015 at 15:46, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> OK. What do the client logs say?

Dumb question - which logs are those?

> Could you share the exact steps to recreate this, and I will try it locally
> on my setup?

I'm running this on a 3-node Proxmox cluster, which makes the VM creation & migration easy to test.

Steps:
- Create a 3-node gluster datastore using the Proxmox VM host nodes (see the volume-setup sketch after this message)
- Add the gluster datastore as a storage device to Proxmox
  * qemu VMs use the gfapi to access the datastore
  * Proxmox also adds a FUSE mount for easy access
- Create a VM on the gluster storage, QCOW2 format. I just created a simple Debian MATE VM
- Start the VM, open a console to it
- Live migrate the VM to another node
- It will rapidly barf itself with disk errors
- Stop the VM
- qemu will show file corruption (many, many errors); see the check example after this message
  * qemu-img check <vm disk image>
  * qemu-img info <vm disk image>

Repeating the process with sharding off produces no errors.

> Also, I want to see the output of 'gluster volume info'.

I've trimmed the settings down to a bare minimum. This is a test gluster cluster, so I can do with it as I wish.

gluster volume info

Volume Name: datastore1
Type: Replicate
Volume ID: 238fddd0-a88c-4edb-8ac5-ef87c58682bf
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/mnt/ext4
Brick2: vng.proxmox.softlog:/mnt/ext4
Brick3: vna.proxmox.softlog:/mnt/ext4
Options Reconfigured:
performance.strict-write-ordering: on
performance.readdir-ahead: off
cluster.quorum-type: auto
features.shard: on

--
Lindsay
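[Editor's note: a minimal sketch of the volume setup described in the steps above, reconstructed from the 'gluster volume info' output in this message. The hostnames, brick paths, and option values are taken from the thread; 'force' may be required if the bricks sit directly on a mount point:

    gluster volume create datastore1 replica 3 \
        vnb.proxmox.softlog:/mnt/ext4 \
        vng.proxmox.softlog:/mnt/ext4 \
        vna.proxmox.softlog:/mnt/ext4

    # Options matching the reported configuration:
    gluster volume set datastore1 features.shard on
    gluster volume set datastore1 cluster.quorum-type auto
    gluster volume set datastore1 performance.readdir-ahead off
    gluster volume set datastore1 performance.strict-write-ordering on
    gluster volume start datastore1
]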
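[Editor's note: and a sketch of the corruption check itself, run against the image on the FUSE mount with the VM stopped. The image path is hypothetical (it follows the usual Proxmox storage layout and does not appear in the thread):

    qemu-img info  /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2
    qemu-img check /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2
]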