Lindsay Mathieson
2015-Nov-14 05:21 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
gluster volume set datastore1 group virt
Unable to open file '/var/lib/glusterd/groups/virt'. Error: No such file or directory

Not sure I understand this one; couldn't find any docs for it.

From: Krutika Dhananjay
Sent: Saturday, 14 November 2015 1:45 PM
To: Lindsay Mathieson
Cc: gluster-users
Subject: Re: [Gluster-users] File Corruption with shards - 100% reproducible

> The logs are at /var/log/glusterfs/<hyphenated-path-to-the-mountpoint>.log
>
> OK. So what do you observe when you set group virt to on?
>
> # gluster volume set <VOL> group virt
>
> -Krutika
>
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Friday, November 13, 2015 11:57:15 AM
> Subject: Re: [Gluster-users] File Corruption with shards - 100% reproducible
>
>> On 12 November 2015 at 15:46, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>>> OK. What do the client logs say?
>>
>> Dumb question - which logs are those?
>>
>>> Could you share the exact steps to recreate this, and I will try it
>>> locally on my setup?
>>
>> I'm running this on a 3-node Proxmox cluster, which makes the VM creation
>> and migration easy to test.
>>
>> Steps:
>> - Create a 3-node gluster datastore using the Proxmox VM host nodes
>> - Add the gluster datastore as a storage device to Proxmox
>>   * qemu VMs use gfapi to access the datastore
>>   * Proxmox also adds a FUSE mount for easy access
>> - Create a VM on the gluster storage, QCOW2 format. I just created a
>>   simple Debian MATE VM
>> - Start the VM, open a console to it
>> - Live migrate the VM to another node
>> - It will rapidly barf itself with disk errors
>> - Stop the VM
>> - qemu will show file corruption (many, many errors)
>>   * qemu-img check <vm disk image>
>>   * qemu-img info <vm disk image>
>>
>> Repeating the process with sharding off produces no errors.
>>
>>> Also, want to see the output of 'gluster volume info'.
>>
>> I've trimmed settings down to a bare minimum. This is a test gluster
>> cluster so I can do with it as I wish.
>>
>> gluster volume info
>>
>> Volume Name: datastore1
>> Type: Replicate
>> Volume ID: 238fddd0-a88c-4edb-8ac5-ef87c58682bf
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: vnb.proxmox.softlog:/mnt/ext4
>> Brick2: vng.proxmox.softlog:/mnt/ext4
>> Brick3: vna.proxmox.softlog:/mnt/ext4
>> Options Reconfigured:
>> performance.strict-write-ordering: on
>> performance.readdir-ahead: off
>> cluster.quorum-type: auto
>> features.shard: on
>>
>> --
>> Lindsay
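The verification step in Lindsay's reproduction list can be sketched as a short shell session. The image path below is an assumption for illustration only; the real path depends on the Proxmox storage ID and VM ID, and is reached through the FUSE mount Proxmox creates for the gluster datastore.

```shell
#!/bin/sh
# Hypothetical path: substitute your own Proxmox mount point and VM image.
IMG=/mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2

# Walks the qcow2 metadata and reports corrupt/leaked clusters, if any.
qemu-img check "$IMG"

# Prints format, virtual size, and on-disk size for the image.
qemu-img info "$IMG"
```

Running both commands before and after the live migration makes it easy to confirm that the image was clean beforehand and only became corrupt once sharding plus migration were involved.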
Krutika Dhananjay
2015-Nov-14 07:30 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
You should be able to find a file named group-virt.example under /etc/glusterfs/. Copy it to /var/lib/glusterd/groups/virt. Then execute `gluster volume set datastore1 group virt`.

Now with this configuration, could you try your test case and let me know whether the file corruption still exists?

-Krutika

----- Original Message -----
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Saturday, November 14, 2015 10:51:26 AM
> Subject: RE: [Gluster-users] File Corruption with shards - 100% reproducible
>
> gluster volume set datastore1 group virt
> Unable to open file '/var/lib/glusterd/groups/virt'. Error: No such file or
> directory
>
> Not sure I understand this one; couldn't find any docs for it.
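The fix described above amounts to two commands. This is a sketch assuming the example file ships at the path Krutika names; note that the destination includes the `groups/` subdirectory, since that is the path glusterd's error message says it is looking for.

```shell
#!/bin/sh
# Make sure the directory glusterd searches for option groups exists.
mkdir -p /var/lib/glusterd/groups

# Install the shipped 'virt' option-group template under the name glusterd
# expects ('virt', with no extension).
cp /etc/glusterfs/group-virt.example /var/lib/glusterd/groups/virt

# Apply the whole group of virtualisation-tuned options to the volume
# in one shot instead of setting each option individually.
gluster volume set datastore1 group virt
```

After this, `gluster volume info datastore1` should list the group's options (cache and prefetch translators disabled, eager locking and remote O_DIRECT enabled) under "Options Reconfigured".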