thr3ads.net - Gluster users - [Gluster-users] File Corruption with shards

Krutika Dhananjay

2015-Nov-14 07:30 UTC

[Gluster-users] File Corruption with shards - 100% reproducable

You should be able to find a file named group-virt.example under /etc/glusterfs/
Copy that as /var/lib/glusterd/virt. 

Then execute `gluster volume set datastore1 group virt`. 
Now with this configuration, could you try your test case and let me know
whether the file corruption still exists?

-Krutika 
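
A minimal sketch of the two steps above as a POSIX shell helper. It assumes glusterd reads group profiles from /var/lib/glusterd/groups/, which is the path named in the error message quoted further down this thread, not the /var/lib/glusterd/virt destination given above. install_virt_group is a hypothetical helper name, not part of Gluster, and the optional second argument exists only so the copy can be tried against a scratch directory.

```shell
#!/bin/sh
# install_virt_group SRC [GROUPS_DIR]
# Copy an example profile to the file name that
# `gluster volume set <VOL> group virt` tries to open.
# GROUPS_DIR defaults to the directory seen in the error message in
# this thread (an assumption about where glusterd looks).
install_virt_group() {
    src=$1
    groups_dir=${2:-/var/lib/glusterd/groups}

    if [ ! -f "$src" ]; then
        echo "profile not found: $src" >&2
        return 1
    fi

    # The profile must be named after the group, here "virt".
    mkdir -p "$groups_dir" && cp "$src" "$groups_dir/virt"
}

# Example, run as root on a glusterd node (volume name from this thread):
#   install_virt_group /etc/glusterfs/group-virt.example
#   gluster volume set datastore1 group virt
```

If the example file is not shipped by the distribution package, the same helper can be pointed at a copy downloaded from the GlusterFS source tree.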

----- Original Message -----
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Saturday, November 14, 2015 10:51:26 AM
> Subject: RE: [Gluster-users] File Corruption with shards - 100% reproducable
> gluster volume set datastore1 group virt
> Unable to open file '/var/lib/glusterd/groups/virt'. Error: No such file or directory
> Not sure I understand this one; couldn't find any docs for it.
> From: Krutika Dhananjay
> Sent: Saturday, 14 November 2015 1:45 PM
> To: Lindsay Mathieson
> Cc: gluster-users
> Subject: Re: [Gluster-users] File Corruption with shards - 100% reproducable
> The logs are at /var/log/glusterfs/<hyphenated-path-to-the-mountpoint>.log
> OK. So what do you observe when you set group virt to on?
> # gluster volume set <VOL> group virt
> -Krutika
> > From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> > To: "Krutika Dhananjay" <kdhananj at redhat.com>
> > Cc: "gluster-users" <gluster-users at gluster.org>
> > Sent: Friday, November 13, 2015 11:57:15 AM
> > Subject: Re: [Gluster-users] File Corruption with shards - 100% reproducable
>
> > On 12 November 2015 at 15:46, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> 
> > > OK. What do the client logs say?
>
> > Dumb question - which logs are those?
>
> > > Could you share the exact steps to recreate this, and I will try it locally
> > > on my setup?
>
> > I'm running this on a 3 node proxmox cluster, which makes the vm creation &
> > migration easy to test.
> 
> > Steps:
> > - Create a 3 node gluster datastore using the proxmox vm host nodes
> > - Add the gluster datastore as a storage device to proxmox
> >   * qemu vms use gfapi to access the datastore
> >   * proxmox also adds a fuse mount for easy access
> > - Create a VM on the gluster storage, QCOW2 format. I just created a simple
> >   Debian MATE vm
> > - Start the vm, open a console to it.
> > - Live migrate the VM to another node
> > - It will rapidly barf itself with disk errors
> > - Stop the VM
> > - qemu will show file corruption (many many errors)
> >   * qemu-img check <vm disk image>
> >   * qemu-img info <vm disk image>
>
> > Repeating the process with sharding off has no errors.
> 
> > > Also, want to see the output of 'gluster volume info'.
>
> > I've trimmed settings down to a bare minimum. This is a test gluster cluster
> > so I can do with it as I wish.
>
> > gluster volume info
>
> > Volume Name: datastore1
> > Type: Replicate
> > Volume ID: 238fddd0-a88c-4edb-8ac5-ef87c58682bf
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: vnb.proxmox.softlog:/mnt/ext4
> > Brick2: vng.proxmox.softlog:/mnt/ext4
> > Brick3: vna.proxmox.softlog:/mnt/ext4
> > Options Reconfigured:
> > performance.strict-write-ordering: on
> > performance.readdir-ahead: off
> > cluster.quorum-type: auto
> > features.shard: on
>
> > --
> > Lindsay
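
Two small helpers for collecting the evidence asked for in the quoted exchange above, as a hedged sketch. client_log_path turns a FUSE mount point into the client log file name following the <hyphenated-path-to-the-mountpoint> pattern given above, and describe_check_status labels the exit status of `qemu-img check` as documented in the qemu-img man page. Both function names and the mount point in the usage comment are hypothetical; only the log directory and the commands come from the thread.

```shell
#!/bin/sh
# client_log_path MOUNTPOINT
# Drop leading and trailing "/" and turn every remaining "/" into "-".
client_log_path() {
    name=$(printf '%s' "$1" | sed -e 's|^/*||' -e 's|/*$||' -e 's|/|-|g')
    printf '/var/log/glusterfs/%s.log\n' "$name"
}

# describe_check_status CODE
# Translate the exit status of `qemu-img check` into a short message.
describe_check_status() {
    case "$1" in
        0)  echo "no errors found" ;;
        1)  echo "check not completed (internal error)" ;;
        2)  echo "image is corrupted" ;;
        3)  echo "leaked clusters found, image not corrupted" ;;
        63) echo "check not supported for this image format" ;;
        *)  echo "unexpected exit status $1" ;;
    esac
}

# Example (mount point and image path are placeholders):
#   tail -n 100 "$(client_log_path /mnt/pve/datastore1)"
#   qemu-img check "$image"; describe_check_status $?
```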

Lindsay Mathieson

2015-Nov-14 11:26 UTC

[Gluster-users] File Corruption with shards - 100% reproducable

On 14 November 2015 at 17:30, Krutika Dhananjay <kdhananj at redhat.com>
wrote:
> You should be able to find a file named group-virt.example under
> /etc/glusterfs/
> Copy that as /var/lib/glusterd/virt.
>


Doesn't seem to exist in the debian jessie apt repo, but I copied it from
here:


https://raw.githubusercontent.com/gluster/glusterfs/master/extras/group-virt.example

And I think it needed to go here:

  /var/lib/glusterd/groups/virt

However the good news is, I applied it and it seems to have made the
difference. I'm now freely migrating my test VM with no corruption. Will
stress test it a bit more.

Current settings are now:
gluster volume info

Volume Name: datastore1
Type: Replicate
Volume ID: 238fddd0-a88c-4edb-8ac5-ef87c58682bf
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/mnt/ext4
Brick2: vng.proxmox.softlog:/mnt/ext4
Brick3: vna.proxmox.softlog:/mnt/ext4
Options Reconfigured:
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
cluster.quorum-type: auto
performance.readdir-ahead: off
features.shard: on
features.shard-block-size: 128MB
performance.strict-write-ordering: on
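
To see exactly which options the change introduced, the "Options Reconfigured" sections of two saved `gluster volume info` outputs can be compared. This is a sketch only: reconfigured_options and new_options are hypothetical helper names, and each input file is assumed to hold the output for a single volume.

```shell
#!/bin/sh
# reconfigured_options: read `gluster volume info` output on stdin and
# print the "key: value" lines of the "Options Reconfigured" section, sorted.
reconfigured_options() {
    awk '
        /^Options Reconfigured:/ { in_opts = 1; next }
        /^[ \t]*$/               { in_opts = 0 }
        in_opts && /: /          { print }
    ' | LC_ALL=C sort
}

# new_options BEFORE_FILE AFTER_FILE
# Print the options that appear (or changed value) only in AFTER_FILE.
new_options() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    reconfigured_options < "$1" > "$before_sorted"
    reconfigured_options < "$2" > "$after_sorted"
    # comm -13 keeps only the lines unique to the second file.
    LC_ALL=C comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Example:
#   gluster volume info datastore1 > before.txt
#   ... apply the group profile ...
#   gluster volume info datastore1 > after.txt
#   new_options before.txt after.txt
```

For the two listings in this thread, the comparison yields eight new options: cluster.eager-lock, cluster.server-quorum-type, features.shard-block-size, network.remote-dio, and the four performance.* translators that are now off.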


Thanks for all the help. If you like, I can start unsetting settings until we
discover which one does the trick.

-- 
Lindsay
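
One way the unsetting could be organised, as a hedged sketch: print a `gluster volume reset` command for each option so they can be applied one at a time, rerunning the live-migration test and `qemu-img check` after each. reset_plan is a hypothetical helper that only prints commands; it does not run them.

```shell
#!/bin/sh
# reset_plan VOLUME
# Read "key: value" option lines on stdin and print, for each one, the
# command that would return that option to its default on VOLUME.
reset_plan() {
    vol=$1
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        key=${line%%:*}   # everything before the first ":"
        printf 'gluster volume reset %s %s\n' "$vol" "$key"
    done
}

# Example, using two of the options from the listing above:
#   printf '%s\n' 'network.remote-dio: enable' 'cluster.eager-lock: enable' \
#       | reset_plan datastore1
```

Resetting a single option per test run keeps the result attributable; if corruption returns after a given reset, that option is the likely candidate.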