Krutika Dhananjay
2015-Nov-12 05:46 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
OK. What do the client logs say? Could you share the exact steps to recreate this, and I will try it locally on my setup? Also, I want to see the output of 'gluster volume info'.

-Krutika

----- Original Message -----
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Thursday, November 12, 2015 11:04:51 AM
> Subject: Re: [Gluster-users] File Corruption with shards - 100% reproducible
>
> On 5 November 2015 at 21:55, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> > Although I do not have experience with VM live migration, IIUC, it has got
> > to do with a different server (and as a result a new glusterfs client
> > process) taking over the operations and mgmt of the VM.
> >
> > If this is a correct assumption, then I think this could be the result of
> > the same caching bug that I talked about some time back in 3.7.5, which is
> > fixed in 3.7.6.
> >
> > The issue could cause the new client to not see the correct size and block
> > count of the file, leading to errors in reads (perhaps triggered by the
> > restart of the VM) and writes on the image.
>
> Unfortunately this problem is still occurring with 3.7.6, 100% of the time.
> Tried with shards disabled and there was no problem.
>
> --
> Lindsay
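[Editor's note: for readers following along, a minimal sketch of where the GlusterFS client logs live, assuming default log paths. The FUSE client log file is named after the mount point; the mount path below is hypothetical (it follows the usual Proxmox layout and does not appear in the thread):

    # FUSE client log: mount path with '/' replaced by '-', under /var/log/glusterfs.
    # E.g. a mount at /mnt/pve/datastore1 (hypothetical) logs to:
    less /var/log/glusterfs/mnt-pve-datastore1.log

    # Brick and server-side daemon logs live on each storage node:
    ls /var/log/glusterfs/bricks/

Note that qemu's gfapi client does not use the FUSE mount, so its client-side messages may need to be located separately.]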
Lindsay Mathieson
2015-Nov-13 06:27 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
On 12 November 2015 at 15:46, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> OK. What do the client logs say?

Dumb question - which logs are those?

> Could you share the exact steps to recreate this, and I will try it locally
> on my setup?

I'm running this on a 3-node Proxmox cluster, which makes the VM creation & migration easy to test.

Steps:
- Create a 3-node gluster datastore using the Proxmox VM host nodes (see the volume-setup sketch after this message)
- Add the gluster datastore as a storage device to Proxmox
  * qemu VMs use the gfapi to access the datastore
  * Proxmox also adds a FUSE mount for easy access
- Create a VM on the gluster storage, QCOW2 format. I just created a simple Debian MATE VM
- Start the VM, open a console to it
- Live migrate the VM to another node
- It will rapidly barf itself with disk errors
- Stop the VM
- qemu will show file corruption (many, many errors); see the check example after this message
  * qemu-img check <vm disk image>
  * qemu-img info <vm disk image>

Repeating the process with sharding off produces no errors.

> Also, I want to see the output of 'gluster volume info'.

I've trimmed the settings down to a bare minimum. This is a test gluster cluster, so I can do with it as I wish.

gluster volume info

Volume Name: datastore1
Type: Replicate
Volume ID: 238fddd0-a88c-4edb-8ac5-ef87c58682bf
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/mnt/ext4
Brick2: vng.proxmox.softlog:/mnt/ext4
Brick3: vna.proxmox.softlog:/mnt/ext4
Options Reconfigured:
performance.strict-write-ordering: on
performance.readdir-ahead: off
cluster.quorum-type: auto
features.shard: on

--
Lindsay
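[Editor's note: a minimal sketch of the volume setup described in the steps above, reconstructed from the 'gluster volume info' output in this message. The hostnames, brick paths, and option values are taken from the thread; 'force' may be required if the bricks sit directly on a mount point:

    gluster volume create datastore1 replica 3 \
        vnb.proxmox.softlog:/mnt/ext4 \
        vng.proxmox.softlog:/mnt/ext4 \
        vna.proxmox.softlog:/mnt/ext4

    # Options matching the reported configuration:
    gluster volume set datastore1 features.shard on
    gluster volume set datastore1 cluster.quorum-type auto
    gluster volume set datastore1 performance.readdir-ahead off
    gluster volume set datastore1 performance.strict-write-ordering on
    gluster volume start datastore1
]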
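[Editor's note: and a sketch of the corruption check itself, run against the image on the FUSE mount with the VM stopped. The image path is hypothetical (it follows the usual Proxmox storage layout and does not appear in the thread):

    qemu-img info  /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2
    qemu-img check /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2
]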