Humble Devassy Chirammal
2015-Nov-13 10:01 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
Hi Lindsay,

> - start the vm, open a console to it.
> - live migrate the VM to another node
> - It will rapidly barf itself with disk errors

Can you please share which 'cache' option (none, writeback, writethrough, etc.) has been set for I/O on this problematic VM? This can be fetched either from the process output or from the XML schema of the VM.

--Humble

On Fri, Nov 13, 2015 at 11:57 AM, Lindsay Mathieson <lindsay.mathieson at gmail.com> wrote:
>
> On 12 November 2015 at 15:46, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
>> OK. What do the client logs say?
>
> Dumb question - which logs are those?
>
>> Could you share the exact steps to recreate this, and I will try it
>> locally on my setup?
>
> I'm running this on a 3 node Proxmox cluster, which makes VM creation
> & migration easy to test.
>
> Steps:
> - Create a 3 node gluster datastore using the Proxmox VM host nodes
>
> - Add the gluster datastore as a storage device to Proxmox
>   * qemu VMs use gfapi to access the datastore
>   * Proxmox also adds a fuse mount for easy access
>
> - Create a VM on the gluster storage, QCOW2 format. I just created a
>   simple Debian MATE VM
>
> - Start the VM, open a console to it.
>
> - Live migrate the VM to another node
>
> - It will rapidly barf itself with disk errors
>
> - Stop the VM
>
> - qemu will show file corruption (many, many errors)
>   * qemu-img check <vm disk image>
>   * qemu-img info <vm disk image>
>
> Repeating the process with sharding off produces no errors.
>
>> Also, I want to see the output of 'gluster volume info'.
>
> I've trimmed the settings down to a bare minimum. This is a test gluster
> cluster, so I can do with it as I wish.
>
> gluster volume info
>
> Volume Name: datastore1
> Type: Replicate
> Volume ID: 238fddd0-a88c-4edb-8ac5-ef87c58682bf
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: vnb.proxmox.softlog:/mnt/ext4
> Brick2: vng.proxmox.softlog:/mnt/ext4
> Brick3: vna.proxmox.softlog:/mnt/ext4
> Options Reconfigured:
> performance.strict-write-ordering: on
> performance.readdir-ahead: off
> cluster.quorum-type: auto
> features.shard: on
>
> --
> Lindsay
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
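The steps quoted above map roughly onto the command-line sketch below. The volume name, brick hosts and options are taken from the 'gluster volume info' output; the VM ID (100) and the image path under Proxmox's fuse mount are assumptions made for illustration only, and the Proxmox side (adding the storage, creating the Debian VM, live migrating it) is done through the Proxmox GUI/CLI and is not shown.

    # Create and tune the 3-way replica volume on the Proxmox nodes
    gluster volume create datastore1 replica 3 \
        vnb.proxmox.softlog:/mnt/ext4 \
        vng.proxmox.softlog:/mnt/ext4 \
        vna.proxmox.softlog:/mnt/ext4
    gluster volume set datastore1 features.shard on
    gluster volume set datastore1 cluster.quorum-type auto
    gluster volume set datastore1 performance.readdir-ahead off
    gluster volume set datastore1 performance.strict-write-ordering on
    gluster volume start datastore1

    # After the live migration fails, stop the VM and check the QCOW2
    # image via the fuse mount (path is an assumed Proxmox default)
    qemu-img check /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2
    qemu-img info  /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2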
Lindsay Mathieson
2015-Nov-13 10:21 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
On 13 November 2015 at 20:01, Humble Devassy Chirammal <humble.devassy at gmail.com> wrote:

> Can you please share which 'cache' option (none, writeback, writethrough,
> etc.) has been set for I/O on this problematic VM? This can be fetched
> either from the process output or from the XML schema of the VM.

Writeback. I believe I did test it with writethrough, but I will test again.

--
Lindsay
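For reference, on a Proxmox node the cache mode can be confirmed either from the running qemu/kvm process or from the VM config; Proxmox does not manage VMs through libvirt, so there is no domain XML to dump. The VM ID (100) below is an assumption.

    # Print the full kvm command line Proxmox generates for the VM and
    # look for cache= in the -drive option
    qm showcmd 100 | tr ',' '\n' | grep cache

    # Or check the VM config directly; no cache= entry on the disk line
    # means the default mode (none)
    grep cache /etc/pve/qemu-server/100.conf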
Lindsay Mathieson
2015-Nov-13 11:02 UTC
[Gluster-users] File Corruption with shards - 100% reproducible
On 13 November 2015 at 20:01, Humble Devassy Chirammal <humble.devassy at gmail.com> wrote:

> Can you please share which 'cache' option (none, writeback, writethrough,
> etc.) has been set for I/O on this problematic VM? This can be fetched
> either from the process output or from the XML schema of the VM.

I tried it with cache off and cache = sync. Both times the image was corrupted.

--
Lindsay
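For the record, Proxmox's disk cache choices map onto qemu's -drive cache= modes (no cache -> none, direct sync -> directsync, write through -> writethrough, write back -> writeback), so the two runs above presumably correspond to cache=none and cache=directsync. A rough sketch of switching the mode for another test run, assuming VM ID 100 and a virtio disk named virtio0 on the datastore1 storage (both illustrative):

    # Re-assign the disk with an explicit cache mode, then repeat the
    # live migration and re-check the image afterwards
    qm set 100 --virtio0 datastore1:100/vm-100-disk-1.qcow2,cache=directsync
    qemu-img check /mnt/pve/datastore1/images/100/vm-100-disk-1.qcow2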