Pranith Kumar Karampuri
2017-Apr-27 11:03 UTC
[Gluster-users] Gluster 3.8.10 rebalance VMs corruption
I am very positive about the two things I told you. These are the latest
things that happened for VM corruption with rebalance.

On Thu, Apr 27, 2017 at 4:30 PM, Gandalf Corvotempesta
<gandalf.corvotempesta at gmail.com> wrote:

> I think we are talking about a different bug.
>
> On 27 Apr 2017 at 12:58 PM, "Pranith Kumar Karampuri"
> <pkarampu at redhat.com> wrote:
>
> I am not a DHT developer, so some of what I say could be a little wrong,
> but this is what I gather. I think they found two classes of bugs in DHT:
>
> 1) Graceful fop failover while rebalance is in progress is missing for
> some fops, which leads to VM pauses. I see that
> https://review.gluster.org/17085 was merged on the 24th on master for
> this, and patches are posted for 3.8.x.
>
> 2) I think some work still needs to be done for dht_[f]xattrop. I believe
> this is the next step, and it is underway.
>
> On Thu, Apr 27, 2017 at 12:13 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> Updates on this critical bug?
>
> On 18 Apr 2017 at 8:24 PM, "Gandalf Corvotempesta"
> <gandalf.corvotempesta at gmail.com> wrote:
>
> Any update? In addition, if this is a different bug but the "workflow" is
> the same as the previous one, how is it possible that fixing the previous
> bug triggered this new one? Is it possible to have some details?
>
> 2017-04-04 16:11 GMT+02:00 Krutika Dhananjay <kdhananj at redhat.com>:
>
> Nope. This is a different bug.
>
> -Krutika
>
> On Mon, Apr 3, 2017 at 5:03 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> This is good news. Is this related to the previously fixed bug?
>
> On 3 Apr 2017 at 10:22 AM, "Krutika Dhananjay" <kdhananj at redhat.com>
> wrote:
>
> So Raghavendra has an RCA for this issue. Copy-pasting his comment here:
>
> <RCA>
>
> Following is a rough algorithm of shard_writev:
>
> 1. Based on the offset, calculate the shards touched by the current write.
> 2. Look for the inodes corresponding to these shard files in the itable.
> 3. If one or more inodes are missing from the itable, issue mknod for the
>    corresponding shard files and ignore EEXIST in the cbk.
> 4. Resume writes on the respective shards.
>
> Now, imagine a write which falls on an existing "shard_file". For the
> sake of discussion, let's consider a distribute of three subvols - s1,
> s2, s3:
>
> 1. "shard_file" hashes to subvolume s2 and is present on s2.
> 2. A subvolume s4 is added and a fix-layout is initiated. The layout of
>    ".shard" is fixed to include s4, and the hash ranges change.
> 3. A write that touches "shard_file" is issued.
> 4. The inode for "shard_file" is not present in the itable after a graph
>    switch, so features/shard issues an mknod.
> 5. With the new layout of .shard, let's say "shard_file" hashes to s3 and
>    mknod(shard_file) on s3 succeeds. But the shard_file is already
>    present on s2.
>
> So we have two files on two different subvols of DHT representing the
> same shard, and this will lead to corruption.
>
> </RCA>
>
> Raghavendra will be sending out a patch in DHT to fix this issue.
>
> -Krutika
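As a rough, illustrative aside (not something from the thread itself): the end state described in the RCA, the same shard existing under more than one replica pair, can be spot-checked by counting how many bricks hold each file under .shard. The sketch below assumes the brick layout shown later in this thread (four hosts, replica 2, bricks under /mnt/disk*/vmware2), passwordless SSH to the brick hosts, and GNU find; adjust to your own setup.

    # Illustrative sketch only: with replica 2, every shard file should live
    # on exactly two bricks (one replica pair). Names appearing more than
    # twice across all bricks would match the duplicate-shard condition
    # described in the RCA above.
    for host in gluster01 gluster02 gluster03 gluster04; do
      ssh "$host" 'find /mnt/disk*/vmware2/.shard -maxdepth 1 -type f -printf "%f\n" 2>/dev/null'
    done | sort | uniq -c | awk '$1 > 2 { print $2 }'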
> On Tue, Mar 28, 2017 at 11:49 PM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>> On Mon, Mar 27, 2017 at 11:29 PM, Mahdi Adnan
>> <mahdi.adnan at outlook.com> wrote:
>>
>> Hi,
>>
>> Do you guys have any update regarding this issue?
>
> I do not actively work on this issue, so I do not have an accurate
> update, but from what I heard from Krutika and Raghavendra (who works on
> DHT): Krutika debugged initially and found that the issue seems more
> likely to be in DHT. Satheesaran, who helped us recreate this issue in
> the lab, found that just fix-layout without rebalance also caused the
> corruption 1 out of 3 times. Raghavendra came up with a possible RCA for
> why this can happen. Raghavendra (CCed) would be the right person to
> provide an accurate update.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Tuesday, March 21, 2017 3:02:55 PM
> Cc: Nithya Balachandran; Gowdappa, Raghavendra; Susant Palai;
>     gluster-users at gluster.org List
>
> Hi,
>
> So it looks like Satheesaran managed to recreate this issue. We will be
> seeking his help in debugging this. It will be easier that way.
>
> -Krutika
>
> On Tue, Mar 21, 2017 at 1:35 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hello, and thank you for your email. Actually no, I didn't check the
> GFIDs of the VMs. If it will help, I can set up a new test cluster and
> get all the data you need.
>
> From: Nithya Balachandran
> Sent: Monday, March 20, 20:57
>
> Hi,
>
> Do you know the GFIDs of the VM images which were corrupted?
>
> Regards,
> Nithya
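For reference on the GFID question above: a file's GFID can be read directly from one of its bricks with getfattr. This is a generic illustration with a made-up image path, not a command taken from the thread.

    # Hypothetical example path; run on a brick host against the brick
    # backend, not the client mount. Prints the trusted.gfid xattr in hex.
    getfattr -n trusted.gfid -e hex /mnt/disk1/vmware2/vm01-flat.vmdk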
> On 20 March 2017 at 20:37, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
> I looked at the logs.
>
> From the time the new graph (since the add-brick command you shared,
> where bricks 41 through 44 are added) is switched to (line 3011 onwards
> in nfs-gfapi.log), I see the following kinds of errors:
>
> 1. Lookups to a bunch of files failed with ENOENT on both replicas,
> which protocol/client converts to ESTALE. I am guessing these entries
> got migrated to other subvolumes, leading to 'No such file or directory'
> errors. DHT and thereafter shard get the same error code and log the
> following:
>
> [2017-03-17 14:04:26.353444] E [MSGID: 109040]
>   [dht-helper.c:1198:dht_migration_complete_check_task] 17-vmware2-dht:
>   <gfid:a68ce411-e381-46a3-93cd-d2af6a7c3532>: failed to lookup the file
>   on vmware2-dht [Stale file handle]
> [2017-03-17 14:04:26.353528] E [MSGID: 133014]
>   [shard.c:1253:shard_common_stat_cbk] 17-vmware2-shard: stat failed:
>   a68ce411-e381-46a3-93cd-d2af6a7c3532 [Stale file handle]
>
> which is fine.
>
> 2. The other kind are from AFR logging of possible split-brain, which I
> suppose is harmless too:
>
> [2017-03-17 14:23:36.968883] W [MSGID: 108008]
>   [afr-read-txn.c:228:afr_read_txn] 17-vmware2-replicate-13: Unreadable
>   subvolume -1 found with event generation 2 for gfid
>   74d49288-8452-40d4-893e-ff4672557ff9. (Possible split-brain)
>
> Since you are saying the bug is hit only on VMs that are undergoing IO
> while rebalance is running (as opposed to those that remained powered
> off), rebalance + IO could be causing some issues.
>
> CC'ing the DHT devs.
>
> Raghavendra/Nithya/Susant, could you take a look?
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 4:55 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Thank you for your email, mate.
>
> Yes, I'm aware of this, but to save costs I chose replica 2; this
> cluster is all flash.
>
> In version 3.7.x I had issues with the ping timeout: if one host went
> down for a few seconds, the whole cluster hung and became unavailable,
> so to avoid this I adjusted the ping timeout to 5 seconds.
>
> As for choosing Ganesha over gfapi, VMware does not support Gluster
> (FUSE or gfapi), so I'm stuck with NFS for this volume. The other volume
> is mounted using gfapi in the oVirt cluster.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Sunday, March 19, 2017 2:01:49 PM
>
> While I'm still going through the logs, I just wanted to point out a
> couple of things:
>
> 1. It is recommended that you use 3-way replication (replica count 3)
> for the VM store use case.
>
> 2. network.ping-timeout at 5 seconds is way too low. Please change it
> to 30.
>
> Is there any specific reason for using NFS-Ganesha over gfapi/FUSE?
>
> Will get back with anything else I might find, or more questions if I
> have any.
>
> -Krutika
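The ping-timeout recommendation above maps to a single volume-set call; a minimal sketch using the volume name from this thread (double-check the option name against your own release before applying).

    # Raise network.ping-timeout from 5 back to the suggested 30 seconds,
    # then read the option back to confirm it took effect.
    gluster volume set vmware2 network.ping-timeout 30
    gluster volume get vmware2 network.ping-timeout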
> On Sun, Mar 19, 2017 at 2:36 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Thanks mate,
>
> Kindly check the attachment.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Sunday, March 19, 2017 10:00:22 AM
>
> In that case, could you share the ganesha-gfapi logs?
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 12:13 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> I have two volumes: one is mounted using libgfapi for the oVirt mount,
> and the other is exported via NFS-Ganesha for VMware, which is the one
> I'm testing now.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Sunday, March 19, 2017 8:02:19 AM
>
>> On Sat, Mar 18, 2017 at 10:36 PM, Mahdi Adnan
>> <mahdi.adnan at outlook.com> wrote:
>>
>> Kindly check the attached new log file. I don't know if it's helpful or
>> not, but I couldn't find the log with the name you described.
>
> No. Are you using FUSE or libgfapi for accessing the volume? Or is it
> NFS?
>
> -Krutika
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Saturday, March 18, 2017 6:10:40 PM
>
> mnt-disk11-vmware2.log seems like a brick log. Could you attach the FUSE
> mount logs? They should be right under the /var/log/glusterfs/
> directory, named after the mount point, only hyphenated.
>
> -Krutika
>
> On Sat, Mar 18, 2017 at 7:27 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hello Krutika,
>
> Kindly check the attached logs.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Saturday, March 18, 2017 3:29:03 PM
>
> Hi Mahdi,
>
> Could you attach the mount, brick and rebalance logs?
>
> -Krutika
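For anyone collecting the logs requested above, the usual default locations on a 3.8-era install look roughly like the following; paths can vary by distribution and by how NFS-Ganesha was set up, so treat this as a hedged guide rather than a definitive map.

    # Typical default locations (verify on your own systems):
    ls /var/log/glusterfs/                       # FUSE mount logs, named after the
                                                 # mount point with '/' replaced by '-'
    ls /var/log/glusterfs/bricks/                # brick logs, e.g. mnt-disk11-vmware2.log
    ls /var/log/glusterfs/vmware2-rebalance.log  # rebalance log for this volume
    ls /var/log/ganesha-gfapi.log                # gfapi log for NFS-Ganesha exports
                                                 # (sometimes under /var/log/ganesha/)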
> On Sat, Mar 18, 2017 at 12:14 AM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hi,
>
> I upgraded to Gluster 3.8.10 today and ran the add-brick procedure on a
> volume containing a few VMs. After the rebalance completed, I rebooted
> the VMs; some of them ran just fine, and others just crashed. Windows
> boots to recovery mode, and Linux throws XFS errors and does not boot.
>
> I ran the test again and it happened just as the first time, but I
> noticed that only VMs doing disk IO are affected by this bug. The VMs
> that were powered off started fine, and even the md5 of their disk files
> did not change after the rebalance.
>
> Can anyone else confirm this?
>
> Volume info:
>
> Volume Name: vmware2
> Type: Distributed-Replicate
> Volume ID: 02328d46-a285-4533-aa3a-fb9bfeb688bf
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 22 x 2 = 44
> Transport-type: tcp
> Bricks:
> Brick1: gluster01:/mnt/disk1/vmware2
> Brick2: gluster03:/mnt/disk1/vmware2
> Brick3: gluster02:/mnt/disk1/vmware2
> Brick4: gluster04:/mnt/disk1/vmware2
> Brick5: gluster01:/mnt/disk2/vmware2
> Brick6: gluster03:/mnt/disk2/vmware2
> Brick7: gluster02:/mnt/disk2/vmware2
> Brick8: gluster04:/mnt/disk2/vmware2
> Brick9: gluster01:/mnt/disk3/vmware2
> Brick10: gluster03:/mnt/disk3/vmware2
> Brick11: gluster02:/mnt/disk3/vmware2
> Brick12: gluster04:/mnt/disk3/vmware2
> Brick13: gluster01:/mnt/disk4/vmware2
> Brick14: gluster03:/mnt/disk4/vmware2
> Brick15: gluster02:/mnt/disk4/vmware2
> Brick16: gluster04:/mnt/disk4/vmware2
> Brick17: gluster01:/mnt/disk5/vmware2
> Brick18: gluster03:/mnt/disk5/vmware2
> Brick19: gluster02:/mnt/disk5/vmware2
> Brick20: gluster04:/mnt/disk5/vmware2
> Brick21: gluster01:/mnt/disk6/vmware2
> Brick22: gluster03:/mnt/disk6/vmware2
> Brick23: gluster02:/mnt/disk6/vmware2
> Brick24: gluster04:/mnt/disk6/vmware2
> Brick25: gluster01:/mnt/disk7/vmware2
> Brick26: gluster03:/mnt/disk7/vmware2
> Brick27: gluster02:/mnt/disk7/vmware2
> Brick28: gluster04:/mnt/disk7/vmware2
> Brick29: gluster01:/mnt/disk8/vmware2
> Brick30: gluster03:/mnt/disk8/vmware2
> Brick31: gluster02:/mnt/disk8/vmware2
> Brick32: gluster04:/mnt/disk8/vmware2
> Brick33: gluster01:/mnt/disk9/vmware2
> Brick34: gluster03:/mnt/disk9/vmware2
> Brick35: gluster02:/mnt/disk9/vmware2
> Brick36: gluster04:/mnt/disk9/vmware2
> Brick37: gluster01:/mnt/disk10/vmware2
> Brick38: gluster03:/mnt/disk10/vmware2
> Brick39: gluster02:/mnt/disk10/vmware2
> Brick40: gluster04:/mnt/disk10/vmware2
> Brick41: gluster01:/mnt/disk11/vmware2
> Brick42: gluster03:/mnt/disk11/vmware2
> Brick43: gluster02:/mnt/disk11/vmware2
> Brick44: gluster04:/mnt/disk11/vmware2
> Options Reconfigured:
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> features.shard: on
> cluster.data-self-heal-algorithm: full
> features.cache-invalidation: on
> ganesha.enable: on
> features.shard-block-size: 256MB
> client.event-threads: 2
> server.event-threads: 2
> cluster.favorite-child-policy: size
> storage.build-pgfid: off
> network.ping-timeout: 5
> cluster.enable-shared-storage: enable
> nfs-ganesha: enable
> cluster.server-quorum-ratio: 51%
>
> Adding bricks:
>
> gluster volume add-brick vmware2 replica 2
>   gluster01:/mnt/disk11/vmware2 gluster03:/mnt/disk11/vmware2
>   gluster02:/mnt/disk11/vmware2 gluster04:/mnt/disk11/vmware2
>
> Starting fix-layout:
>
> gluster volume rebalance vmware2 fix-layout start
>
> Starting rebalance:
>
> gluster volume rebalance vmware2 start
>
> --
> Respectfully
> Mahdi A. Mahdi

--
Pranith
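As a side note on the procedure quoted above: once an add-brick and rebalance like this is kicked off, progress and file health are usually watched with the standard status commands. A minimal sketch, assuming the same volume name; this is general gluster CLI usage, not something taken from the thread.

    # Watch rebalance progress and check whether any files still need heal
    # while the rebalance runs.
    gluster volume rebalance vmware2 status
    gluster volume heal vmware2 info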
Serkan Çoban
2017-Apr-27 11:21 UTC
[Gluster-users] Gluster 3.8.10 rebalance VMs corruption
I think this is the fix Gandalf was asking for:
https://github.com/gluster/glusterfs/commit/6e3054b42f9aef1e35b493fbb002ec47e1ba27ce

On Thu, Apr 27, 2017 at 2:03 PM, Pranith Kumar Karampuri
<pkarampu at redhat.com> wrote:

> I am very positive about the two things I told you. These are the latest
> things that happened for VM corruption with rebalance.
>
> On Thu, Apr 27, 2017 at 4:30 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> I think we are talking about a different bug.

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users