Pranith Kumar Karampuri
2017-Apr-27 11:03 UTC
[Gluster-users] Gluster 3.8.10 rebalance VMs corruption
I am very positive about the two things I told you. These are the latest
things that happened for VM corruption with rebalance.

On Thu, Apr 27, 2017 at 4:30 PM, Gandalf Corvotempesta
<gandalf.corvotempesta at gmail.com> wrote:

> I think we are talking about a different bug.
>
> On 27 Apr 2017 at 12:58 PM, "Pranith Kumar Karampuri"
> <pkarampu at redhat.com> wrote:
>
> I am not a DHT developer, so some of what I say could be a little wrong,
> but this is what I gather. I think they found two classes of bugs in DHT:
>
> 1) Graceful fop failover while rebalance is in progress is missing for
> some fops, which leads to VM pauses. I see that
> https://review.gluster.org/17085 was merged on the 24th on master for
> this, and patches are posted for 3.8.x.
>
> 2) I think some work still needs to be done for dht_[f]xattrop. I believe
> this is the next step, and it is underway.
>
> On Thu, Apr 27, 2017 at 12:13 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> Updates on this critical bug?
>
> On 18 Apr 2017 at 8:24 PM, "Gandalf Corvotempesta"
> <gandalf.corvotempesta at gmail.com> wrote:
>
> Any update? In addition, if this is a different bug but the "workflow" is
> the same as the previous one, how is it possible that fixing the previous
> bug triggered this new one? Is it possible to have some details?
>
> 2017-04-04 16:11 GMT+02:00 Krutika Dhananjay <kdhananj at redhat.com>:
>
> Nope. This is a different bug.
>
> -Krutika
>
> On Mon, Apr 3, 2017 at 5:03 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> This is good news. Is this related to the previously fixed bug?
>
> On 3 Apr 2017 at 10:22 AM, "Krutika Dhananjay" <kdhananj at redhat.com>
> wrote:
>
> So Raghavendra has an RCA for this issue. Copy-pasting his comment here:
>
> <RCA>
>
> Following is a rough algorithm of shard_writev:
>
> 1. Based on the offset, calculate the shards touched by the current write.
> 2. Look for the inodes corresponding to these shard files in the itable.
> 3. If one or more inodes are missing from the itable, issue mknod for the
>    corresponding shard files and ignore EEXIST in the cbk.
> 4. Resume writes on the respective shards.
>
> Now, imagine a write which falls on an existing "shard_file". For the
> sake of discussion, let's consider a distribute of three subvols - s1,
> s2, s3:
>
> 1. "shard_file" hashes to subvolume s2 and is present on s2.
> 2. A subvolume s4 is added and a fix-layout is initiated. The layout of
>    ".shard" is fixed to include s4, and the hash ranges change.
> 3. A write that touches "shard_file" is issued.
> 4. The inode for "shard_file" is not present in the itable after a graph
>    switch, so features/shard issues an mknod.
> 5. With the new layout of .shard, let's say "shard_file" hashes to s3 and
>    mknod(shard_file) on s3 succeeds. But the shard_file is already
>    present on s2.
>
> So we have two files on two different subvols of DHT representing the
> same shard, and this will lead to corruption.
>
> </RCA>
>
> Raghavendra will be sending out a patch in DHT to fix this issue.
>
> -Krutika
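As a rough, illustrative aside (not something from the thread itself): the end state described in the RCA, the same shard existing under more than one replica pair, can be spot-checked by counting how many bricks hold each file under .shard. The sketch below assumes the brick layout shown later in this thread (four hosts, replica 2, bricks under /mnt/disk*/vmware2), passwordless SSH to the brick hosts, and GNU find; adjust to your own setup.

    # Illustrative sketch only: with replica 2, every shard file should live
    # on exactly two bricks (one replica pair). Names appearing more than
    # twice across all bricks would match the duplicate-shard condition
    # described in the RCA above.
    for host in gluster01 gluster02 gluster03 gluster04; do
      ssh "$host" 'find /mnt/disk*/vmware2/.shard -maxdepth 1 -type f -printf "%f\n" 2>/dev/null'
    done | sort | uniq -c | awk '$1 > 2 { print $2 }'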
> On Tue, Mar 28, 2017 at 11:49 PM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>> On Mon, Mar 27, 2017 at 11:29 PM, Mahdi Adnan
>> <mahdi.adnan at outlook.com> wrote:
>>
>> Hi,
>>
>> Do you guys have any update regarding this issue?
>
> I do not actively work on this issue, so I do not have an accurate
> update, but from what I heard from Krutika and Raghavendra (who works on
> DHT): Krutika debugged initially and found that the issue seems more
> likely to be in DHT. Satheesaran, who helped us recreate this issue in
> the lab, found that just fix-layout without rebalance also caused the
> corruption 1 out of 3 times. Raghavendra came up with a possible RCA for
> why this can happen. Raghavendra (CCed) would be the right person to
> provide an accurate update.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Tuesday, March 21, 2017 3:02:55 PM
> Cc: Nithya Balachandran; Gowdappa, Raghavendra; Susant Palai;
>     gluster-users at gluster.org List
>
> Hi,
>
> So it looks like Satheesaran managed to recreate this issue. We will be
> seeking his help in debugging this. It will be easier that way.
>
> -Krutika
>
> On Tue, Mar 21, 2017 at 1:35 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hello, and thank you for your email. Actually no, I didn't check the
> GFIDs of the VMs. If it will help, I can set up a new test cluster and
> get all the data you need.
>
> From: Nithya Balachandran
> Sent: Monday, March 20, 20:57
>
> Hi,
>
> Do you know the GFIDs of the VM images which were corrupted?
>
> Regards,
> Nithya
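For reference on the GFID question above: a file's GFID can be read directly from one of its bricks with getfattr. This is a generic illustration with a made-up image path, not a command taken from the thread.

    # Hypothetical example path; run on a brick host against the brick
    # backend, not the client mount. Prints the trusted.gfid xattr in hex.
    getfattr -n trusted.gfid -e hex /mnt/disk1/vmware2/vm01-flat.vmdk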
> On 20 March 2017 at 20:37, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
> I looked at the logs.
>
> From the time the new graph (since the add-brick command you shared,
> where bricks 41 through 44 are added) is switched to (line 3011 onwards
> in nfs-gfapi.log), I see the following kinds of errors:
>
> 1. Lookups to a bunch of files failed with ENOENT on both replicas,
> which protocol/client converts to ESTALE. I am guessing these entries
> got migrated to other subvolumes, leading to 'No such file or directory'
> errors. DHT and thereafter shard get the same error code and log the
> following:
>
> [2017-03-17 14:04:26.353444] E [MSGID: 109040]
>   [dht-helper.c:1198:dht_migration_complete_check_task] 17-vmware2-dht:
>   <gfid:a68ce411-e381-46a3-93cd-d2af6a7c3532>: failed to lookup the file
>   on vmware2-dht [Stale file handle]
> [2017-03-17 14:04:26.353528] E [MSGID: 133014]
>   [shard.c:1253:shard_common_stat_cbk] 17-vmware2-shard: stat failed:
>   a68ce411-e381-46a3-93cd-d2af6a7c3532 [Stale file handle]
>
> which is fine.
>
> 2. The other kind are from AFR logging of possible split-brain, which I
> suppose is harmless too:
>
> [2017-03-17 14:23:36.968883] W [MSGID: 108008]
>   [afr-read-txn.c:228:afr_read_txn] 17-vmware2-replicate-13: Unreadable
>   subvolume -1 found with event generation 2 for gfid
>   74d49288-8452-40d4-893e-ff4672557ff9. (Possible split-brain)
>
> Since you are saying the bug is hit only on VMs that are undergoing IO
> while rebalance is running (as opposed to those that remained powered
> off), rebalance + IO could be causing some issues.
>
> CC'ing the DHT devs.
>
> Raghavendra/Nithya/Susant, could you take a look?
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 4:55 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Thank you for your email, mate.
>
> Yes, I'm aware of this, but to save costs I chose replica 2; this
> cluster is all flash.
>
> In version 3.7.x I had issues with the ping timeout: if one host went
> down for a few seconds, the whole cluster hung and became unavailable,
> so to avoid this I adjusted the ping timeout to 5 seconds.
>
> As for choosing Ganesha over gfapi, VMware does not support Gluster
> (FUSE or gfapi), so I'm stuck with NFS for this volume. The other volume
> is mounted using gfapi in the oVirt cluster.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Sunday, March 19, 2017 2:01:49 PM
>
> While I'm still going through the logs, I just wanted to point out a
> couple of things:
>
> 1. It is recommended that you use 3-way replication (replica count 3)
> for the VM store use case.
>
> 2. network.ping-timeout at 5 seconds is way too low. Please change it
> to 30.
>
> Is there any specific reason for using NFS-Ganesha over gfapi/FUSE?
>
> Will get back with anything else I might find, or more questions if I
> have any.
>
> -Krutika
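The ping-timeout recommendation above maps to a single volume-set call; a minimal sketch using the volume name from this thread (double-check the option name against your own release before applying).

    # Raise network.ping-timeout from 5 back to the suggested 30 seconds,
    # then read the option back to confirm it took effect.
    gluster volume set vmware2 network.ping-timeout 30
    gluster volume get vmware2 network.ping-timeout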
> On Sun, Mar 19, 2017 at 2:36 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Thanks mate,
>
> Kindly check the attachment.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Sunday, March 19, 2017 10:00:22 AM
>
> In that case, could you share the ganesha-gfapi logs?
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 12:13 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> I have two volumes: one is mounted using libgfapi for the oVirt mount,
> and the other is exported via NFS-Ganesha for VMware, which is the one
> I'm testing now.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Sunday, March 19, 2017 8:02:19 AM
>
>> On Sat, Mar 18, 2017 at 10:36 PM, Mahdi Adnan
>> <mahdi.adnan at outlook.com> wrote:
>>
>> Kindly check the attached new log file. I don't know if it's helpful or
>> not, but I couldn't find the log with the name you described.
>
> No. Are you using FUSE or libgfapi for accessing the volume? Or is it
> NFS?
>
> -Krutika
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Saturday, March 18, 2017 6:10:40 PM
>
> mnt-disk11-vmware2.log seems like a brick log. Could you attach the FUSE
> mount logs? They should be right under the /var/log/glusterfs/
> directory, named after the mount point, only hyphenated.
>
> -Krutika
>
> On Sat, Mar 18, 2017 at 7:27 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hello Krutika,
>
> Kindly check the attached logs.
>
> From: Krutika Dhananjay <kdhananj at redhat.com>
> Sent: Saturday, March 18, 2017 3:29:03 PM
>
> Hi Mahdi,
>
> Could you attach the mount, brick and rebalance logs?
>
> -Krutika
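For anyone collecting the logs requested above, the usual default locations on a 3.8-era install look roughly like the following; paths can vary by distribution and by how NFS-Ganesha was set up, so treat this as a hedged guide rather than a definitive map.

    # Typical default locations (verify on your own systems):
    ls /var/log/glusterfs/                       # FUSE mount logs, named after the
                                                 # mount point with '/' replaced by '-'
    ls /var/log/glusterfs/bricks/                # brick logs, e.g. mnt-disk11-vmware2.log
    ls /var/log/glusterfs/vmware2-rebalance.log  # rebalance log for this volume
    ls /var/log/ganesha-gfapi.log                # gfapi log for NFS-Ganesha exports
                                                 # (sometimes under /var/log/ganesha/)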
> On Sat, Mar 18, 2017 at 12:14 AM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hi,
>
> I upgraded to Gluster 3.8.10 today and ran the add-brick procedure on a
> volume containing a few VMs. After the rebalance completed, I rebooted
> the VMs; some of them ran just fine, and others just crashed. Windows
> boots to recovery mode, and Linux throws XFS errors and does not boot.
>
> I ran the test again and it happened just as the first time, but I
> noticed that only VMs doing disk IO are affected by this bug. The VMs
> that were powered off started fine, and even the md5 of their disk files
> did not change after the rebalance.
>
> Can anyone else confirm this?
>
> Volume info:
>
> Volume Name: vmware2
> Type: Distributed-Replicate
> Volume ID: 02328d46-a285-4533-aa3a-fb9bfeb688bf
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 22 x 2 = 44
> Transport-type: tcp
> Bricks:
> Brick1: gluster01:/mnt/disk1/vmware2
> Brick2: gluster03:/mnt/disk1/vmware2
> Brick3: gluster02:/mnt/disk1/vmware2
> Brick4: gluster04:/mnt/disk1/vmware2
> Brick5: gluster01:/mnt/disk2/vmware2
> Brick6: gluster03:/mnt/disk2/vmware2
> Brick7: gluster02:/mnt/disk2/vmware2
> Brick8: gluster04:/mnt/disk2/vmware2
> Brick9: gluster01:/mnt/disk3/vmware2
> Brick10: gluster03:/mnt/disk3/vmware2
> Brick11: gluster02:/mnt/disk3/vmware2
> Brick12: gluster04:/mnt/disk3/vmware2
> Brick13: gluster01:/mnt/disk4/vmware2
> Brick14: gluster03:/mnt/disk4/vmware2
> Brick15: gluster02:/mnt/disk4/vmware2
> Brick16: gluster04:/mnt/disk4/vmware2
> Brick17: gluster01:/mnt/disk5/vmware2
> Brick18: gluster03:/mnt/disk5/vmware2
> Brick19: gluster02:/mnt/disk5/vmware2
> Brick20: gluster04:/mnt/disk5/vmware2
> Brick21: gluster01:/mnt/disk6/vmware2
> Brick22: gluster03:/mnt/disk6/vmware2
> Brick23: gluster02:/mnt/disk6/vmware2
> Brick24: gluster04:/mnt/disk6/vmware2
> Brick25: gluster01:/mnt/disk7/vmware2
> Brick26: gluster03:/mnt/disk7/vmware2
> Brick27: gluster02:/mnt/disk7/vmware2
> Brick28: gluster04:/mnt/disk7/vmware2
> Brick29: gluster01:/mnt/disk8/vmware2
> Brick30: gluster03:/mnt/disk8/vmware2
> Brick31: gluster02:/mnt/disk8/vmware2
> Brick32: gluster04:/mnt/disk8/vmware2
> Brick33: gluster01:/mnt/disk9/vmware2
> Brick34: gluster03:/mnt/disk9/vmware2
> Brick35: gluster02:/mnt/disk9/vmware2
> Brick36: gluster04:/mnt/disk9/vmware2
> Brick37: gluster01:/mnt/disk10/vmware2
> Brick38: gluster03:/mnt/disk10/vmware2
> Brick39: gluster02:/mnt/disk10/vmware2
> Brick40: gluster04:/mnt/disk10/vmware2
> Brick41: gluster01:/mnt/disk11/vmware2
> Brick42: gluster03:/mnt/disk11/vmware2
> Brick43: gluster02:/mnt/disk11/vmware2
> Brick44: gluster04:/mnt/disk11/vmware2
> Options Reconfigured:
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> features.shard: on
> cluster.data-self-heal-algorithm: full
> features.cache-invalidation: on
> ganesha.enable: on
> features.shard-block-size: 256MB
> client.event-threads: 2
> server.event-threads: 2
> cluster.favorite-child-policy: size
> storage.build-pgfid: off
> network.ping-timeout: 5
> cluster.enable-shared-storage: enable
> nfs-ganesha: enable
> cluster.server-quorum-ratio: 51%
>
> Adding bricks:
>
> gluster volume add-brick vmware2 replica 2
>   gluster01:/mnt/disk11/vmware2 gluster03:/mnt/disk11/vmware2
>   gluster02:/mnt/disk11/vmware2 gluster04:/mnt/disk11/vmware2
>
> Starting fix-layout:
>
> gluster volume rebalance vmware2 fix-layout start
>
> Starting rebalance:
>
> gluster volume rebalance vmware2 start
>
> --
> Respectfully
> Mahdi A. Mahdi

--
Pranith
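As a side note on the procedure quoted above: once an add-brick and rebalance like this is kicked off, progress and file health are usually watched with the standard status commands. A minimal sketch, assuming the same volume name; this is general gluster CLI usage, not something taken from the thread.

    # Watch rebalance progress and check whether any files still need heal
    # while the rebalance runs.
    gluster volume rebalance vmware2 status
    gluster volume heal vmware2 info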
Serkan Çoban
2017-Apr-27 11:21 UTC
[Gluster-users] Gluster 3.8.10 rebalance VMs corruption
I think this is the fix Gandalf was asking for:
https://github.com/gluster/glusterfs/commit/6e3054b42f9aef1e35b493fbb002ec47e1ba27ce

On Thu, Apr 27, 2017 at 2:03 PM, Pranith Kumar Karampuri
<pkarampu at redhat.com> wrote:

> I am very positive about the two things I told you. These are the latest
> things that happened for VM corruption with rebalance.
>
> On Thu, Apr 27, 2017 at 4:30 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> I think we are talking about a different bug.

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users