If you are trying this again, please run `gluster volume set $volname
client-log-level DEBUG` before attempting the add-brick, and attach the
gvol0-add-brick-mount.log here. After that, you can change the
client-log-level back to INFO.
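For example, the whole sequence might look like this (a sketch; `gvol0` stands in for your actual volume name):

```shell
# Raise the client log level so the add-brick mount logs at DEBUG
gluster volume set gvol0 client-log-level DEBUG

# ...re-run the failing add-brick and save gvol0-add-brick-mount.log...

# Restore the default verbosity afterwards
gluster volume set gvol0 client-log-level INFO
```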
-Ravi
On 22/05/19 11:32 AM, Ravishankar N wrote:
>
> On 22/05/19 11:23 AM, David Cunningham wrote:
>> Hi Ravi,
>>
>> I'd already done exactly that before, where step 3 was a simple
>> 'rm -rf /nodirectwritedata/gluster/gvol0'. Have you another
>> suggestion on what the cleanup or reformat should be?
> `rm -rf /nodirectwritedata/gluster/gvol0` does look okay to me, David.
> Basically, '/nodirectwritedata/gluster/gvol0' must be empty and must
> not have any extended attributes set on it. Why fuse_first_lookup() is
> failing is a bit of a mystery to me at this point. :-(
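>
> A quick way to double-check the brick directory before re-adding it
> might be this (a sketch; assumes the getfattr utility is installed;
> run on gfs3 after the cleanup):
>
> ```shell
> # Should print nothing: no files, including hidden ones
> ls -A /nodirectwritedata/gluster/gvol0
> # Should print no trusted.* keys: no stale gluster xattrs
> getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
> ```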
> Regards,
> Ravi
>>
>> Thank you.
>>
>>
>> On Wed, 22 May 2019 at 13:56, Ravishankar N <ravishankar at redhat.com
>> <mailto:ravishankar at redhat.com>> wrote:
>>
>> Hmm, so the volume info seems to indicate that the add-brick was
>> successful but the gfid xattr is missing on the new brick (as are
>> the actual files, barring the .glusterfs folder, according to
>> your previous mail).
>>
>> Do you want to try removing and adding it again?
>>
>> 1. `gluster volume remove-brick gvol0 replica 2
>> gfs3:/nodirectwritedata/gluster/gvol0 force` from gfs1
>>
>> 2. Check that gluster volume info is now back to a 1x2 volume on
>> all nodes and `gluster peer status` is connected on all nodes.
>>
>> 3. Clean up or reformat '/nodirectwritedata/gluster/gvol0' on gfs3.
>>
>> 4. `gluster volume add-brick gvol0 replica 3 arbiter 1
>> gfs3:/nodirectwritedata/gluster/gvol0` from gfs1.
>>
>> 5. Check that the files are getting healed on to the new brick.
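>>
>> As a shell sketch of steps 1-4 (using the hostnames and paths from
>> this thread; check each command's output before moving on):
>>
>> ```shell
>> # Step 1, on gfs1: drop the arbiter brick back out of the volume
>> gluster volume remove-brick gvol0 replica 2 \
>>     gfs3:/nodirectwritedata/gluster/gvol0 force
>> # Step 2: confirm a 1x2 volume and connected peers on every node
>> gluster volume info gvol0
>> gluster peer status
>> # Step 3, on gfs3: recreate the brick directory so it is empty
>> rm -rf /nodirectwritedata/gluster/gvol0
>> mkdir -p /nodirectwritedata/gluster/gvol0
>> # Step 4, on gfs1: re-add gfs3 as the arbiter
>> gluster volume add-brick gvol0 replica 3 arbiter 1 \
>>     gfs3:/nodirectwritedata/gluster/gvol0
>> ```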
>>
>> Thanks,
>> Ravi
>> On 22/05/19 6:50 AM, David Cunningham wrote:
>>> Hi Ravi,
>>>
>>> Certainly. On the existing two nodes:
>>>
>>> gfs1 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: nodirectwritedata/gluster/gvol0
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.afr.gvol0-client-2=0x000000000000000000000000
>>> trusted.gfid=0x00000000000000000000000000000001
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>> trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
>>>
>>> gfs2 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: nodirectwritedata/gluster/gvol0
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.afr.gvol0-client-0=0x000000000000000000000000
>>> trusted.afr.gvol0-client-2=0x000000000000000000000000
>>> trusted.gfid=0x00000000000000000000000000000001
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>> trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
>>>
>>> On the new node:
>>>
>>> gfs3 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: nodirectwritedata/gluster/gvol0
>>> trusted.afr.dirty=0x000000000000000000000001
>>> trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
>>>
>>> Output of "gluster volume info" is the same on all 3 nodes and is:
>>>
>>> # gluster volume info
>>>
>>> Volume Name: gvol0
>>> Type: Replicate
>>> Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs1:/nodirectwritedata/gluster/gvol0
>>> Brick2: gfs2:/nodirectwritedata/gluster/gvol0
>>> Brick3: gfs3:/nodirectwritedata/gluster/gvol0 (arbiter)
>>> Options Reconfigured:
>>> performance.client-io-threads: off
>>> nfs.disable: on
>>> transport.address-family: inet
>>>
>>>
>>> On Wed, 22 May 2019 at 12:43, Ravishankar N
>>> <ravishankar at redhat.com <mailto:ravishankar at redhat.com>> wrote:
>>>
>>> Hi David,
>>> Could you provide the `getfattr -d -m. -e hex
>>> /nodirectwritedata/gluster/gvol0` output of all bricks and
>>> the output of `gluster volume info`?
>>>
>>> Thanks,
>>> Ravi
>>> On 22/05/19 4:57 AM, David Cunningham wrote:
>>>> Hi Sanju,
>>>>
>>>> Here's what glusterd.log says on the new arbiter server
>>>> when trying to add the node:
>>>>
>>>> [2019-05-22 00:15:05.963059] I [run.c:242:runner_log]
>>>> (-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0x3b2cd)
>>>> [0x7fe4ca9102cd]
>>>> -->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0xe6b85)
>>>> [0x7fe4ca9bbb85]
>>>> -->/lib64/libglusterfs.so.0(runner_log+0x115)
>>>> [0x7fe4d5ecc955] ) 0-management: Ran script:
>>>> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
>>>> --volname=gvol0 --version=1 --volume-op=add-brick
>>>> --gd-workdir=/var/lib/glusterd
>>>> [2019-05-22 00:15:05.963177] I [MSGID: 106578]
>>>> [glusterd-brick-ops.c:1355:glusterd_op_perform_add_bricks]
>>>> 0-management: replica-count is set 3
>>>> [2019-05-22 00:15:05.963228] I [MSGID: 106578]
>>>> [glusterd-brick-ops.c:1360:glusterd_op_perform_add_bricks]
>>>> 0-management: arbiter-count is set 1
>>>> [2019-05-22 00:15:05.963257] I [MSGID: 106578]
>>>> [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks]
>>>> 0-management: type is set 0, need to change it
>>>> [2019-05-22 00:15:17.015268] E [MSGID: 106053]
>>>> [glusterd-utils.c:13942:glusterd_handle_replicate_brick_ops]
>>>> 0-management: Failed to set extended attribute
>>>> trusted.add-brick : Transport endpoint is not connected
>>>> [Transport endpoint is not connected]
>>>> [2019-05-22 00:15:17.036479] E [MSGID: 106073]
>>>> [glusterd-brick-ops.c:2595:glusterd_op_add_brick]
>>>> 0-glusterd: Unable to add bricks
>>>> [2019-05-22 00:15:17.036595] E [MSGID: 106122]
>>>> [glusterd-mgmt.c:299:gd_mgmt_v3_commit_fn] 0-management:
>>>> Add-brick commit failed.
>>>> [2019-05-22 00:15:17.036710] E [MSGID: 106122]
>>>> [glusterd-mgmt-handler.c:594:glusterd_handle_commit_fn]
>>>> 0-management: commit failed on operation Add brick
>>>>
>>>> As before gvol0-add-brick-mount.log said:
>>>>
>>>> [2019-05-22 00:15:17.005695] I
>>>> [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE
>>>> inited with protocol versions: glusterfs 7.24 kernel 7.22
>>>> [2019-05-22 00:15:17.005749] I
>>>> [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to
>>>> graph 0
>>>> [2019-05-22 00:15:17.010101] E
>>>> [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup
>>>> on root failed (Transport endpoint is not connected)
>>>> [2019-05-22 00:15:17.014217] W
>>>> [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 2:
>>>> LOOKUP() / => -1 (Transport endpoint is not connected)
>>>> [2019-05-22 00:15:17.015097] W
>>>> [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse:
>>>> 00000000-0000-0000-0000-000000000001: failed to resolve
>>>> (Transport endpoint is not connected)
>>>> [2019-05-22 00:15:17.015158] W
>>>> [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse:
>>>> 3: SETXATTR 00000000-0000-0000-0000-000000000001/1
>>>> (trusted.add-brick) resolution failed
>>>> [2019-05-22 00:15:17.035636] I
>>>> [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating
>>>> unmount of /tmp/mntYGNbj9
>>>> [2019-05-22 00:15:17.035854] W
>>>> [glusterfsd.c:1500:cleanup_and_exit]
>>>> (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f7745ccedd5]
>>>> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5)
>>>> [0x55c81b63de75]
>>>> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b)
>>>> [0x55c81b63dceb] ) 0-: received signum (15), shutting down
>>>> [2019-05-22 00:15:17.035942] I [fuse-bridge.c:5914:fini]
>>>> 0-fuse: Unmounting '/tmp/mntYGNbj9'.
>>>> [2019-05-22 00:15:17.035966] I [fuse-bridge.c:5919:fini]
>>>> 0-fuse: Closing fuse connection to '/tmp/mntYGNbj9'.
>>>>
>>>> Here are the processes running on the new arbiter server:
>>>> # ps -ef | grep gluster
>>>> root      3466     1  0 20:13 ?        00:00:00
>>>> /usr/sbin/glusterfs -s localhost --volfile-id
>>>> gluster/glustershd -p
>>>> /var/run/gluster/glustershd/glustershd.pid -l
>>>> /var/log/glusterfs/glustershd.log -S
>>>> /var/run/gluster/24c12b09f93eec8e.socket --xlator-option
>>>> *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412
>>>> --process-name glustershd
>>>> root      6832     1  0 May16 ?        00:02:10
>>>> /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
>>>> root     17841     1  0 May16 ?        00:00:58
>>>> /usr/sbin/glusterfs --process-name fuse
>>>> --volfile-server=gfs1 --volfile-id=/gvol0 /mnt/glusterfs
>>>>
>>>> Here are the files created on the new arbiter server:
>>>> # find /nodirectwritedata/gluster/gvol0 | xargs ls -ald
>>>> drwxr-xr-x 3 root root 4096 May 21 20:15
>>>> /nodirectwritedata/gluster/gvol0
>>>> drw------- 2 root root 4096 May 21 20:15
>>>> /nodirectwritedata/gluster/gvol0/.glusterfs
>>>>
>>>> Thank you for your help!
>>>>
>>>>
>>>> On Tue, 21 May 2019 at 00:10, Sanju Rakonde
>>>> <srakonde at redhat.com <mailto:srakonde at redhat.com>> wrote:
>>>>
>>>> David,
>>>>
>>>> can you please attach the glusterd logs? As the error
>>>> message says, the commit failed on the arbiter node, so we
>>>> might be able to find some issue on that node.
>>>>
>>>> On Mon, May 20, 2019 at 10:10 AM Nithya Balachandran
>>>> <nbalacha at redhat.com <mailto:nbalacha at redhat.com>> wrote:
>>>>
>>>>
>>>>
>>>> On Fri, 17 May 2019 at 06:01, David Cunningham
>>>> <dcunningham at voisonics.com
>>>> <mailto:dcunningham at voisonics.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> We're adding an arbiter node to an existing
>>>> volume and having an issue. Can anyone help?
>>>> The root cause error appears to be
>>>> "00000000-0000-0000-0000-000000000001: failed
>>>> to resolve (Transport endpoint is not
>>>> connected)", as below.
>>>>
>>>> We are running glusterfs 5.6.1. Thanks in
>>>> advance for any assistance!
>>>>
>>>> On existing node gfs1, trying to add new
>>>> arbiter node gfs3:
>>>>
>>>> # gluster volume add-brick gvol0 replica 3
>>>> arbiter 1 gfs3:/nodirectwritedata/gluster/gvol0
>>>> volume add-brick: failed: Commit failed on
>>>> gfs3. Please check log file for details.
>>>>
>>>>
>>>> This looks like a glusterd issue. Please check the
>>>> glusterd logs for more info.
>>>> Adding the glusterd dev to this thread. Sanju, can
>>>> you take a look?
>>>> Regards,
>>>> Nithya
>>>> Regards,
>>>> Nithya
>>>>
>>>>
>>>> On new node gfs3 in gvol0-add-brick-mount.log:
>>>>
>>>> [2019-05-17 01:20:22.689721] I
>>>> [fuse-bridge.c:4267:fuse_init]
>>>> 0-glusterfs-fuse: FUSE inited with protocol
>>>> versions: glusterfs 7.24 kernel 7.22
>>>> [2019-05-17 01:20:22.689778] I
>>>> [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse:
>>>> switched to graph 0
>>>> [2019-05-17 01:20:22.694897] E
>>>> [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse:
>>>> first lookup on root failed (Transport endpoint
>>>> is not connected)
>>>> [2019-05-17 01:20:22.699770] W
>>>> [fuse-resolve.c:127:fuse_resolve_gfid_cbk]
>>>> 0-fuse: 00000000-0000-0000-0000-000000000001:
>>>> failed to resolve (Transport endpoint is not
>>>> connected)
>>>> [2019-05-17 01:20:22.699834] W
>>>> [fuse-bridge.c:3294:fuse_setxattr_resume]
>>>> 0-glusterfs-fuse: 2: SETXATTR
>>>> 00000000-0000-0000-0000-000000000001/1
>>>> (trusted.add-brick) resolution failed
>>>> [2019-05-17 01:20:22.715656] I
>>>> [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse:
>>>> initating unmount of /tmp/mntQAtu3f
>>>> [2019-05-17 01:20:22.715865] W
>>>> [glusterfsd.c:1500:cleanup_and_exit]
>>>> (-->/lib64/libpthread.so.0(+0x7dd5)
>>>> [0x7fb223bf6dd5]
>>>> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5)
>>>> [0x560886581e75]
>>>> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b)
>>>> [0x560886581ceb] ) 0-: received signum (15),
>>>> shutting down
>>>> [2019-05-17 01:20:22.715926] I
>>>> [fuse-bridge.c:5914:fini] 0-fuse: Unmounting
>>>> '/tmp/mntQAtu3f'.
>>>> [2019-05-17 01:20:22.715953] I
>>>> [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse
>>>> connection to '/tmp/mntQAtu3f'.
>>>>
>>>> Processes running on new node gfs3:
>>>>
>>>> # ps -ef | grep gluster
>>>> root      6832     1  0 20:17 ?        00:00:00
>>>> /usr/sbin/glusterd -p /var/run/glusterd.pid
>>>> --log-level INFO
>>>> root     15799     1  0 20:17 ?        00:00:00
>>>> /usr/sbin/glusterfs -s localhost --volfile-id
>>>> gluster/glustershd -p
>>>> /var/run/gluster/glustershd/glustershd.pid -l
>>>> /var/log/glusterfs/glustershd.log -S
>>>> /var/run/gluster/24c12b09f93eec8e.socket
>>>> --xlator-option
>>>> *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412
>>>> --process-name glustershd
>>>> root     16856 16735  0 21:21 pts/0    00:00:00
>>>> grep --color=auto gluster
>>>>
>>>> --
>>>> David Cunningham, Voisonics Limited
>>>> http://voisonics.com/
>>>> USA: +1 213 221 1092
>>>> New Zealand: +64 (0)28 2558 3782
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> <mailto:Gluster-users at gluster.org>
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Sanju
>>>>
>>>>
>>>>
>>>> --
>>>> David Cunningham, Voisonics Limited
>>>> http://voisonics.com/
>>>> USA: +1 213 221 1092
>>>> New Zealand: +64 (0)28 2558 3782
>>>>
>>>
>>>
>>>
>>> --
>>> David Cunningham, Voisonics Limited
>>> http://voisonics.com/
>>> USA: +1 213 221 1092
>>> New Zealand: +64 (0)28 2558 3782
>>
>>
>>
>> --
>> David Cunningham, Voisonics Limited
>> http://voisonics.com/
>> USA: +1 213 221 1092
>> New Zealand: +64 (0)28 2558 3782