Hi Sanju,
Here's what glusterd.log says on the new arbiter server when trying to add
the node:
[2019-05-22 00:15:05.963059] I [run.c:242:runner_log]
(-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0x3b2cd)
[0x7fe4ca9102cd]
-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0xe6b85)
[0x7fe4ca9bbb85] -->/lib64/libglusterfs.so.0(runner_log+0x115)
[0x7fe4d5ecc955] ) 0-management: Ran script:
/var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
--volname=gvol0 --version=1 --volume-op=add-brick
--gd-workdir=/var/lib/glusterd
[2019-05-22 00:15:05.963177] I [MSGID: 106578]
[glusterd-brick-ops.c:1355:glusterd_op_perform_add_bricks] 0-management:
replica-count is set 3
[2019-05-22 00:15:05.963228] I [MSGID: 106578]
[glusterd-brick-ops.c:1360:glusterd_op_perform_add_bricks] 0-management:
arbiter-count is set 1
[2019-05-22 00:15:05.963257] I [MSGID: 106578]
[glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management:
type is set 0, need to change it
[2019-05-22 00:15:17.015268] E [MSGID: 106053]
[glusterd-utils.c:13942:glusterd_handle_replicate_brick_ops] 0-management:
Failed to set extended attribute trusted.add-brick : Transport endpoint is
not connected [Transport endpoint is not connected]
[2019-05-22 00:15:17.036479] E [MSGID: 106073]
[glusterd-brick-ops.c:2595:glusterd_op_add_brick] 0-glusterd: Unable to add
bricks
[2019-05-22 00:15:17.036595] E [MSGID: 106122]
[glusterd-mgmt.c:299:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
failed.
[2019-05-22 00:15:17.036710] E [MSGID: 106122]
[glusterd-mgmt-handler.c:594:glusterd_handle_commit_fn] 0-management:
commit failed on operation Add brick
As before, gvol0-add-brick-mount.log said:
[2019-05-22 00:15:17.005695] I [fuse-bridge.c:4267:fuse_init]
0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel
7.22
[2019-05-22 00:15:17.005749] I [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse:
switched to graph 0
[2019-05-22 00:15:17.010101] E [fuse-bridge.c:4336:fuse_first_lookup]
0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-05-22 00:15:17.014217] W [fuse-bridge.c:897:fuse_attr_cbk]
0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2019-05-22 00:15:17.015097] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk]
0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport
endpoint is not connected)
[2019-05-22 00:15:17.015158] W [fuse-bridge.c:3294:fuse_setxattr_resume]
0-glusterfs-fuse: 3: SETXATTR 00000000-0000-0000-0000-000000000001/1
(trusted.add-brick) resolution failed
[2019-05-22 00:15:17.035636] I [fuse-bridge.c:5144:fuse_thread_proc]
0-fuse: initating unmount of /tmp/mntYGNbj9
[2019-05-22 00:15:17.035854] W [glusterfsd.c:1500:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7dd5) [0x7f7745ccedd5]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55c81b63de75]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55c81b63dceb] ) 0-:
received signum (15), shutting down
[2019-05-22 00:15:17.035942] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting
'/tmp/mntYGNbj9'.
[2019-05-22 00:15:17.035966] I [fuse-bridge.c:5919:fini] 0-fuse: Closing
fuse connection to '/tmp/mntYGNbj9'.
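In case it is useful to anyone following the thread, here is a minimal sketch for pulling only the error and warning entries out of logs like the two above. It is not part of Gluster: the names ENTRY_START, iter_entries and problems are made up for illustration, and the pattern assumes every entry starts with "[timestamp] LEVEL " as in these excerpts. Wrapped continuation lines (as pasted in this mail) are joined back onto their entry.

```python
import re

# Assumed entry format, taken from the excerpts above: "[timestamp] LEVEL rest"
ENTRY_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] ([A-Z]) (.*)$"
)


def iter_entries(lines):
    """Yield (timestamp, level, message), re-joining wrapped continuation lines."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins, so emit the one collected so far.
            if current is not None:
                yield tuple(current)
            current = [match.group(1), match.group(2), match.group(3)]
        elif current is not None and line.strip():
            # Not a new entry: treat it as a wrapped continuation of the last one.
            current[2] += " " + line.strip()
    if current is not None:
        yield tuple(current)


def problems(lines, levels=("E", "W")):
    """Return only the entries whose level letter is in `levels`."""
    return [entry for entry in iter_entries(lines) if entry[1] in levels]


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py /path/to/logfile
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for timestamp, level, message in problems(handle):
            print(timestamp, level, message)
```

Run against the add-brick mount log, this should leave just the lookup, resolve and SETXATTR failures plus the shutdown warning, which makes the "Transport endpoint is not connected" sequence easier to see.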
Here are the processes running on the new arbiter server:
# ps -ef | grep gluster
root 3466 1 0 20:13 ? 00:00:00 /usr/sbin/glusterfs -s
localhost --volfile-id gluster/glustershd -p
/var/run/gluster/glustershd/glustershd.pid -l
/var/log/glusterfs/glustershd.log -S
/var/run/gluster/24c12b09f93eec8e.socket --xlator-option
*replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412 --process-name
glustershd
root 6832 1 0 May16 ? 00:02:10 /usr/sbin/glusterd -p
/var/run/glusterd.pid --log-level INFO
root 17841 1 0 May16 ? 00:00:58 /usr/sbin/glusterfs
--process-name fuse --volfile-server=gfs1 --volfile-id=/gvol0 /mnt/glusterfs
Here are the files created on the new arbiter server:
# find /nodirectwritedata/gluster/gvol0 | xargs ls -ald
drwxr-xr-x 3 root root 4096 May 21 20:15 /nodirectwritedata/gluster/gvol0
drw------- 2 root root 4096 May 21 20:15
/nodirectwritedata/gluster/gvol0/.glusterfs
Thank you for your help!
On Tue, 21 May 2019 at 00:10, Sanju Rakonde <srakonde at redhat.com>
wrote:
> David,
>
> Can you please attach the glusterd logs? Since the error message says the
> commit failed on the arbiter node, we might be able to find the issue on
> that node.
>
> On Mon, May 20, 2019 at 10:10 AM Nithya Balachandran <nbalacha at redhat.com>
> wrote:
>
>>
>>
>> On Fri, 17 May 2019 at 06:01, David Cunningham <dcunningham at voisonics.com>
>> wrote:
>>
>>> Hello,
>>>
>>> We're adding an arbiter node to an existing volume and having an issue.
>>> Can anyone help? The root cause error appears to be
>>> "00000000-0000-0000-0000-000000000001: failed to resolve (Transport
>>> endpoint is not connected)", as below.
>>>
>>> We are running glusterfs 5.6.1. Thanks in advance for any assistance!
>>>
>>> On existing node gfs1, trying to add new arbiter node gfs3:
>>>
>>> # gluster volume add-brick gvol0 replica 3 arbiter 1
>>> gfs3:/nodirectwritedata/gluster/gvol0
>>> volume add-brick: failed: Commit failed on gfs3. Please check log file
>>> for details.
>>>
>>
>> This looks like a glusterd issue. Please check the glusterd logs for more
>> info.
>> Adding the glusterd dev to this thread. Sanju, can you take a look?
>>
>> Regards,
>> Nithya
>>
>>>
>>> On new node gfs3 in gvol0-add-brick-mount.log:
>>>
>>> [2019-05-17 01:20:22.689721] I [fuse-bridge.c:4267:fuse_init]
>>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel
>>> 7.22
>>> [2019-05-17 01:20:22.689778] I [fuse-bridge.c:4878:fuse_graph_sync]
>>> 0-fuse: switched to graph 0
>>> [2019-05-17 01:20:22.694897] E [fuse-bridge.c:4336:fuse_first_lookup]
>>> 0-fuse: first lookup on root failed (Transport endpoint is not connected)
>>> [2019-05-17 01:20:22.699770] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk]
>>> 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport
>>> endpoint is not connected)
>>> [2019-05-17 01:20:22.699834] W [fuse-bridge.c:3294:fuse_setxattr_resume]
>>> 0-glusterfs-fuse: 2: SETXATTR 00000000-0000-0000-0000-000000000001/1
>>> (trusted.add-brick) resolution failed
>>> [2019-05-17 01:20:22.715656] I [fuse-bridge.c:5144:fuse_thread_proc]
>>> 0-fuse: initating unmount of /tmp/mntQAtu3f
>>> [2019-05-17 01:20:22.715865] W [glusterfsd.c:1500:cleanup_and_exit]
>>> (-->/lib64/libpthread.so.0(+0x7dd5) [0x7fb223bf6dd5]
>>> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x560886581e75]
>>> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x560886581ceb] ) 0-:
>>> received signum (15), shutting down
>>> [2019-05-17 01:20:22.715926] I [fuse-bridge.c:5914:fini] 0-fuse:
>>> Unmounting '/tmp/mntQAtu3f'.
>>> [2019-05-17 01:20:22.715953] I [fuse-bridge.c:5919:fini] 0-fuse: Closing
>>> fuse connection to '/tmp/mntQAtu3f'.
>>>
>>> Processes running on new node gfs3:
>>>
>>> # ps -ef | grep gluster
>>> root 6832 1 0 20:17 ? 00:00:00 /usr/sbin/glusterd -p
>>> /var/run/glusterd.pid --log-level INFO
>>> root 15799 1 0 20:17 ? 00:00:00 /usr/sbin/glusterfs -s
>>> localhost --volfile-id gluster/glustershd -p
>>> /var/run/gluster/glustershd/glustershd.pid -l
>>> /var/log/glusterfs/glustershd.log -S
>>> /var/run/gluster/24c12b09f93eec8e.socket --xlator-option
>>> *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412 --process-name
>>> glustershd
>>> root 16856 16735 0 21:21 pts/0 00:00:00 grep --color=auto gluster
>>>
>>> --
>>> David Cunningham, Voisonics Limited
>>> http://voisonics.com/
>>> USA: +1 213 221 1092
>>> New Zealand: +64 (0)28 2558 3782
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>
> --
> Thanks,
> Sanju
>
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782