Hmm, so the volume info seems to indicate that the add-brick was
successful but the gfid xattr is missing on the new brick (as are the
actual files, barring the .glusterfs folder, according to your previous
mail).
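
(If you want to double-check that, `getfattr -n trusted.gfid -e hex
/nodirectwritedata/gluster/gvol0` on each node should print the ...0001
gfid on gfs1 and gfs2, as in your output below, but complain that the
attribute does not exist on gfs3.)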
Do you want to try removing and adding it again?
1. `gluster volume remove-brick gvol0 replica 2
gfs3:/nodirectwritedata/gluster/gvol0 force` from gfs1.
2. Check that `gluster volume info` is now back to a 1x2 volume on all
nodes and that `gluster peer status` shows all peers connected on all nodes.
3. Clean up or reformat /nodirectwritedata/gluster/gvol0 on gfs3 (a
sketch of the cleanup is after this list).
4. `gluster volume add-brick gvol0 replica 3 arbiter 1
gfs3:/nodirectwritedata/gluster/gvol0` from gfs1.
5. Check that the files are getting healed onto the new brick (see the
heal command after this list).
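
To spell out step 3: if you'd rather wipe the brick than reformat it, a
rough sketch (only after the remove-brick in step 1 succeeds, and please
double-check the path before running it on gfs3) would be:

# rm -rf /nodirectwritedata/gluster/gvol0/.glusterfs
# setfattr -x trusted.glusterfs.volume-id /nodirectwritedata/gluster/gvol0
# setfattr -x trusted.afr.dirty /nodirectwritedata/gluster/gvol0

i.e. remove the .glusterfs folder and the leftover trusted.* xattrs that
show up in your getfattr output for gfs3, so the brick looks like a fresh
directory to add-brick.

For step 5, you can keep an eye on the heal with:

# gluster volume heal gvol0 info

which should eventually show zero entries for all three bricks once
everything has been healed onto the arbiter brick.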
Thanks,
Ravi
On 22/05/19 6:50 AM, David Cunningham wrote:
> Hi Ravi,
>
> Certainly. On the existing two nodes:
>
> gfs1 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
> getfattr: Removing leading '/' from absolute path names
> # file: nodirectwritedata/gluster/gvol0
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gvol0-client-2=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
>
> gfs2 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
> getfattr: Removing leading '/' from absolute path names
> # file: nodirectwritedata/gluster/gvol0
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.gvol0-client-0=0x000000000000000000000000
> trusted.afr.gvol0-client-2=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
>
> On the new node:
>
> gfs3 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
> getfattr: Removing leading '/' from absolute path names
> # file: nodirectwritedata/gluster/gvol0
> trusted.afr.dirty=0x000000000000000000000001
> trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
>
> Output of "gluster volume info" is the same on all 3 nodes and
is:
>
> # gluster volume info
>
> Volume Name: gvol0
> Type: Replicate
> Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gfs1:/nodirectwritedata/gluster/gvol0
> Brick2: gfs2:/nodirectwritedata/gluster/gvol0
> Brick3: gfs3:/nodirectwritedata/gluster/gvol0 (arbiter)
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
>
>
> On Wed, 22 May 2019 at 12:43, Ravishankar N <ravishankar at redhat.com> wrote:
>
> Hi David,
> Could you provide the `getfattr -d -m. -e hex
> /nodirectwritedata/gluster/gvol0` output of all bricks and the
> output of `gluster volume info`?
>
> Thanks,
> Ravi
> On 22/05/19 4:57 AM, David Cunningham wrote:
>> Hi Sanju,
>>
>> Here's what glusterd.log says on the new arbiter server when
>> trying to add the node:
>>
>> [2019-05-22 00:15:05.963059] I [run.c:242:runner_log]
>> (-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0x3b2cd)
>> [0x7fe4ca9102cd]
>> -->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0xe6b85)
>> [0x7fe4ca9bbb85] -->/lib64/libglusterfs.so.0(runner_log+0x115)
>> [0x7fe4d5ecc955] ) 0-management: Ran script:
>> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
>> --volname=gvol0 --version=1 --volume-op=add-brick
>> --gd-workdir=/var/lib/glusterd
>> [2019-05-22 00:15:05.963177] I [MSGID: 106578]
>> [glusterd-brick-ops.c:1355:glusterd_op_perform_add_bricks]
>> 0-management: replica-count is set 3
>> [2019-05-22 00:15:05.963228] I [MSGID: 106578]
>> [glusterd-brick-ops.c:1360:glusterd_op_perform_add_bricks]
>> 0-management: arbiter-count is set 1
>> [2019-05-22 00:15:05.963257] I [MSGID: 106578]
>> [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks]
>> 0-management: type is set 0, need to change it
>> [2019-05-22 00:15:17.015268] E [MSGID: 106053]
>> [glusterd-utils.c:13942:glusterd_handle_replicate_brick_ops]
>> 0-management: Failed to set extended attribute trusted.add-brick
>> : Transport endpoint is not connected [Transport endpoint is not
>> connected]
>> [2019-05-22 00:15:17.036479] E [MSGID: 106073]
>> [glusterd-brick-ops.c:2595:glusterd_op_add_brick] 0-glusterd:
>> Unable to add bricks
>> [2019-05-22 00:15:17.036595] E [MSGID: 106122]
>> [glusterd-mgmt.c:299:gd_mgmt_v3_commit_fn] 0-management:
>> Add-brick commit failed.
>> [2019-05-22 00:15:17.036710] E [MSGID: 106122]
>> [glusterd-mgmt-handler.c:594:glusterd_handle_commit_fn]
>> 0-management: commit failed on operation Add brick
>>
>> As before gvol0-add-brick-mount.log said:
>>
>> [2019-05-22 00:15:17.005695] I [fuse-bridge.c:4267:fuse_init]
>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs
>> 7.24 kernel 7.22
>> [2019-05-22 00:15:17.005749] I
>> [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to graph 0
>> [2019-05-22 00:15:17.010101] E
>> [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup on
>> root failed (Transport endpoint is not connected)
>> [2019-05-22 00:15:17.014217] W [fuse-bridge.c:897:fuse_attr_cbk]
>> 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not
>> connected)
>> [2019-05-22 00:15:17.015097] W
>> [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse:
>> 00000000-0000-0000-0000-000000000001: failed to resolve
>> (Transport endpoint is not connected)
>> [2019-05-22 00:15:17.015158] W
>> [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse: 3:
>> SETXATTR 00000000-0000-0000-0000-000000000001/1
>> (trusted.add-brick) resolution failed
>> [2019-05-22 00:15:17.035636] I
>> [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating unmount
>> of /tmp/mntYGNbj9
>> [2019-05-22 00:15:17.035854] W
>> [glusterfsd.c:1500:cleanup_and_exit]
>> (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f7745ccedd5]
>> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55c81b63de75]
>> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55c81b63dceb] )
>> 0-: received signum (15), shutting down
>> [2019-05-22 00:15:17.035942] I [fuse-bridge.c:5914:fini] 0-fuse:
>> Unmounting '/tmp/mntYGNbj9'.
>> [2019-05-22 00:15:17.035966] I [fuse-bridge.c:5919:fini] 0-fuse:
>> Closing fuse connection to '/tmp/mntYGNbj9'.
>>
>> Here are the processes running on the new arbiter server:
>> # ps -ef | grep gluster
>> root      3466     1  0 20:13 ?        00:00:00 /usr/sbin/glusterfs -s
>> localhost --volfile-id gluster/glustershd -p
>> /var/run/gluster/glustershd/glustershd.pid -l
>> /var/log/glusterfs/glustershd.log -S
>> /var/run/gluster/24c12b09f93eec8e.socket --xlator-option
>> *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412
>> --process-name glustershd
>> root      6832     1  0 May16 ?        00:02:10 /usr/sbin/glusterd -p
>> /var/run/glusterd.pid --log-level INFO
>> root     17841     1  0 May16 ?        00:00:58 /usr/sbin/glusterfs
>> --process-name fuse --volfile-server=gfs1 --volfile-id=/gvol0
>> /mnt/glusterfs
>>
>> Here are the files created on the new arbiter server:
>> # find /nodirectwritedata/gluster/gvol0 | xargs ls -ald
>> drwxr-xr-x 3 root root 4096 May 21 20:15
>> /nodirectwritedata/gluster/gvol0
>> drw------- 2 root root 4096 May 21 20:15
>> /nodirectwritedata/gluster/gvol0/.glusterfs
>>
>> Thank you for your help!
>>
>>
>> On Tue, 21 May 2019 at 00:10, Sanju Rakonde <srakonde at redhat.com> wrote:
>>
>> David,
>>
>> Can you please attach the glusterd logs? As the error message
>> says, the commit failed on the arbiter node, so we might be able to
>> find some issue on that node.
>>
>> On Mon, May 20, 2019 at 10:10 AM Nithya Balachandran
>> <nbalacha at redhat.com> wrote:
>>
>>
>>
>> On Fri, 17 May 2019 at 06:01, David Cunningham
>> <dcunningham at voisonics.com> wrote:
>>
>> Hello,
>>
>> We're adding an arbiter node to an existing volume
>> and having an issue. Can anyone help? The root cause
>> error appears to be
>> "00000000-0000-0000-0000-000000000001: failed to
>> resolve (Transport endpoint is not connected)", as below.
>>
>> We are running glusterfs 5.6.1. Thanks in advance for
>> any assistance!
>>
>> On existing node gfs1, trying to add new arbiter node
>> gfs3:
>>
>> # gluster volume add-brick gvol0 replica 3 arbiter 1
>> gfs3:/nodirectwritedata/gluster/gvol0
>> volume add-brick: failed: Commit failed on gfs3.
>> Please check log file for details.
>>
>>
>> This looks like a glusterd issue. Please check the
>> glusterd logs for more info.
>> Adding the glusterd dev to this thread. Sanju, can you
>> take a look?
>> Regards,
>> Nithya
>>
>>
>> On new node gfs3 in gvol0-add-brick-mount.log:
>>
>> [2019-05-17 01:20:22.689721] I
>> [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE
>> inited with protocol versions: glusterfs 7.24 kernel 7.22
>> [2019-05-17 01:20:22.689778] I
>> [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched
>> to graph 0
>> [2019-05-17 01:20:22.694897] E
>> [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first
>> lookup on root failed (Transport endpoint is not
>> connected)
>> [2019-05-17 01:20:22.699770] W
>> [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse:
>> 00000000-0000-0000-0000-000000000001: failed to
>> resolve (Transport endpoint is not connected)
>> [2019-05-17 01:20:22.699834] W
>> [fuse-bridge.c:3294:fuse_setxattr_resume]
>> 0-glusterfs-fuse: 2: SETXATTR
>> 00000000-0000-0000-0000-000000000001/1
>> (trusted.add-brick) resolution failed
>> [2019-05-17 01:20:22.715656] I
>> [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse:
>> initating unmount of /tmp/mntQAtu3f
>> [2019-05-17 01:20:22.715865] W
>> [glusterfsd.c:1500:cleanup_and_exit]
>> (-->/lib64/libpthread.so.0(+0x7dd5) [0x7fb223bf6dd5]
>> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5)
>> [0x560886581e75]
>> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b)
>> [0x560886581ceb] ) 0-: received signum (15), shutting
>> down
>> [2019-05-17 01:20:22.715926] I
>> [fuse-bridge.c:5914:fini] 0-fuse: Unmounting
>> '/tmp/mntQAtu3f'.
>> [2019-05-17 01:20:22.715953] I
>> [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse
>> connection to '/tmp/mntQAtu3f'.
>>
>> Processes running on new node gfs3:
>>
>> # ps -ef | grep gluster
>> root      6832     1  0 20:17 ?        00:00:00
>> /usr/sbin/glusterd -p /var/run/glusterd.pid
>> --log-level INFO
>> root     15799     1  0 20:17 ?        00:00:00
>> /usr/sbin/glusterfs -s localhost --volfile-id
>> gluster/glustershd -p
>> /var/run/gluster/glustershd/glustershd.pid -l
>> /var/log/glusterfs/glustershd.log -S
>> /var/run/gluster/24c12b09f93eec8e.socket
>> --xlator-option
>> *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412
>> --process-name glustershd
>> root     16856 16735  0 21:21 pts/0    00:00:00 grep
>> --color=auto gluster
>>
>> --
>> David Cunningham, Voisonics Limited
>> http://voisonics.com/
>> USA: +1 213 221 1092
>> New Zealand: +64 (0)28 2558 3782
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>> --
>> Thanks,
>> Sanju
>>
>>
>>
>> --
>> David Cunningham, Voisonics Limited
>> http://voisonics.com/
>> USA: +1 213 221 1092
>> New Zealand: +64 (0)28 2558 3782
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782