Hi Xavi,
I stopped glusterd, ran killall glusterd glusterfs glusterfsd,
and started glusterd again.
The only log that is not empty is glusterd.log; I attach the log
from the restart time. The brick log, glustershd.log and glfsheal-gds-common.log
are all empty.
These are the errors in the log:
[2023-02-20 07:23:46.235263 +0000] E [MSGID: 106061]
[glusterd.c:597:glusterd_crt_georep_folders] 0-glusterd: Dict get failed
[{Key=log-group}, {errno=2}, {error=No such file or directory}]
[2023-02-20 07:23:47.359917 +0000] E [MSGID: 106010]
[glusterd-utils.c:3542:glusterd_compare_friend_volume] 0-management: Version of
Cksums gds-common differ. local cksum = 3017846959, remote cksum = 2065453698 on
peer urd-gds-031
[2023-02-20 07:23:47.438052 +0000] E [MSGID: 106010]
[glusterd-utils.c:3542:glusterd_compare_friend_volume] 0-management: Version of
Cksums gds-common differ. local cksum = 3017846959, remote cksum = 2065453698 on
peer urd-gds-032
Geo-replication is not set up, so I guess the error regarding georep
is nothing strange.
The checksum error seems natural, as the other nodes are still on
version 10.
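For reference, the mismatch can be checked directly from the cksum files glusterd keeps on disk. A minimal sketch, assuming the usual layout (/var/lib/glusterd/vols/<volname>/cksum holding a line like info=<number> -- verify the path and format on your systems); simulated here with temporary files so it runs anywhere:

```shell
# Simulated cksum files (real ones would be read from
# /var/lib/glusterd/vols/gds-common/cksum on each node -- an
# assumption about the layout, check before relying on it).
tmp=$(mktemp -d)
echo "info=3017846959" > "$tmp/cksum_arbiter"   # urd-gds-030, version 11
echo "info=2065453698" > "$tmp/cksum_peer"      # urd-gds-031, version 10

# Extract the value after "info=" from each file.
local_ck=$(awk -F= '$1 == "info" {print $2}' "$tmp/cksum_arbiter")
remote_ck=$(awk -F= '$1 == "info" {print $2}' "$tmp/cksum_peer")

if [ "$local_ck" != "$remote_ck" ]; then
    echo "cksum mismatch: local=$local_ck remote=$remote_ck"
fi
rm -rf "$tmp"
```

When the values differ between peers, as in the log above, glusterd rejects the incoming friend request, which matches the peer state shown below.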
My previous experience with upgrades is that the local bricks start and
gluster is up and running, with no connection to the other nodes until
they are upgraded as well.
gluster peer status gives the output:
Number of Peers: 2
Hostname: urd-gds-032
Uuid: e6f96ad2-0fea-4d80-bd42-8236dd0f8439
State: Peer Rejected (Connected)
Hostname: urd-gds-031
Uuid: 2d7c0ad7-dfcf-4eaf-9210-f879c7b406bf
State: Peer Rejected (Connected)
I suppose this is because the arbiter is on version 11
and the other 2 nodes are on version 10.
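Just to make the state explicit, the rejected peers can be counted straight from that output (simulated here with a here-doc so the snippet runs without a cluster; on a node one would pipe gluster peer status instead):

```shell
# Paste of the `gluster peer status` output shown above.
status=$(cat <<'EOF'
Number of Peers: 2

Hostname: urd-gds-032
Uuid: e6f96ad2-0fea-4d80-bd42-8236dd0f8439
State: Peer Rejected (Connected)

Hostname: urd-gds-031
Uuid: 2d7c0ad7-dfcf-4eaf-9210-f879c7b406bf
State: Peer Rejected (Connected)
EOF
)

# Count peers in the rejected state.
rejected=$(printf '%s\n' "$status" | grep -c 'Peer Rejected')
echo "rejected peers: $rejected"   # prints: rejected peers: 2
```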
Please let me know if I can provide any other information
to try to solve this issue.
Many thanks!
Marcus
On Mon, Feb 20, 2023 at 07:29:20AM +0100, Xavi Hernandez wrote:
>
>
> Hi Marcus,
>
> these errors shouldn't prevent the bricks from starting. Isn't
there any other error or warning?
>
> Regards,
>
> Xavi
>
> On Fri, Feb 17, 2023 at 3:06 PM Marcus Pedersén <marcus.pedersen at
slu.se> wrote:
> Hi all,
> I started an upgrade to gluster 11.0 from 10.3 on one of my clusters.
> OS: Debian bullseye
>
> Volume Name: gds-common
> Type: Replicate
> Volume ID: 42c9fa00-2d57-4a58-b5ae-c98c349cfcb6
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: urd-gds-031:/urd-gds/gds-common
> Brick2: urd-gds-032:/urd-gds/gds-common
> Brick3: urd-gds-030:/urd-gds/gds-common (arbiter)
> Options Reconfigured:
> cluster.granular-entry-heal: on
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> I started with the arbiter node, stopped all of gluster
> upgraded to 11.0 and all went fine.
> After upgrade I was able to see the other nodes and
> all nodes were connected.
> After a reboot on the arbiter nothing works the way it should.
> Both brick1 and brick2 have a connection but no connection
with the arbiter.
> On the arbiter glusterd has started and is listening on port 24007,
> the problem seems to be glusterfsd, it never starts!
>
> If I run: gluster volume status
>
> Status of volume: gds-common
> Gluster process                              TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick urd-gds-030:/urd-gds/gds-common        N/A       N/A        N       N/A
> Self-heal Daemon on localhost                N/A       N/A        N       N/A
>
> Task Status of Volume gds-common
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
> In glusterd.log I find the following errors (arbiter node):
> [2023-02-17 12:30:40.519585 +0000] E [gf-io-uring.c:404:gf_io_uring_setup]
0-io: [MSGID:101240] Function call failed <{function=io_uring_setup()},
{error=12 (Cannot allocate memory)}>
> [2023-02-17 12:30:40.678031 +0000] E [MSGID: 106061]
[glusterd.c:597:glusterd_crt_georep_folders] 0-glusterd: Dict get failed
[{Key=log-group}, {errno=2}, {error=No such file or directory}]
>
> In brick/urd-gds-gds-common.log I find the following error:
> [2023-02-17 12:30:43.550753 +0000] E [gf-io-uring.c:404:gf_io_uring_setup]
0-io: [MSGID:101240] Function call failed <{function=io_uring_setup()},
{error=12 (Cannot allocate memory)}>
>
> I enclose both logfiles.
>
> How do I resolve this issue??
>
> Many thanks in advance!!
>
> Marcus
> ---
> E-mailing SLU will result in SLU processing your personal data. For more
information on how this is done, click here
<https://www.slu.se/en/about-slu/contact-slu/personal-data/>
> ________
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
--- attached glusterd.log (from the restart) ---
[2023-02-20 07:23:22.343689 +0000] W [glusterfsd.c:1427:cleanup_and_exit]
(-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7) [0x7fe540933ea7]
-->/usr/sbin/glusterd(+0x125f5) [0x562911f155f5]
-->/usr/sbin/glusterd(cleanup_and_exit+0x57) [0x562911f0dd77] ) 0-: received
signum (15), shutting down
[2023-02-20 07:23:46.161159 +0000] I [MSGID: 100030] [glusterfsd.c:2872:main]
0-/usr/sbin/glusterd: Started running version [{arg=/usr/sbin/glusterd},
{version=11.0}, {cmdlinestr=/usr/sbin/glusterd -p /var/run/glusterd.pid
--log-level INFO}]
[2023-02-20 07:23:46.161529 +0000] I [glusterfsd.c:2562:daemonize] 0-glusterfs:
Pid of current running process is 291401
[2023-02-20 07:23:46.163500 +0000] I [MSGID: 0] [glusterfsd.c:1597:volfile_init]
0-glusterfsd-mgmt: volume not found, continuing with init
[2023-02-20 07:23:46.186377 +0000] I [MSGID: 106479] [glusterd.c:1660:init]
0-management: Using /var/lib/glusterd as working directory
[2023-02-20 07:23:46.186419 +0000] I [MSGID: 106479] [glusterd.c:1664:init]
0-management: Using /var/run/gluster as pid file working directory
[2023-02-20 07:23:46.191506 +0000] I [socket.c:973:__socket_server_bind]
0-socket.management: process started listening on port (24007)
[2023-02-20 07:23:46.192171 +0000] I [socket.c:916:__socket_server_bind]
0-socket.management: closing (AF_UNIX) reuse check socket 13
[2023-02-20 07:23:46.192350 +0000] I [MSGID: 106059] [glusterd.c:1923:init]
0-management: max-port override: 60999
[2023-02-20 07:23:46.235263 +0000] E [MSGID: 106061]
[glusterd.c:597:glusterd_crt_georep_folders] 0-glusterd: Dict get failed
[{Key=log-group}, {errno=2}, {error=No such file or directory}]
[2023-02-20 07:23:47.237899 +0000] I [MSGID: 106513]
[glusterd-store.c:2198:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 100000
[2023-02-20 07:23:47.245819 +0000] W [MSGID: 106204]
[glusterd-store.c:3273:glusterd_store_update_volinfo] 0-management: Unknown key:
tier-enabled
[2023-02-20 07:23:47.245877 +0000] W [MSGID: 106204]
[glusterd-store.c:3273:glusterd_store_update_volinfo] 0-management: Unknown key:
brick-0
[2023-02-20 07:23:47.245892 +0000] W [MSGID: 106204]
[glusterd-store.c:3273:glusterd_store_update_volinfo] 0-management: Unknown key:
brick-1
[2023-02-20 07:23:47.245904 +0000] W [MSGID: 106204]
[glusterd-store.c:3273:glusterd_store_update_volinfo] 0-management: Unknown key:
brick-2
[2023-02-20 07:23:47.250826 +0000] I [MSGID: 106544]
[glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
7290862d-4e05-4ff7-ae4d-5f36b1c933bc
[2023-02-20 07:23:47.286862 +0000] I [MSGID: 106498]
[glusterd-handler.c:3794:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2023-02-20 07:23:47.291579 +0000] I [MSGID: 106498]
[glusterd-handler.c:3794:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2023-02-20 07:23:47.291640 +0000] W [MSGID: 106061]
[glusterd-handler.c:3589:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2023-02-20 07:23:47.291704 +0000] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2023-02-20 07:23:47.293158 +0000] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
Final graph:
+------------------------------------------------------------------------------+
1: volume management
2: type mgmt/glusterd
3: option rpc-auth.auth-glusterfs on
4: option rpc-auth.auth-unix on
5: option rpc-auth.auth-null on
6: option rpc-auth-allow-insecure on
7: option transport.listen-backlog 1024
8: option max-port 60999
9: option event-threads 1
10: option ping-timeout 0
11: option transport.socket.listen-port 24007
12: option transport.socket.read-fail-log off
13: option transport.socket.keepalive-interval 2
14: option transport.socket.keepalive-time 10
15: option transport-type socket
16: option working-directory /var/lib/glusterd
17: end-volume
18:
+------------------------------------------------------------------------------+
[2023-02-20 07:23:47.293147 +0000] W [MSGID: 106061]
[glusterd-handler.c:3589:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2023-02-20 07:23:47.297871 +0000] I [MSGID: 101188]
[event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with
index [{index=0}]
[2023-02-20 07:23:47.299977 +0000] I [MSGID: 106163]
[glusterd-handshake.c:1493:__glusterd_mgmt_hndsk_versions_ack] 0-management:
using the op-version 100000
[2023-02-20 07:23:47.350835 +0000] I [MSGID: 106163]
[glusterd-handshake.c:1493:__glusterd_mgmt_hndsk_versions_ack] 0-management:
using the op-version 100000
[2023-02-20 07:23:47.359667 +0000] I [MSGID: 106490]
[glusterd-handler.c:2691:__glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: 2d7c0ad7-dfcf-4eaf-9210-f879c7b406bf
[2023-02-20 07:23:47.359917 +0000] E [MSGID: 106010]
[glusterd-utils.c:3542:glusterd_compare_friend_volume] 0-management: Version of
Cksums gds-common differ. local cksum = 3017846959, remote cksum = 2065453698 on
peer urd-gds-031
[2023-02-20 07:23:47.360263 +0000] I [MSGID: 106493]
[glusterd-handler.c:3982:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to
urd-gds-031 (0), ret: 0, op_ret: -1
[2023-02-20 07:23:47.377631 +0000] I [MSGID: 106493]
[glusterd-rpc-ops.c:461:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from
uuid: e6f96ad2-0fea-4d80-bd42-8236dd0f8439, host: urd-gds-032, port: 0
[2023-02-20 07:23:47.386520 +0000] I [MSGID: 106493]
[glusterd-rpc-ops.c:461:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from
uuid: 2d7c0ad7-dfcf-4eaf-9210-f879c7b406bf, host: urd-gds-031, port: 0
[2023-02-20 07:23:47.437850 +0000] I [MSGID: 106490]
[glusterd-handler.c:2691:__glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: e6f96ad2-0fea-4d80-bd42-8236dd0f8439
[2023-02-20 07:23:47.438052 +0000] E [MSGID: 106010]
[glusterd-utils.c:3542:glusterd_compare_friend_volume] 0-management: Version of
Cksums gds-common differ. local cksum = 3017846959, remote cksum = 2065453698 on
peer urd-gds-032
[2023-02-20 07:23:47.438328 +0000] I [MSGID: 106493]
[glusterd-handler.c:3982:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to
urd-gds-032 (0), ret: 0, op_ret: -1
[2023-02-20 07:23:57.076740 +0000] I [MSGID: 106061]
[glusterd-utils.c:9577:glusterd_volume_status_copy_to_op_ctx_dict] 0-management:
Dict get failed [{Key=count}]
[2023-02-20 07:23:57.076978 +0000] I [MSGID: 106499]
[glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received
status volume req for volume gds-common
[2023-02-20 07:25:22.608430 +0000] I [MSGID: 106487]
[glusterd-handler.c:1452:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
[2023-02-20 07:26:39.156882 +0000] I [MSGID: 106061]
[glusterd-utils.c:9577:glusterd_volume_status_copy_to_op_ctx_dict] 0-management:
Dict get failed [{Key=count}]
[2023-02-20 07:26:39.157119 +0000] I [MSGID: 106499]
[glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received
status volume req for volume gds-common
[2023-02-20 07:27:22.923216 +0000] I [MSGID: 106487]
[glusterd-handler.c:1452:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req