Amudhan P
2019-Jan-17 06:04 UTC
[Gluster-users] glusterfs 4.1.6 error in starting glusterd service
I have created the folder in the path as said, but the service still failed to start. Below is the error msg in glusterd.log:

[2019-01-16 14:50:14.555742] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
[2019-01-16 14:50:14.559835] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
[2019-01-16 14:50:14.559894] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
[2019-01-16 14:50:14.559912] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
[2019-01-16 14:50:14.563834] W [MSGID: 103071] [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2019-01-16 14:50:14.563867] W [MSGID: 103055] [rdma.c:4938:init] 0-rdma.management: Failed to initialize IB Device
[2019-01-16 14:50:14.563882] W [rpc-transport.c:351:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2019-01-16 14:50:14.563957] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2019-01-16 14:50:14.563974] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2019-01-16 14:50:15.565868] I [MSGID: 106513] [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 40100
[2019-01-16 14:50:15.642532] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: d6bf51a7-c296-492f-8dac-e81efa9dd22d
[2019-01-16 14:50:15.675333] I [MSGID: 106498] [glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2019-01-16 14:50:15.675421] W [MSGID: 106061] [glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2019-01-16 14:50:15.675451] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
*[2019-01-16 14:50:15.676912] E [MSGID: 106187] [glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore*
*[2019-01-16 14:50:15.676956] E [MSGID: 101019] [xlator.c:720:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again*
[2019-01-16 14:50:15.676973] E [MSGID: 101066] [graph.c:367:glusterfs_graph_init] 0-management: initializing translator failed
[2019-01-16 14:50:15.676986] E [MSGID: 101176] [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
[2019-01-16 14:50:15.677479] W [glusterfsd.c:1514:cleanup_and_exit] (-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52] -->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41] -->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-: received signum (-1), shutting down

On Thu, Jan 17, 2019 at 8:06 AM Atin Mukherjee <amukherj at redhat.com> wrote:

> If gluster volume info/status shows the brick to be /media/disk4/brick4,
> then you'd need to mount the same path, and hence you'd need to create the
> brick4 directory explicitly. I fail to understand the rationale of how only
> /media/disk4 can be used as the mount path for the brick.
>
> On Wed, Jan 16, 2019 at 5:24 PM Amudhan P <amudhan83 at gmail.com> wrote:
>
>> Yes, I did mount the bricks, but the folder 'brick4' was still not created
>> inside the brick.
>> Do I need to create this folder? Because when I run replace-brick it will
>> create the folder inside the brick. I have seen this behavior before when
>> running replace-brick or when heal begins.
>>
>> On Wed, Jan 16, 2019 at 5:05 PM Atin Mukherjee <amukherj at redhat.com> wrote:
>>
>>> On Wed, Jan 16, 2019 at 5:02 PM Amudhan P <amudhan83 at gmail.com> wrote:
>>>
>>>> Atin,
>>>> I have copied the content of 'gfs-tst' from the vol folder on another node.
>>>> When starting the service, it again fails with this error msg in the glusterd.log file:
>>>>
>>>> [2019-01-15 20:16:59.513023] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>>>> [2019-01-15 20:16:59.517164] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>> [2019-01-15 20:16:59.517264] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
>>>> [2019-01-15 20:16:59.517283] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
>>>> [2019-01-15 20:16:59.521508] W [MSGID: 103071] [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>> [2019-01-15 20:16:59.521544] W [MSGID: 103055] [rdma.c:4938:init] 0-rdma.management: Failed to initialize IB Device
>>>> [2019-01-15 20:16:59.521562] W [rpc-transport.c:351:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>> [2019-01-15 20:16:59.521629] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>> [2019-01-15 20:16:59.521648] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>> [2019-01-15 20:17:00.529390] I [MSGID: 106513] [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 40100
>>>> [2019-01-15 20:17:00.608354] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: d6bf51a7-c296-492f-8dac-e81efa9dd22d
>>>> [2019-01-15 20:17:00.650911] W [MSGID: 106425] [glusterd-store.c:2643:glusterd_store_retrieve_bricks] 0-management: failed to get statfs() call on brick /media/disk4/brick4 [No such file or directory]
>>>>
>>>
>>> This means that the underlying brick /media/disk4/brick4 doesn't exist. You
>>> already mentioned that you had replaced the faulty disk, but have you not
>>> mounted it yet?
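A minimal sketch of mounting the replaced disk and recreating the brick directory discussed above; the device name /dev/sdd1 is a placeholder, not a detail taken from this thread:

    # on Node-3: mount the replaced (already formatted) disk at the path the volume expects
    mount /dev/sdd1 /media/disk4        # /dev/sdd1 is an assumed device name
    # a fresh filesystem is empty, so recreate the brick directory glusterd looks for
    mkdir -p /media/disk4/brick4
    # confirm the path now resolves, which is what the statfs() error complained about
    stat /media/disk4/brick4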
>>>> [2019-01-15 20:17:00.691240] I [MSGID: 106498] [glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>> [2019-01-15 20:17:00.691307] W [MSGID: 106061] [glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>> [2019-01-15 20:17:00.691331] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>> [2019-01-15 20:17:00.692547] E [MSGID: 106187] [glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
>>>> [2019-01-15 20:17:00.692582] E [MSGID: 101019] [xlator.c:720:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
>>>> [2019-01-15 20:17:00.692597] E [MSGID: 101066] [graph.c:367:glusterfs_graph_init] 0-management: initializing translator failed
>>>> [2019-01-15 20:17:00.692607] E [MSGID: 101176] [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
>>>> [2019-01-15 20:17:00.693004] W [glusterfsd.c:1514:cleanup_and_exit] (-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52] -->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41] -->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-: received signum (-1), shutting down
>>>>
>>>> On Wed, Jan 16, 2019 at 4:34 PM Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>
>>>>> This is a case of a partial write of a transaction: the host ran out
>>>>> of space on the root partition, where all the glusterd-related
>>>>> configuration is persisted, so the transaction couldn't be written and
>>>>> the new (replaced) brick's information wasn't persisted in the
>>>>> configuration. The workaround is to copy the content of
>>>>> /var/lib/glusterd/vols/gfs-tst/ from one of the nodes in the trusted
>>>>> storage pool to the node where the glusterd service fails to come up;
>>>>> after that, restarting the glusterd service should make peer status
>>>>> report all nodes as healthy and connected.
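A rough sketch of the workaround Atin describes above, assuming gfstst-node2 holds a healthy copy of the volume configuration and that the root partition now has free space; the use of rsync and the service command is illustrative only:

    # on the node where glusterd fails to start
    df -h /var/lib/glusterd    # make sure the root partition has free space again
    service glusterd stop
    # pull the volume configuration from a healthy peer (gfstst-node2 is an assumption)
    rsync -av gfstst-node2:/var/lib/glusterd/vols/gfs-tst/ /var/lib/glusterd/vols/gfs-tst/
    service glusterd start
    # afterwards, from any node, all peers should report connected
    gluster peer status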
>>>>>
>>>>> On Wed, Jan 16, 2019 at 3:49 PM Amudhan P <amudhan83 at gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> In short: when I started the glusterd service, I got the following
>>>>>> error msg in the glusterd.log file on one server.
>>>>>> What needs to be done?
>>>>>>
>>>>>> Error logged in glusterd.log:
>>>>>>
>>>>>> [2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>> [2019-01-15 17:50:13.960131] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>> [2019-01-15 17:50:13.960193] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>> [2019-01-15 17:50:13.960212] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
>>>>>> [2019-01-15 17:50:13.964437] W [MSGID: 103071] [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>> [2019-01-15 17:50:13.964474] W [MSGID: 103055] [rdma.c:4938:init] 0-rdma.management: Failed to initialize IB Device
>>>>>> [2019-01-15 17:50:13.964491] W [rpc-transport.c:351:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>> [2019-01-15 17:50:13.964560] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>> [2019-01-15 17:50:13.964579] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>> [2019-01-15 17:50:14.967681] I [MSGID: 106513] [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 40100
>>>>>> [2019-01-15 17:50:14.973931] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: d6bf51a7-c296-492f-8dac-e81efa9dd22d
>>>>>> [2019-01-15 17:50:15.046620] E [MSGID: 101032] [store.c:441:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/vols/gfs-tst/bricks/IP.3:-media-disk3-brick3. [No such file or directory]
>>>>>> [2019-01-15 17:50:15.046685] E [MSGID: 106201] [glusterd-store.c:3384:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: gfs-tst
>>>>>> [2019-01-15 17:50:15.046718] E [MSGID: 101019] [xlator.c:720:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
>>>>>> [2019-01-15 17:50:15.046732] E [MSGID: 101066] [graph.c:367:glusterfs_graph_init] 0-management: initializing translator failed
>>>>>> [2019-01-15 17:50:15.046741] E [MSGID: 101176] [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
>>>>>> [2019-01-15 17:50:15.047171] W [glusterfsd.c:1514:cleanup_and_exit] (-->/usr/local/sbin/glusterd(glusterfs_volumes
>>>>>>
>>>>>> In long: I am trying to simulate a situation where the volume stopped
>>>>>> abnormally and the entire cluster restarted with some missing disks.
>>>>>>
>>>>>> My test cluster is set up with 3 nodes, each with four disks, and I
>>>>>> have set up a volume with disperse 4+2.
>>>>>> In Node-3, 2 disks have failed; to replace them I shut down all systems.
>>>>>>
>>>>>> Below are the steps done.
>>>>>>
>>>>>> 1. umount from the client machine
>>>>>> 2. shut down all systems by running the `shutdown -h now` command (without stopping the volume or the service)
>>>>>> 3. replace the faulty disks in Node-3
>>>>>> 4. powered ON all systems
>>>>>> 5. format the replaced drives and mount all drives
>>>>>> 6. start the glusterd service on all nodes (success)
>>>>>> 7. Now running the `volume status` command from node-3.
>>>>>> output : [2019-01-15 16:52:17.718422] : v status : FAILED : Staging failed on 0083ec0c-40bf-472a-a128-458924e56c96. Please check log file for details.
>>>>>> 8. running the `volume start gfs-tst` command from node-3.
>>>>>> output : [2019-01-15 16:53:19.410252] : v start gfs-tst : FAILED : Volume gfs-tst already started
>>>>>>
>>>>>> 9. running `gluster v status` on another node shows all bricks available, but the 'self-heal daemon' is not running.
>>>>>> @gfstst-node2:~$ sudo gluster v status
>>>>>> Status of volume: gfs-tst
>>>>>> Gluster process                          TCP Port  RDMA Port  Online  Pid
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick IP.2:/media/disk1/brick1           49152     0          Y       1517
>>>>>> Brick IP.4:/media/disk1/brick1           49152     0          Y       1668
>>>>>> Brick IP.2:/media/disk2/brick2           49153     0          Y       1522
>>>>>> Brick IP.4:/media/disk2/brick2           49153     0          Y       1678
>>>>>> Brick IP.2:/media/disk3/brick3           49154     0          Y       1527
>>>>>> Brick IP.4:/media/disk3/brick3           49154     0          Y       1677
>>>>>> Brick IP.2:/media/disk4/brick4           49155     0          Y       1541
>>>>>> Brick IP.4:/media/disk4/brick4           49155     0          Y       1683
>>>>>> Self-heal Daemon on localhost            N/A       N/A        Y       2662
>>>>>> Self-heal Daemon on IP.4                 N/A       N/A        Y       2786
>>>>>>
>>>>>> 10. the above output says 'volume already started', so I ran the `reset-brick` command:
>>>>>> v reset-brick gfs-tst IP.3:/media/disk3/brick3 IP.3:/media/disk3/brick3 commit force
>>>>>> output : [2019-01-15 16:57:37.916942] : v reset-brick gfs-tst IP.3:/media/disk3/brick3 IP.3:/media/disk3/brick3 commit force : FAILED : /media/disk3/brick3 is already part of a volume
>>>>>>
>>>>>> 11. the reset-brick command was not working, so I tried stopping the volume and starting it with the force command.
>>>>>> output : [2019-01-15 17:01:04.570794] : v start gfs-tst force : FAILED : Pre-validation failed on localhost. Please check log file for details
>>>>>>
>>>>>> 12. then stopped the service on all nodes and tried starting it again. Except for node-3, the service on the other nodes started successfully without any issues.
>>>>>> On node-3 I receive the following message:
>>>>>> sudo service glusterd start
>>>>>>  * Starting glusterd service glusterd                            [fail]
>>>>>> /usr/local/sbin/glusterd: option requires an argument -- 'f'
>>>>>> Try `glusterd --help' or `glusterd --usage' for more information.
>>>>>>
>>>>>> 13. checking the glusterd log file, I found that the OS drive had run out of space.
>>>>>> output : [2019-01-15 16:51:37.210792] W [MSGID: 101012] [store.c:372:gf_store_save_value] 0-management: fflush failed. [No space left on device]
>>>>>> [2019-01-15 16:51:37.210874] E [MSGID: 106190] [glusterd-store.c:1058:glusterd_volume_exclude_options_write] 0-management: Unable to write volume values for gfs-tst
>>>>>> 14. cleared some space in the OS drive but still, the service is not running.
>>>>>> Below is the error logged in glusterd.log:
>>>>>>
>>>>>> [2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>> [2019-01-15 17:50:13.960131] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>> [2019-01-15 17:50:13.960193] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>> [2019-01-15 17:50:13.960212] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
>>>>>> [2019-01-15 17:50:13.964437] W [MSGID: 103071] [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>> [2019-01-15 17:50:13.964474] W [MSGID: 103055] [rdma.c:4938:init] 0-rdma.management: Failed to initialize IB Device
>>>>>> [2019-01-15 17:50:13.964491] W [rpc-transport.c:351:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>> [2019-01-15 17:50:13.964560] W [rpcsvc.c:1781:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>> [2019-01-15 17:50:13.964579] E [MSGID: 106244] [glusterd.c:1764:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>> [2019-01-15 17:50:14.967681] I [MSGID: 106513] [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 40100
>>>>>> [2019-01-15 17:50:14.973931] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: d6bf51a7-c296-492f-8dac-e81efa9dd22d
>>>>>> [2019-01-15 17:50:15.046620] E [MSGID: 101032] [store.c:441:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/vols/gfs-tst/bricks/IP.3:-media-disk3-brick3. [No such file or directory]
>>>>>> [2019-01-15 17:50:15.046685] E [MSGID: 106201] [glusterd-store.c:3384:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: gfs-tst
>>>>>> [2019-01-15 17:50:15.046718] E [MSGID: 101019] [xlator.c:720:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
>>>>>> [2019-01-15 17:50:15.046732] E [MSGID: 101066] [graph.c:367:glusterfs_graph_init] 0-management: initializing translator failed
>>>>>> [2019-01-15 17:50:15.046741] E [MSGID: 101176] [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
>>>>>> [2019-01-15 17:50:15.047171] W [glusterfsd.c:1514:cleanup_and_exit] (-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52] -->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41] -->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-: received signum (-1), shutting down
>>>>>> 15. On the other nodes, running `volume status` still shows node-3's bricks as live,
>>>>>> but `peer status` shows node-3 as disconnected.
>>>>>>
>>>>>> @gfstst-node2:~$ sudo gluster v status
>>>>>> Status of volume: gfs-tst
>>>>>> Gluster process                          TCP Port  RDMA Port  Online  Pid
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick IP.2:/media/disk1/brick1           49152     0          Y       1517
>>>>>> Brick IP.4:/media/disk1/brick1           49152     0          Y       1668
>>>>>> Brick IP.2:/media/disk2/brick2           49153     0          Y       1522
>>>>>> Brick IP.4:/media/disk2/brick2           49153     0          Y       1678
>>>>>> Brick IP.2:/media/disk3/brick3           49154     0          Y       1527
>>>>>> Brick IP.4:/media/disk3/brick3           49154     0          Y       1677
>>>>>> Brick IP.2:/media/disk4/brick4           49155     0          Y       1541
>>>>>> Brick IP.4:/media/disk4/brick4           49155     0          Y       1683
>>>>>> Self-heal Daemon on localhost            N/A       N/A        Y       2662
>>>>>> Self-heal Daemon on IP.4                 N/A       N/A        Y       2786
>>>>>>
>>>>>> Task Status of Volume gfs-tst
>>>>>> ------------------------------------------------------------------------------
>>>>>> There are no active volume tasks
>>>>>>
>>>>>> root at gfstst-node2:~$ sudo gluster pool list
>>>>>> UUID                                    Hostname        State
>>>>>> d6bf51a7-c296-492f-8dac-e81efa9dd22d    IP.3            Disconnected
>>>>>> c1cbb58e-3ceb-4637-9ba3-3d28ef20b143    IP.4            Connected
>>>>>> 0083ec0c-40bf-472a-a128-458924e56c96    localhost       Connected
>>>>>>
>>>>>> root at gfstst-node2:~$ sudo gluster peer status
>>>>>> Number of Peers: 2
>>>>>>
>>>>>> Hostname: IP.3
>>>>>> Uuid: d6bf51a7-c296-492f-8dac-e81efa9dd22d
>>>>>> State: Peer in Cluster (Disconnected)
>>>>>>
>>>>>> Hostname: IP.4
>>>>>> Uuid: c1cbb58e-3ceb-4637-9ba3-3d28ef20b143
>>>>>> State: Peer in Cluster (Connected)
>>>>>>
>>>>>> regards
>>>>>> Amudhan
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
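Regarding the reset-brick attempt in step 10 of the quoted message: reset-brick is typically run as a start/commit pair rather than a single `commit force`. A hedged sketch of that sequence for the replaced brick, assuming the new disk is already mounted at /media/disk3:

    # take the old brick definition offline
    gluster volume reset-brick gfs-tst IP.3:/media/disk3/brick3 start
    # recreate the empty brick directory on the freshly formatted disk
    mkdir -p /media/disk3/brick3
    # bring the same path back as the replacement brick
    gluster volume reset-brick gfs-tst IP.3:/media/disk3/brick3 IP.3:/media/disk3/brick3 commit force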
Atin Mukherjee
2019-Jan-17 10:13 UTC
[Gluster-users] glusterfs 4.1.6 error in starting glusterd service
Can you please run 'glusterd -LDEBUG' and share back the glusterd.log? Instead of doing too many back-and-forths, I suggest you share the content of /var/lib/glusterd from all the nodes. Also, please mention on which particular node the glusterd service is unable to come up.
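A sketch of how the requested information could be gathered; the default log location and the tarball naming below are assumptions:

    # on the node where glusterd fails to start: run it once with debug logging, as asked above
    service glusterd stop
    glusterd -LDEBUG
    # the resulting log to share (default location assumed)
    less /var/log/glusterfs/glusterd.log

    # on every node: pack up the glusterd working directory to share
    tar czf /tmp/glusterd-$(hostname).tar.gz /var/lib/glusterd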