Anuradha Talur
2016-Feb-16 10:53 UTC
[Gluster-users] question about sync replicate volume after rebooting one node
----- Original Message -----
> From: "songxin" <songxin_1980 at 126.com>
> To: gluster-users at gluster.org
> Sent: Tuesday, February 16, 2016 3:59:50 PM
> Subject: [Gluster-users] question about sync replicate volume after rebooting one node
>
> Hi,
> I have a question about how to sync a volume between two bricks after one
> node is rebooted.
>
> There are two nodes, node A and node B. Node A's IP is 128.124.10.1 and
> node B's IP is 128.124.10.2.
>
> Operation steps on node A:
> 1. gluster peer probe 128.124.10.2
> 2. mkdir -p /data/brick/gv0
> 3. gluster volume create gv0 replica 2 128.124.10.1:/data/brick/gv0 128.124.10.2:/data/brick/gv1 force
> 4. gluster volume start gv0
> 5. mount -t glusterfs 128.124.10.1:/gv0 gluster
>
> Operation steps on node B:
> 1. mkdir -p /data/brick/gv0
> 2. mount -t glusterfs 128.124.10.1:/gv0 gluster
>
> After all the steps above, several gluster processes -- glusterd, glusterfs
> and glusterfsd -- are running on both node A and node B. I can see these
> services with "ps aux | grep gluster" and with "gluster volume status".
>
> Now I reboot node B. After B reboots, no gluster services are running on
> it. After I run "systemctl start glusterd", only the glusterd service is
> running on node B, not glusterfs or glusterfsd. Because glusterfs and
> glusterfsd are not running, I can't run "gluster volume heal gv0 full".
>
> I want to know why glusterd doesn't start glusterfs and glusterfsd.

On starting glusterd, glusterfsd should have started by itself.
Could you share the glusterd and brick logs (on node B) so that we know why
glusterfsd didn't start?

Do you still see the glusterfsd service running on node A? You can try
running "gluster v start <VOLNAME> force" on one of the nodes and check
whether all the brick processes started.

"gluster volume status <VOLNAME>" should be able to provide you with the
status of the gluster processes.

On restarting the node, the glusterfs process for the mount won't start by
itself. You will have to run step 2 on node B again for it.

> How do I restart these services on node B?
> How do I sync the replicate volume after one node reboots?

Once the glusterfsd process starts on node B too, glustershd -- the
self-heal daemon -- for the replicate volume should start healing/syncing
the files that need to be synced. This daemon does periodic syncing of
files.

If you want to trigger a heal explicitly, you can run
"gluster volume heal <VOLNAME>" on one of the servers.

> Thanks,
> Xin
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Thanks,
Anuradha.
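Putting those suggestions together, a minimal recovery sketch for node B (using the volume name gv0, the mount point "gluster", and the addresses from the original post) would be:

    # on node B, after the reboot
    systemctl start glusterd                         # start the management daemon
    gluster volume start gv0 force                   # force-start any brick (glusterfsd) process that did not come up
    gluster volume status gv0                        # confirm the brick on 128.124.10.2 shows Online = Y
    mount -t glusterfs 128.124.10.1:/gv0 gluster     # re-create the client mount (step 2 on node B)
    gluster volume heal gv0                          # optional: trigger a heal instead of waiting for glustershd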
songxin
2016-Feb-17 02:53 UTC
[Gluster-users] question about sync replicate volume after rebooting one node
Hi,

Thank you for your immediate and detailed reply. I have a few more questions
about glusterfs.

Node A's IP is 128.224.162.163. Node B's IP is 128.224.162.250.

1. After rebooting node B and starting the glusterd service, the glusterd log
is as below.

...
[2015-12-07 07:54:55.743966] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-12-07 07:54:55.744026] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-12-07 07:54:55.744280] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30706
[2015-12-07 07:54:55.773606] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: b6efd8fc-5eab-49d4-a537-2750de644a44
[2015-12-07 07:54:55.777994] E [MSGID: 101076] [common-utils.c:2954:gf_get_hostname_from_ip] 0-common-utils: Could not lookup hostname of 128.224.162.163 : Temporary failure in name resolution
[2015-12-07 07:54:55.778290] E [MSGID: 106010] [glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of Cksums gv0 differ. local cksum = 2492237955, remote cksum = 4087388312 on peer 128.224.162.163
[2015-12-07 07:54:55.778384] I [MSGID: 106493] [glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 128.224.162.163 (0), ret: 0
[2015-12-07 07:54:55.928774] I [MSGID: 106493] [glusterd-rpc-ops.c:480:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: b6efd8fc-5eab-49d4-a537-2750de644a44, host: 128.224.162.163, port: 0
...

When I run "gluster peer status" on node B, it shows:

Number of Peers: 1

Hostname: 128.224.162.163
Uuid: b6efd8fc-5eab-49d4-a537-2750de644a44
State: Peer Rejected (Connected)

When I run "gluster volume status" on node A, it shows:

Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 128.224.162.163:/home/wrsadmin/work/t
mp/data/brick/gv0                           49152     0          Y       13019
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       13045

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

It looks like the glusterfsd service is OK on node A. Is it because the peer
state is Rejected that glusterd didn't start glusterfsd? What causes this
problem?

2. Is glustershd (the self-heal daemon) the process below?

root 497 0.8 0.0 432520 18104 ? Ssl 08:07 0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/gluster ..

If it is, I want to know whether glustershd is also the glusterfsd binary,
just like glusterd and glusterfs.

Thanks,
Xin
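One way to see the configuration mismatch behind the "Version of Cksums gv0 differ" message is to compare the volume checksum that glusterd stores on each node (a sketch assuming the default working directory /var/lib/glusterd):

    # run on node A and on node B; the values should match for a healthy peer
    cat /var/lib/glusterd/vols/gv0/cksum     # per-volume configuration checksum kept by glusterd
    gluster peer status                      # reports "Peer Rejected (Connected)" while the configs disagree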
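For question 2, a quick way to check which binary the self-heal daemon process actually runs (paths assume a typical install under /usr/sbin):

    # resolve the executable behind the glustershd process
    pid=$(pgrep -f 'volfile-id gluster/glustershd')   # assumes a single glustershd process is running
    readlink -f /proc/$pid/exe                        # resolved binary path
    ls -l /usr/sbin/glusterd /usr/sbin/glusterfs /usr/sbin/glusterfsd   # shows whether these are links to one binary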