gandalf istari
2013-Nov-26 07:17 UTC
[Gluster-users] Resync or how to force the replication
Hi, I have set up a two-node replicated GlusterFS volume. After the initial installation the "master" node was put into the datacenter, and after two weeks we moved the second one there as well. But the sync has not started yet.

On the "master":

gluster> volume info all

Volume Name: datastore1
Type: Replicate
Volume ID: fdff5190-85ef-4cba-9056-a6bbbd8d6863
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nas-01-data:/datastore
Brick2: nas-02-data:/datastore

gluster> peer status

Number of Peers: 1

Hostname: nas-02-data
Uuid: 71df9f86-a87b-481d-896c-c0d4ab679cfa
State: Peer in Cluster (Connected)

On the "slave":

gluster> peer status

Number of Peers: 1

Hostname: 192.168.70.6
Uuid: 97ef0154-ad7b-402a-b0cb-22be09134a3c
State: Peer in Cluster (Connected)

gluster> volume status all

Status of volume: datastore1
Gluster process                          Port    Online  Pid
------------------------------------------------------------------------------
Brick nas-01-data:/datastore             49152   Y       2130
Brick nas-02-data:/datastore             N/A     N       N/A
NFS Server on localhost                  2049    Y       8064
Self-heal Daemon on localhost            N/A     Y       8073
NFS Server on 192.168.70.6               2049    Y       3379
Self-heal Daemon on 192.168.70.6         N/A     Y       3384

There are no active volume tasks

I would like to run the following on the "slave":

gluster volume sync nas-01-data datastore1

But then the hosted virtual machines will be unavailable. Is there another way to start the replication?

Thanks
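For anyone reading this later: before forcing anything, glusterfs 3.4 can report what replication work is still pending. The commands below are standard gluster CLI calls and are not part of the original post; the volume name is taken from the output above.

# Run on either node: per-brick process state for this volume only
gluster volume status datastore1

# Run on either node: files AFR still needs to copy to the other brick
gluster volume heal datastore1 info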
M S Vishwanath Bhat
2013-Nov-26 13:46 UTC
[Gluster-users] Resync or how to force the replication
On 26/11/13 12:47, gandalf istari wrote:
> Hi, I have set up a two-node replicated GlusterFS volume. After the initial
> installation the "master" node was put into the datacenter, and after two
> weeks we moved the second one there as well.
>
> But the sync has not started yet.
>
> On the "master":
>
> gluster> volume info all
> Volume Name: datastore1
> Type: Replicate
> Volume ID: fdff5190-85ef-4cba-9056-a6bbbd8d6863
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: nas-01-data:/datastore
> Brick2: nas-02-data:/datastore
>
> gluster> peer status
> Number of Peers: 1
> Hostname: nas-02-data
> Uuid: 71df9f86-a87b-481d-896c-c0d4ab679cfa
> State: Peer in Cluster (Connected)
>
> On the "slave":
>
> gluster> peer status
> Number of Peers: 1
> Hostname: 192.168.70.6
> Uuid: 97ef0154-ad7b-402a-b0cb-22be09134a3c
> State: Peer in Cluster (Connected)
>
> gluster> volume status all
> Status of volume: datastore1
> Gluster process                          Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick nas-01-data:/datastore             49152   Y       2130
> Brick nas-02-data:/datastore             N/A     N       N/A
> NFS Server on localhost                  2049    Y       8064
> Self-heal Daemon on localhost            N/A     Y       8073
> NFS Server on 192.168.70.6               2049    Y       3379
> Self-heal Daemon on 192.168.70.6         N/A     Y       3384

Which version of glusterfs are you running? The volume status suggests that the second brick (nas-02-data:/datastore) is not running. Can you run "gluster volume start <volname> force" on either of the two nodes and try again? You would then also need to run `find . | xargs stat` on the mountpoint of the volume. That should trigger the self-heal.

> There are no active volume tasks
>
> I would like to run the following on the "slave":
>
> gluster volume sync nas-01-data datastore1

BTW, there is no concept of "master" and "slave" in AFR (replication). There is, however, a concept of "master volume" and "slave volume" in gluster geo-replication.

> But then the hosted virtual machines will be unavailable. Is there another
> way to start the replication?
>
> Thanks
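To make the suggestion above concrete, here is a minimal sketch of the two steps. The mountpoint /mnt/datastore1 is a hypothetical example; it is not mentioned anywhere in this thread, and if the volume is already mounted for the VMs the find can simply be run there instead.

# 1. On either node: force-start the volume so the missing brick process is spawned
gluster volume start datastore1 force

# 2. On a client: walk the FUSE mount so every file gets looked up,
#    which queues out-of-sync files for self-heal
mount -t glusterfs nas-01-data:/datastore1 /mnt/datastore1
cd /mnt/datastore1 && find . | xargs stat > /dev/null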
gandalf istari
2013-Nov-26 14:06 UTC
[Gluster-users] Resync or how to force the replication
Hi, thanks for the quick answer.

I'm running glusterfs 3.4.1.

[root@nas-02 datastore]# gluster volume start datastore1 force
volume start: datastore1: failed: Failed to get extended attribute trusted.glusterfs.volume-id for brick dir /datastore. Reason : No data available

It seems that the .glusterfs directory is missing for some reason. Should I run

volume replace-brick datastore1 nas-01-data:/datastore nas-02-data:/datastore commit force

to rebuild/replace the missing brick? I'm quite new to glusterfs.

Thanks
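Before reaching for replace-brick, the brick directory itself can be inspected to confirm what exactly is missing. A minimal sketch, run as root on each node (getfattr comes from the attr package; the paths are the brick paths from the volume info above):

# Show all extended attributes on the brick root; a healthy replica brick
# carries trusted.glusterfs.volume-id, trusted.gfid and the AFR changelog xattrs
getfattr -d -e hex -m . /datastore

# A healthy brick also contains a hidden .glusterfs directory
ls -ld /datastore/.glusterfs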
gandalf istari
2013-Nov-27 06:53 UTC
[Gluster-users] Resync or how to force the replication
Hi Shwetha,

[root@nas-01 ~]# getfattr -d -e hex -m . /datastore
getfattr: Removing leading '/' from absolute path names
# file: datastore
trusted.afr.datastore1-client-0=0x000000000000000000000000
trusted.afr.datastore1-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0xfdff519085ef4cba9056a6bbbd8d6863

[root@nas-02 ~]# getfattr -d -e hex -m . /datastore
getfattr: Removing leading '/' from absolute path names
# file: datastore
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000

I don't understand what happened.

gr
Patrick

> Hi Gandalf,
>
> Can you run the following command on the brick path?
>
> "getfattr -d -e hex -m . /datastore"
>
> on both the "nas-01-data" and "nas-02-data" nodes. This will let us know
> whether "trusted.glusterfs.volume-id" is set.
>
> -Shwetha
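For reference, the value nas-02 is missing can be read off the volume info earlier in the thread: trusted.glusterfs.volume-id is simply the Volume ID with the dashes stripped and a 0x prefix. The awk/tr pipeline below is only an illustration, not something from the thread:

# Print the hex value expected in trusted.glusterfs.volume-id for datastore1
VOLID=$(gluster volume info datastore1 | awk '/^Volume ID:/ {print $3}')
echo "0x$(echo "$VOLID" | tr -d '-')"
# -> 0xfdff519085ef4cba9056a6bbbd8d6863, the same value shown on nas-01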
gandalf istari
2013-Nov-27 07:37 UTC
[Gluster-users] Resync or how to force the replication
Thank you so much. It seems to be working: the directories are now created but still empty. I suppose this will take a while to sync 44 GB.

The only change I made was to leave out nas-02-data from your command.

What's the best way now to monitor the sync process?

> You couldn't force start the volume because the brick "nas-02-data:/datastore"
> doesn't have the "trusted.glusterfs.volume-id" set.
>
> From the nas-02 node, execute:
>
> 1. setfattr -n trusted.glusterfs.volume-id -v 0xfdff519085ef4cba9056a6bbbd8d6863 nas-02-data:/datastore
> 2. gluster volume start datastore1 force
>
> -Shwetha
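Since setfattr works on a local path rather than a host:path pair, the "leave out nas-02-data" change described above presumably came down to running the following on nas-02 (reconstructed from the description, not quoted from the poster):

# On nas-02, as root: restore the volume-id xattr on the local brick directory
setfattr -n trusted.glusterfs.volume-id -v 0xfdff519085ef4cba9056a6bbbd8d6863 /datastore

# Then force-start the volume so the brick process comes up
gluster volume start datastore1 force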
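On the monitoring question: with glusterfs 3.4 the self-heal backlog can be watched from either node. A minimal sketch; the 60-second interval and the du cross-check are arbitrary choices, not advice from the thread:

# Files still queued for self-heal, listed per brick; the list shrinks as the 44 GB syncs
gluster volume heal datastore1 info

# Re-run it periodically, e.g. every 60 seconds
watch -n 60 "gluster volume heal datastore1 info"

# Rough cross-check: compare used space on the two brick directories (run on each node)
du -sh /datastore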