Krist van Besien
2013-Nov-12 10:19 UTC
[Gluster-users] New to Gluster. Having trouble with server replacement.
Hello all,

I'm new to Gluster. In order to gain some knowledge and test a few things, I decided to install it on three servers and play around with it a bit.

My setup: three servers, dc1-09, dc2-09 and dc2-10, all running RHEL 6.4 and Gluster 3.4.0 (from RHS 2.1). Each server has three disks, mounted at /mnt/raid1, /mnt/raid2 and /mnt/raid3.

I created a distributed-replicated volume, test1, with two replicas:

[root@dc2-10 ~]# gluster volume info test1

Volume Name: test1
Type: Distributed-Replicate
Volume ID: 59049b52-9e25-4cc9-bebd-fb3587948900
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: dc1-09:/mnt/raid1/test1
Brick2: dc2-09:/mnt/raid2/test1
Brick3: dc2-09:/mnt/raid1/test1
Brick4: dc2-10:/mnt/raid2/test1
Brick5: dc2-10:/mnt/raid1/test1
Brick6: dc1-09:/mnt/raid2/test1

I mounted this volume on a fourth Unix server and started a small script that just keeps writing small files to it, in order to have some activity. Then I shut down one of the servers, started it again, shut down another, and so on. Gluster had no problem keeping the files available.

Then I decided to just nuke one server and completely reinitialise it. After reinstalling the OS and Gluster I had some trouble getting the server back into the pool. Following two hints I found on the internet, I put the old UUID back into glusterd.info and made sure the correct trusted.glusterfs.volume-id was set on all bricks. Now the new server is storing data again. A rough recap of what I did is below.
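
For reference, this is roughly how I set the volume up in the first place. I don't have the exact command any more, so this is reconstructed from memory; the brick order matches the volume info output above.

  gluster volume create test1 replica 2 transport tcp \
      dc1-09:/mnt/raid1/test1 dc2-09:/mnt/raid2/test1 \
      dc2-09:/mnt/raid1/test1 dc2-10:/mnt/raid2/test1 \
      dc2-10:/mnt/raid1/test1 dc1-09:/mnt/raid2/test1
  gluster volume start test1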
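
The load generator on the client is nothing fancy, just a loop along these lines (paraphrased; the mount point /mnt/test1 and the file names are only placeholders, not the real ones):

  # mount the volume with the native client
  mount -t glusterfs dc1-09:/test1 /mnt/test1

  # keep writing small files so there is always some write activity
  i=0
  while true; do
      echo "test $i $(date)" > /mnt/test1/file-$i.txt
      i=$((i + 1))
      sleep 1
  done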
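
And these are, roughly, the steps I followed to get the reinstalled server back into the pool. <OLD-UUID> stands for the UUID the node had before the reinstall (still visible in "gluster peer status" on the surviving nodes); the volume-id value is just the Volume ID from the info output above with the dashes removed.

  # stop glusterd on the reinstalled node
  service glusterd stop

  # put the old UUID back (edit the UUID= line)
  vi /var/lib/glusterd/glusterd.info        # UUID=<OLD-UUID>

  # restore the volume-id extended attribute on this node's bricks
  setfattr -n trusted.glusterfs.volume-id \
      -v 0x59049b529e254cc9bebdfb3587948900 /mnt/raid1/test1
  setfattr -n trusted.glusterfs.volume-id \
      -v 0x59049b529e254cc9bebdfb3587948900 /mnt/raid2/test1

  # start glusterd again and let self-heal catch up
  service glusterd start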

However, things still look a bit odd: I don't get consistent output from gluster volume status on the three servers. gluster volume info test1 gives me the same output everywhere, but the output of gluster volume status is different on each host:

[root@dc1-09 glusterd]# gluster volume status test1
Status of volume: test1
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick dc1-09:/mnt/raid1/test1                   49154   Y       10496
Brick dc2-09:/mnt/raid2/test1                   49152   Y       7574
Brick dc2-09:/mnt/raid1/test1                   49153   Y       7581
Brick dc1-09:/mnt/raid2/test1                   49155   Y       10502
NFS Server on localhost                         2049    Y       1039
Self-heal Daemon on localhost                   N/A     Y       1046
NFS Server on dc2-09                            2049    Y       12397
Self-heal Daemon on dc2-09                      N/A     Y       12444

There are no active volume tasks

[root@dc2-10 /]# gluster volume status test1
Status of volume: test1
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick dc2-09:/mnt/raid2/test1                   49152   Y       7574
Brick dc2-09:/mnt/raid1/test1                   49153   Y       7581
Brick dc2-10:/mnt/raid2/test1                   49152   Y       9037
Brick dc2-10:/mnt/raid1/test1                   49153   Y       9049
NFS Server on localhost                         2049    Y       14266
Self-heal Daemon on localhost                   N/A     Y       14281
NFS Server on 172.16.1.21                       2049    Y       12397
Self-heal Daemon on 172.16.1.21                 N/A     Y       12444

There are no active volume tasks

[root@dc2-09 mnt]# gluster volume status test1
Status of volume: test1
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick dc1-09:/mnt/raid1/test1                   49154   Y       10496
Brick dc2-09:/mnt/raid2/test1                   49152   Y       7574
Brick dc2-09:/mnt/raid1/test1                   49153   Y       7581
Brick dc2-10:/mnt/raid2/test1                   49152   Y       9037
Brick dc2-10:/mnt/raid1/test1                   49153   Y       9049
Brick dc1-09:/mnt/raid2/test1                   49155   Y       10502
NFS Server on localhost                         2049    Y       12397
Self-heal Daemon on localhost                   N/A     Y       12444
NFS Server on dc2-10                            2049    Y       14266
Self-heal Daemon on dc2-10                      N/A     Y       14281
NFS Server on dc1-09                            2049    Y       1039
Self-heal Daemon on dc1-09                      N/A     Y       1046

There are no active volume tasks

Why would the output of status be different on the three hosts? Is this normal, or is there still something wrong? If so, how do I fix this?

Krist

--
krist.vanbesien at gmail.com
krist at vanbesien.org
Bern, Switzerland