Joe Julian
2016-Oct-05 21:46 UTC
[Gluster-users] Remove a brick, rebuild it, put it back in
What I always do is just shut it down, repair (or replace) the brick, then start it up again with "... start $volname force".

On October 5, 2016 11:27:36 PM GMT+02:00, Sergei Gerasenko <sgerasenko74 at gmail.com> wrote:
> Hi, sorry if this has been asked before, but the documentation is a bit
> conflicting in various sources on what to do exactly.
>
> I have a 6-node, distributed replicated cluster with a replica factor of 2,
> so it's 3 pairs of servers. I need to remove a server from one of those
> replica sets, rebuild it and put it back in.
>
> What's the tried and proven sequence of steps for this? Any pointers would
> be very useful.
>
> Thanks!
> Sergei
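As a rough illustration of the last part of that advice, the force-start could be wrapped as below. This is only a sketch: the message elides the beginning of the command, so the `gluster volume` prefix is an assumption based on the usual Gluster CLI form, and the function name `start_volume_force` and the `DRY_RUN` switch are made up for this example. By default it only prints the command.

```shell
#!/bin/sh
# Sketch only: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Force-start a volume after the brick has been repaired or replaced.
# Assumption: the elided "..." in the message stands for "gluster volume".
start_volume_force() {
    volname="$1"
    if [ -z "$volname" ]; then
        echo "usage: start_volume_force VOLNAME" >&2
        return 1
    fi
    if [ "$DRY_RUN" = "1" ]; then
        # Dry run: show what would be executed.
        printf 'gluster volume start %s force\n' "$volname"
    else
        gluster volume start "$volname" force
    fi
}

# Example (hypothetical volume name):
#   start_volume_force gluster_volume
```

How the brick is shut down and repaired beforehand is not spelled out in the message, so it is left out of the sketch.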
Sergei Gerasenko
2016-Oct-07 03:25 UTC
[Gluster-users] Remove a brick, rebuild it, put it back in
I've simulated the problem on 4 VMs in a distributed replicated setup with a replica factor of 2. In each of my tests I tore down a VM and brought it back up from a snapshot. What has worked so far is this:

1. Make a copy of /var/lib/glusterd from the affected machine and save it elsewhere.
2. Configure your new machine (in my case I reverted to a VM snapshot). Assign the same IP and hostname!
3. Install gluster.
4. Stop the daemons if they are running.
5. Nuke the /var/lib/glusterd directory and replace it with the copy saved in step 1.
6. Create the brick directory.
7. Get the volume-id extended attribute from a healthy node like so:
       getfattr -e base64 -n trusted.glusterfs.volume-id /data/brick_dir
8. Apply the volume-id extended attribute to the new brick directory like so:
       setfattr -n trusted.glusterfs.volume-id -v 'the_value_you_got_in_7==' /data/brick_dir
9. Start the daemons.
10. FUSE mount the gluster volume through the daemons running locally. So /etc/fstab would contain something like:
       localhost:/gluster_volume /mnt/gluster glusterfs _netdev,defaults 0 0
11. On the healthy partner machine, which has another FUSE mount point for the same volume, do something like:
       find /mnt/fuse | xargs stat
12. Step 8 will make files appear under the mount point on the new box, but the files are not going to be physically in the brick directory yet. See 10.
13. Run the heal command from the same host where you ran find. That will finally sync the files to the brick. Run the heal info command periodically; the number of files being healed should eventually go down to 0.

That's my experience with the VMs today.
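The attribute copy (steps 7 and 8), the stat pass (step 11) and the heal (step 13) could be scripted roughly as follows. This is a sketch under stated assumptions, not the exact commands used in the tests above: the helper names and the `DRY_RUN` switch are made up for illustration, the message only says "the heal command" and "the heal info command" so the `gluster volume heal` spelling is assumed, and `find -exec` is swapped in for `find | xargs stat` so that file names with spaces are handled. By default the helpers only print what they would run.

```shell
#!/bin/sh
# Sketch of steps 7, 8, 11 and 13. Dry-run by default: commands are
# printed instead of executed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
ATTR="trusted.glusterfs.volume-id"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 7 (healthy node): read getfattr output on stdin and print only
# the attribute value. With -e base64 the value carries a "0s" prefix,
# which setfattr -v understands as base64.
extract_volume_id() {
    sed -n 's/^trusted\.glusterfs\.volume-id=//p'
}

# Step 8 (rebuilt node): stamp the new brick directory with that value.
apply_volume_id() {
    run setfattr -n "$ATTR" -v "$1" "$2"
}

# Step 11 (healthy partner): stat every file through the FUSE mount.
stat_all() {
    run find "$1" -exec stat {} +
}

# Step 13: start a heal, then show what is still pending.
# Assumption: the thread does not spell out these subcommands.
heal_volume() {
    run gluster volume heal "$1"
    run gluster volume heal "$1" info
}

# Example (hypothetical paths and volume name):
#   on the healthy node:
#     getfattr -e base64 -n "$ATTR" /data/brick_dir 2>/dev/null | extract_volume_id
#   on the rebuilt node, with the value printed above:
#     apply_volume_id "$id" /data/brick_dir
#   on the healthy partner:
#     stat_all /mnt/fuse
#     heal_volume gluster_volume
```

Keeping the helpers in dry-run mode first makes it possible to read the exact commands before anything touches a brick; setting `DRY_RUN=0` executes them.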