Amar Tumballi
2012-Oct-11 18:35 UTC
[Gluster-users] 'replace-brick' - why we plan to deprecate
Hi All,

When we initially came up with the specs of 'glusterd', we needed an option to replace a dead brick, and a few people even requested an option to migrate the data off the brick being replaced. The result of this is the 'gluster volume replace-brick' CLI, and in releases up to 3.3.0 this was the only way to properly 'migrate' data off a removed brick.

Now, with 3.3.0+ (i.e., in upstream too), we have another *better* approach (technically), achieved by the methods below:

=======

1) Distribute volume:

earlier:

#gluster volume replace-brick <VOL> brick1 brick2 start [1]

alternative/now:

#gluster volume add-brick <VOL> brick2
#gluster volume remove-brick <VOL> brick1 start
(the above does a rebalance, which is now intelligent enough to understand that all data in brick1 should be moved to brick2)

2) (Distributed-)Replicate volume:

earlier:

#gluster volume replace-brick <VOL> brick1 brick2 start [1]

now:

#gluster volume replace-brick <VOL> brick1 brick2 commit force
(the self-heal daemon takes care of syncing data from one brick to the other)

3) (Distributed-)Stripe volume:

earlier:

#gluster volume replace-brick <VOL> brick1 brick2 start [1]
(this would have caused brick2 to consume much more space than brick1, as it would have filled up the holes with 0s)

now, if one needs data migration:

# gluster volume add-brick <VOL> brickN ... brickN+M (M == the stripe count)
# gluster volume remove-brick <VOL> brick1 ... brickM start [1]
(all bricks of the stripe subvolume which contains the brick to be removed)

But as we recommend stripe volumes for scratch data only, I personally recommend using the volume type below instead, if stripe is absolutely necessary:

4) (Distributed-)Stripe-Replicate volume:

earlier:

I never tried it (the volume type is new), but the semantics are generally the same:

#gluster volume replace-brick <VOL> brick1 brick2 start [1]

now:

# gluster volume replace-brick <VOL> brick1 brick2 commit force
(the self-heal daemon heals the data)

===============

Let me know if anyone has objections to discarding replace-brick data migration.

Regards,
Amar

[1] - Checking status and doing a 'commit' after 'start' is part of both the replace-brick and remove-brick CLIs, to finish the task completely, with data migration. A rough end-to-end sequence is sketched below.
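To make footnote [1] concrete, a minimal sketch of the full sequences (assuming the 3.3 CLI; <VOL>, the server names and the brick paths are placeholders):

case 1 (distribute), add-brick/remove-brick with migration:

# gluster volume add-brick <VOL> server2:/export/brick2
# gluster volume remove-brick <VOL> server1:/export/brick1 start
# gluster volume remove-brick <VOL> server1:/export/brick1 status
(repeat 'status' until the migration for that brick shows completed)
# gluster volume remove-brick <VOL> server1:/export/brick1 commit

cases 2 and 4 (replicate / stripe-replicate), replace-brick with self-heal:

# gluster volume replace-brick <VOL> server1:/export/brick1 server2:/export/brick2 commit force
# gluster volume heal <VOL> full
(optional: triggers a full crawl instead of waiting for the self-heal daemon's periodic crawl)
# gluster volume heal <VOL> info
(shows the entries still pending heal on each brick)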
Hi Amar,

I ran into a problem when replacing a brick in a stripe-replicate volume. I tried both methods you mentioned in your post:

#gluster volume replace-brick <VOL> brick1 brick2 start [1]
# gluster volume replace-brick <VOL> brick1 brick2 commit force
(self-heal daemon heals the data)

In both cases the message says that brick1 is not in the volume. I checked the volume info, and it really does exist there. Any idea or advice?

Thanks,
Jeff
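(One thing worth double-checking here, purely a guess: the brick argument has to match the 'hostname:/path' string exactly as 'gluster volume info' prints it. The server name and paths below are placeholders.)

# gluster volume info <VOL>
(note the exact Brick entries, e.g. 'Brick1: server1:/export/brick1')
# gluster volume replace-brick <VOL> server1:/export/brick1 server1:/export/brick1_new commit force
(the old brick must match one of the Brick entries character for character)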
Joe Julian
2012-Oct-22 02:55 UTC
[Gluster-users] 'replace-brick' - why we plan to deprecate
You may have noticed that I just wrote a blog article completely to the contrary.

I recently tried replacing a server using self-heal. It brought the entire pool to a dead crawl, and I had to pull the server offline until after hours. My VMs all ended up read-only from lack of response. The replace-brick operation, however, went quite smoothly.

When admins choose to have redundancy through replication, they would be in a high-risk state while waiting for the self-heal to finish. This is not acceptable to a lot of us. I'd much rather have a working system up while the data's migrating to the new one. A self-heal...full can take months to crawl on some systems, and there's no indication that the self-heal daemon has completed its crawl.

I haven't checked the proposed process. Does the self-heal...info show all the files that suddenly need to be healed immediately after the replace-brick...commit force?

A rebalance frequently takes weeks or longer for many systems, according to frequent reports on IRC. At least once every couple of weeks, someone comes into the channel telling us how it failed to complete (of course, there's never enough information to file bug reports on those complaints :/). I haven't really done any rebalancing myself.

On 10/11/2012 11:35 AM, Amar Tumballi wrote:
> Let me know if anyone has objections with discarding replace-brick
> data migration.
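(For reference, the heal-related checks being discussed are roughly the following, assuming the 3.3 CLI; <VOL> and the brick paths are placeholders. Whether the pending-heal list is fully populated immediately after the 'commit force' is exactly the open question above.)

# gluster volume replace-brick <VOL> server1:/export/brick1 server2:/export/brick2 commit force
(brick2 starts out empty; the self-heal daemon is expected to repopulate it)
# gluster volume heal <VOL> info
(entries currently pending heal, per brick)
# gluster volume heal <VOL> info healed
# gluster volume heal <VOL> info heal-failed
(history of completed and failed heals, useful for judging crawl progress)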