Dan Bretherton
2011-Feb-17 09:57 UTC
[Gluster-users] "remove-brick" command SHOULD migrate data
> Date: Wed, 16 Feb 2011 12:40:35 -0500
> From: "William L. Sebok" <wls at astro.umd.edu>
> Subject: [Gluster-users] "remove-brick" command SHOULD migrate data
> To: Rahul C S <rahul at gluster.com>
> Cc: Mark Wolfire <mwolfire at astro.umd.edu>, gluster-users at gluster.org,
>     Kwang-Ho Park <kpark at astro.umd.edu>, "Derek C. Richardson"
>     <dcr at astro.umd.edu>, Randall Perrine <perrine at astro.umd.edu>
> Message-ID: <20110216174035.GA24390 at earth.astro.umd.edu>
> Content-Type: text/plain; charset=us-ascii
>
> On Tue, Feb 15, 2011 at 11:49:06PM -0600, Rahul C S wrote:
>> For the last question,
>> "remove-brick" command does not migrate data, the data in that brick cannot
>> be accessed from the client unlike "replace-brick" which actually migrates
>> data from one brick to another.
> I strongly suggest for an enhancement a version of remove-brick that actually
> does migrate the data. This would be *extremely* useful in dealing with a
> distributed/replicated filesystem with a computer and/or brick that is dead or
> likely to be down for an extended period (I configure bricks to be replicated
> between different computers). The remove-brick command on startup
> could make an estimate whether the data would fit on the remaining bricks.
> If after the migration started it turned out that the data really does not all
> fit there still would not be any loss as long as the last file movement
> wasn't completely committed. The command then could be aborted.
>
> This would be no different in concept and risks (and usefulness) than reducing
> the size of a partition with gparted. I would make the remove-brick command
> only remove a brick without migrating it if there were some "force" option in
> effect. I have trouble seeing why one would otherwise want to use the
> remove-brick command and throw away the data except in some dire emergency.
>
> Bill Sebok      Computer Software Manager, Univ. of Maryland, Astronomy
> Internet: wls at astro.umd.edu      URL: http://furo.astro.umd.edu/

Hello All,

I would like to add my support to this feature request. I assumed that
remove-brick did migrate data at first, until I looked into it more
carefully. Unless I have got the wrong end of the stick, the only way to
shrink a distributed or distributed-replicated volume at the moment is to
perform the following steps.

1) Tell the users to stop using the volume, even though they will still be
   able to mount it and write to it.
2) Remove the brick (and its mirror if appropriate) with remove-brick.
3) Remove all the link files from the backend filesystem of the brick that
   has just been removed, using
   "find /brick/path -size 0b -perm 1000 -exec /bin/rm -v {} \;" or similar.
4) Copy the files from the backend filesystem to the mount point of the
   volume.
5) Tell the users it is safe to carry on using the volume again.

That would be quite risky in my department because the volumes are
auto-mounted on many different clients, and even if everyone has bothered to
read my email telling them to stop using the volume they might accidentally
leave a process running that is writing files to it. If some of those files
have the same names as files that were on the removed brick, we could end up
with a situation where the new files from the running process are
overwritten by old versions being copied from the removed brick. To avoid
this scenario, and other potential disasters I haven't thought of yet, I
would have to do the following to safely shrink a volume.

1) Take the volume off line and then delete it.
2) Create a new volume with a temporary name, containing all the bricks from
   the original volume except the brick (and its mirror if appropriate) I
   want to remove.
3) Remove the link files from the backend filesystem of the brick that has
   just been removed.
4) Copy the files from the backend filesystem to the mount point of the
   temporary volume.
5) Delete the temporary volume, and re-create it using the original name so
   it can be auto-mounted.
6) Put the volume on line again.

If there is an easier way of safely shrinking a distributed or
distributed-replicated volume please let me know.

Regards,
-Dan.
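
For reference, the link-file cleanup and copy-back steps (3 and 4 in both
lists above) could be sketched roughly as follows. This is only an
illustration under stated assumptions, not a tested Gluster procedure: the
function names are hypothetical, link files are taken to be zero-length
files with mode exactly 1000 as in the find command above, and the copy
deliberately skips any file that already exists at the destination so that
newer data written by a stray process is not overwritten by the old copy
from the removed brick.

```shell
#!/bin/sh
# Sketch only. Function names and behaviour are illustrative assumptions,
# not part of Gluster. File names containing newlines are not handled.

# List link files on a removed brick: zero-length regular files whose
# mode is exactly 1000 (sticky bit only). "-type f" is an added safeguard.
list_link_files() {
    find "$1" -type f -size 0 -perm 1000 -print
}

# Remove the link files, printing each path before it is deleted.
# Run list_link_files first and check its output before using this.
remove_link_files() {
    find "$1" -type f -size 0 -perm 1000 -print -exec rm -f {} \;
}

# Copy every regular file from the brick's backend directory ($1) to the
# volume mount point ($2), preserving relative paths. A file that already
# exists at the destination is skipped and reported, never overwritten.
copy_without_clobber() {
    src=$1
    dst=$2
    ( cd "$src" && find . -type f -print ) | while IFS= read -r rel; do
        if [ -e "$dst/$rel" ]; then
            echo "skipped (already exists): $rel" >&2
        else
            mkdir -p "$dst/$(dirname "$rel")"
            cp -p "$src/$rel" "$dst/$rel"
        fi
    done
}
```

With the functions loaded, a run might look like
"list_link_files /brick/path", then "remove_link_files /brick/path", then
"copy_without_clobber /brick/path /mnt/volume", where /mnt/volume is a
placeholder for the mount point of the volume. Any paths reported as skipped
would need to be reconciled by hand.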