Hi Toby,
----- Original Message -----
> Hi,
> I'm getting some confusing "Incorrect brick" errors when attempting to
> remove OR replace a brick.
>
> gluster> volume info condor
>
> Volume Name: condor
> Type: Replicate
> Volume ID: 9fef3f76-525f-4bfe-9755-151e0d8279fd
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: mel-storage01:/srv/brick/condor
> Brick2: mel-storage02:/srv/brick/condor
>
> gluster> volume remove-brick condor replica 1
> mel-storage02:/srv/brick/condor start
> Incorrect brick mel-storage02:/srv/brick/condor for volume condor
>
>
> If that is the incorrect brick, then what have I done wrong?
I agree that the error message displayed is far from helpful. Your attempt to
remove a brick from a 1X2 replicate volume failed because it is not a 'legal'
operation.
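
As an aside, the "1X2" layout can be read straight off the "volume info"
output quoted above. Below is a minimal Python sketch of that, assuming the
"Type:" and "Number of Bricks: A x B = N" lines look exactly like the ones in
your mail. parse_volume_info is a made-up helper name, not part of gluster,
and the "distribute"/"replica" keys are just my labels for A and B in a
replicate-type volume.

```python
import re


def parse_volume_info(text):
    """Extract Type and the 'A x B = N' brick layout from volume info text."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Type:"):
            info["type"] = line.split(":", 1)[1].strip()
        # Only the 'A x B = N' form shown in the quoted output is handled.
        m = re.match(r"Number of Bricks:\s*(\d+)\s*x\s*(\d+)\s*=\s*(\d+)$", line)
        if m:
            info["distribute"], info["replica"], info["total"] = map(int, m.groups())
    return info


# Output copied from the quoted "volume info condor".
sample = """Volume Name: condor
Type: Replicate
Volume ID: 9fef3f76-525f-4bfe-9755-151e0d8279fd
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
"""

print(parse_volume_info(sample))
# {'type': 'Replicate', 'distribute': 1, 'replica': 2, 'total': 2}
```

A "distribute" value of 1 together with Type "Replicate" is what I mean by a
pure replicate (1XN) volume below.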
Here are some rules and background, currently implicit, on how to determine
whether a remove-brick operation is allowed. Some may seem debatable, but that
is how things are today. We could refine them and evolve a better set of rules
via discussions on the mailing lists.
1) The remove-brick 'start' variant is applicable *only* when you have a dht
(or distribute) type volume. In 3.3, you can identify that by observing the
output of "gluster volume info <VOLNAME>": the "Type" field would display
"Distribute-<something>". Additionally, even in a Distribute type volume, which
includes Distribute-Replicate, Distribute-Stripe and other combinations, all
the bricks belonging to a subvolume need to be removed in one go.

For example, let's assume a 2X2 volume V1 with bricks b1, b2, b3, b4, such
that b1,b2 form one pair and b3,b4 form the other. If you wanted to use the
remove-brick start variant, say for scaling down the volume, you would do the
following:

#gluster volume remove-brick V1 b3 b4 start
#gluster volume remove-brick V1 b3 b4 status

Once the remove-brick operation is completed,

#gluster volume remove-brick V1 b3 b4 commit

This would leave volume V1 with bricks b1,b2. In the above workflow, the data
residing in b3,b4 is migrated to b1,b2.
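
The "whole subvolume in one go" condition can be sketched in a few lines of
Python. This only illustrates the rule as described in this mail; it is not
how glusterd validates the command. The function names are made up for the
example, and it assumes that consecutive bricks in the volume's brick list
form a subvolume (b1,b2 then b3,b4, as in V1 above).

```python
def group_subvolumes(bricks, group_size):
    """Split the ordered brick list into subvolumes of group_size bricks.

    Assumption: consecutive bricks form one subvolume (b1,b2 then b3,b4).
    """
    if group_size < 1 or len(bricks) % group_size != 0:
        raise ValueError("brick count must be a multiple of the group size")
    return [bricks[i:i + group_size] for i in range(0, len(bricks), group_size)]


def removes_whole_subvolumes(bricks, group_size, to_remove):
    """Return True if to_remove consists of complete subvolumes only."""
    requested = set(to_remove)
    if not requested or not requested <= set(bricks):
        return False  # empty request, or a brick that is not in the volume
    for subvol in group_subvolumes(bricks, group_size):
        hit = requested & set(subvol)
        if hit and hit != set(subvol):
            return False  # only part of a subvolume was named
    return True


v1 = ["b1", "b2", "b3", "b4"]  # the 2X2 volume V1 from the example
print(removes_whole_subvolumes(v1, 2, ["b3", "b4"]))  # True
print(removes_whole_subvolumes(v1, 2, ["b2", "b3"]))  # False
```

The sketch checks nothing beyond this one condition; for instance, it does not
stop you from naming every subvolume of the volume.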
2) remove-brick (without the 'start' subcommand) can be used to reduce the
replica count down to 2 in a Distribute-Replicate type volume. As of today,
remove-brick doesn't permit reducing the replica count in a pure replicate
volume, i.e. 1XN, where N >= 2.
Note: There is some activity around evolving the 'right' rule. See
http://review.gluster.com/#/c/5364/
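
To make the two rules easier to apply, here is a rough Python sketch that
encodes them as stated in this mail. It is my reading of the rules, not the
actual validation code in gluster, and remove_brick_allowed and its
parameters are hypothetical names. The volume type is matched by the
"Distribute" prefix of the "Type" field, as described in rule 1.

```python
def remove_brick_allowed(volume_type, replica_count, use_start,
                         new_replica_count=None):
    """Return (allowed, reason) per rules 1) and 2) of this mail.

    allowed is True, False, or None when the two rules do not cover the case.
    Hypothetical helper for illustration, not gluster code.
    """
    is_distribute = volume_type.startswith("Distribute")
    if use_start:
        # Rule 1: the 'start' variant needs a distribute type volume.
        if not is_distribute:
            return False, "'start' needs a distribute type volume"
        return True, "ok, provided whole subvolumes are named"
    if new_replica_count is not None and new_replica_count < replica_count:
        # Rule 2: replica reduction only in Distribute-Replicate, down to 2.
        if not (is_distribute and "Replicate" in volume_type):
            return False, "replica count can only be reduced in Distribute-Replicate"
        if new_replica_count < 2:
            return False, "replica count can only be reduced down to 2"
        return True, "ok"
    return None, "case not covered by the two rules"


# The command from the original mail: 1X2 Replicate, 'replica 1 ... start'.
print(remove_brick_allowed("Replicate", 2, use_start=True, new_replica_count=1))
# The same request without 'start' is rejected by rule 2.
print(remove_brick_allowed("Replicate", 2, use_start=False, new_replica_count=1))
# Going from replica 3 to 2 in a Distribute-Replicate volume is permitted.
print(remove_brick_allowed("Distribute-Replicate", 3, use_start=False,
                           new_replica_count=2))
```

Both variants of the request on the 'condor' volume come out as not allowed,
which matches what you saw, even if the real error message does not say why.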
The above rules have evolved from the idea that no legal command should allow
the user to shoot herself in the foot without a 'repair' path. Put differently,
we disallow commands that might lead to data loss without the user being fully
aware of it.
Hope that helps,
krish
>
>
> thanks,
> Toby
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>