Joe,
Thanks for that, it was educational. The Gluster docs claim that since 3.7, DHT
hash ranges are weighted by brick size by default:
$ gluster volume get <vol> cluster.weighted-rebalance
Option                                  Value
------                                  -----
cluster.weighted-rebalance              on
When running rebalance with force, I see this in the rebalance log:
...
[2016-10-11 16:38:37.655144] I [MSGID: 109045]
[dht-selfheal.c:1751:dht_fix_layout_of_directory] 0-cronut-dht: subvolume 10
(cronut-replicate-10): 5721127 chunks
[2016-10-11 16:38:37.655154] I [MSGID: 109045]
[dht-selfheal.c:1751:dht_fix_layout_of_directory] 0-cronut-dht: subvolume 11
(cronut-replicate-11): 7628846 chunks
...
Subvolumes >= 11 are 8TB; subvolumes <= 10 are 6TB.
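For what it's worth, those two chunk counts do look consistent with size-based
weighting. A quick sanity check (just my own arithmetic, using the chunk counts
from the log above and our brick sizes):

    # Are the layout chunk counts proportional to brick size?
    # Chunk counts from the rebalance log above; sizes are our brick sizes (TB).
    chunks_6tb = 5_721_127   # cronut-replicate-10 (6 TB brick)
    chunks_8tb = 7_628_846   # cronut-replicate-11 (8 TB brick)

    print(chunks_6tb / chunks_8tb)  # ~0.75
    print(6 / 8)                    # 0.75 -- matches the 6TB:8TB size ratio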
Do you think it is possible to even out usage across all bricks by % utilized now?
That should be the outcome if Gluster rebalanced purely according to the weighted
DHT layout and carried out all the required data migrations.
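To make concrete what I mean by following the weighted layout, here is a toy
model (my own illustration, not GlusterFS code; the ~36 TB figure is roughly the
total used across the 16 bricks in the df output quoted below):

    # Toy model of a size-weighted DHT layout (illustration only, not GlusterFS code).
    # 11 x 6TB replica pairs plus 5 x 8TB replica pairs, as in our volume.
    sizes_tb = [6] * 11 + [8] * 5
    total = sum(sizes_tb)

    # Share of the hash space (and hence of data) each subvolume would get.
    shares = [s / total for s in sizes_tb]

    # If ~36 TB of data were placed strictly by these weights, every brick
    # would sit at the same % utilization regardless of its size.
    data_tb = 36.0
    for size, share in zip(sizes_tb, shares):
        used = data_tb * share
        print(f"{size} TB brick: {used:.2f} TB used, {used / size:.1%} full")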
We would prefer not to depend on cluster.min-free-disk to manage overflow later
on, since that introduces one extra read of the linkfile before the actual IOP.
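For context on that extra read: when a file does not live on its hashed
subvolume, DHT leaves a linkfile there, so a lookup pays one extra hop before the
real operation. A toy sketch of that lookup path (my own illustration with made-up
names, not Gluster internals):

    # Toy illustration of the DHT linkfile cost (not GlusterFS internals).
    # hashed_subvol: where the filename hashes to; actual_subvol: where the data lives.
    def lookup(filename, hashed_subvol, actual_subvol):
        ops = 1  # first lookup goes to the hashed subvolume
        if hashed_subvol != actual_subvol:
            # The hashed subvolume only holds a linkfile pointing at the real
            # location, so we pay a second lookup before the actual IOP starts.
            ops += 1
        return ops

    print(lookup("a.img", "replicate-3", "replicate-3"))   # 1 -- file is where the hash says
    print(lookup("b.img", "replicate-3", "replicate-14"))  # 2 -- file overflowed elsewhere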
Thanks,
Jackie
> On Oct 10, 2016, at 11:13 AM, Joe Julian <joe at julianfamily.org>
wrote:
>
> I've written an example of how gluster's dht works on my blog at
> https://joejulian.name/blog/dht-misses-are-expensive/ which might make
> it clear why the end result is not what you expected.
>
> By setting cluster.min-free-disk (defaults to 10%) you can, at least,
> ensure that your new bricks are utilized as needed to prevent overfilling
> your smaller bricks.
> On 10/10/2016 10:13 AM, Jackie Tung wrote:
>> Hi,
>>
>> We have a 2-node, distributed replicated setup (11 bricks on each
>> node). Each of these bricks is 6TB in size.
>>
>> node_A:/brick1 replicates node_B:/brick1
>> node_A:/brick2 replicates node_B:/brick2
>> node_A:/brick3 replicates node_B:/brick3
>> ...
>> node_A:/brick11 replicates node_B:/brick11
>>
>> We recently added 5 more bricks to make it 16 bricks on each node in
>> total. Each of these new bricks is 8TB in size.
>>
>> We completed a full rebalance operation (status says "completed").
>>
>> However the end result is somewhat unexpected:
>> /dev/sdl1 7.3T 2.2T 5.2T 29%
>> /dev/sdk1 7.3T 2.0T 5.3T 28%
>> /dev/sdj1 7.3T 2.0T 5.3T 28%
>> /dev/sdn1 7.3T 2.2T 5.2T 30%
>> /dev/sdp1 7.3T 2.2T 5.2T 30%
>> /dev/sdc1 5.5T 2.3T 3.2T 42%
>> /dev/sdf1 5.5T 2.3T 3.2T 43%
>> /dev/sdo1 5.5T 2.3T 3.2T 42%
>> /dev/sda1 5.5T 2.3T 3.2T 43%
>> /dev/sdi1 5.5T 2.3T 3.2T 42%
>> /dev/sdh1 5.5T 2.3T 3.2T 43%
>> /dev/sde1 5.5T 2.3T 3.2T 42%
>> /dev/sdb1 5.5T 2.3T 3.2T 42%
>> /dev/sdm1 5.5T 2.3T 3.2T 42%
>> /dev/sdg1 5.5T 2.3T 3.2T 42%
>> /dev/sdd1 5.5T 2.3T 3.2T 42%
>>
>> The 7.3T entries in the df output above are the new 8TB drives.
>> Was I wrong to expect the % usage to be roughly equal? Is there some
>> parameter I need to tweak to make rebalance account for disk sizes properly?
>>
>> I'm using Gluster 3.8 on Ubuntu.
>>
>> Thanks,
>> Jackie
>>