Ravishankar N
2018-Nov-23 03:58 UTC
[Gluster-users] Gluster distributed replicated setup does not serve read from all bricks belonging to the same replica

On 11/22/2018 07:07 PM, Anh Vo wrote:
> Thanks Ravi, I will try that option.
> One question:
> Let's say there are self-heals pending, how would the default of "0" have
> worked? I understand 0 means "first responder". What if the first responder
> doesn't have a good copy? (And it failed in such a way that the dirty
> attribute wasn't set on its copy - but there are index heals pending from
> the other two sources.)

0 = first readable child of AFR, starting from the 1st child. So if the 1st
brick doesn't have the good copy, it will try the 2nd brick and so on.
The default value seems to be '1', not '0'. You can look at
afr_read_subvol_select_by_policy() in the source code to understand the
order of preference in the selection.

Regards,
Ravi

> On Wed, Nov 21, 2018 at 9:57 PM Ravishankar N <ravishankar at redhat.com>
> wrote:
>
> Hi,
> If there are multiple clients, you can change the
> 'cluster.read-hash-mode' volume option's value to 2. Then different
> reads should be served from different bricks for different clients.
> The meaning of the various values for 'cluster.read-hash-mode' can be
> obtained from `gluster volume set help`. gluster-4.1 has also added a
> new value [1] to this option. Of course, the assumption is that all
> bricks host good copies (i.e. there are no self-heals pending).
>
> Hope this helps,
> Ravi
>
> [1] https://review.gluster.org/#/c/glusterfs/+/19698/
>
> On 11/22/2018 10:20 AM, Anh Vo wrote:
>> Hi,
>> Our setup: We have a distributed replicated setup of 3 replicas. The
>> total number of servers varies between clusters; in some cases we have
>> a total of 36 (12 x 3) servers, in others we have 12 servers (4 x 3).
>> We're using gluster 3.12.15.
>>
>> In all instances, what I am noticing is that only one member of the
>> replica is serving reads for a particular file, even when all the
>> members of the replica set are online. We have many large input files
>> (for example, a 150GB zip file), and when there are 50 clients reading
>> from one single server, the read performance for that file degrades by
>> several orders of magnitude. Shouldn't all members of the replica
>> participate in serving the read requests?
>>
>> Our options:
>>
>> cluster.shd-max-threads: 1
>> cluster.heal-timeout: 900
>> network.inode-lru-limit: 50000
>> performance.md-cache-timeout: 600
>> performance.cache-invalidation: on
>> performance.stat-prefetch: on
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> cluster.metadata-self-heal: off
>> cluster.entry-self-heal: off
>> cluster.data-self-heal: off
>> features.inode-quota: off
>> features.quota: off
>> transport.listen-backlog: 100
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: on
>> performance.strict-o-direct: on
>> network.remote-dio: off
>> server.allow-insecure: on
>> performance.write-behind: off
>> cluster.nufa: disable
>> diagnostics.latency-measurement: on
>> diagnostics.count-fop-hits: on
>> cluster.ensure-durability: off
>> cluster.self-heal-window-size: 32
>> cluster.favorite-child-policy: mtime
>> performance.io-thread-count: 32
>> cluster.eager-lock: off
>> server.outstanding-rpc-limit: 128
>> cluster.rebal-throttle: aggressive
>> server.event-threads: 3
>> client.event-threads: 3
>> performance.cache-size: 6GB
>> cluster.readdir-optimize: on
>> storage.build-pgfid: on
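
For reference, a minimal sketch of the commands being discussed, assuming a
placeholder volume name 'myvol' (not taken from this thread):

    # 'myvol' is a placeholder volume name; substitute your own.
    # Show the current read policy and the documented meaning of its values.
    gluster volume get myvol cluster.read-hash-mode
    gluster volume set help

    # Hash reads over the bricks per client, as suggested above.
    gluster volume set myvol cluster.read-hash-mode 2

    # The suggestion assumes there are no pending self-heals; verify first.
    gluster volume heal myvol info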
Anh Vo
2018-Nov-24 07:33 UTC
[Gluster-users] Gluster distributed replicated setup does not serve read from all bricks belonging to the same replica

Looking at the source (afr-common.c), even in the case of using hashed mode,
if the hashed brick doesn't have a good copy it will try the next brick - am
I correct? I'm curious because your first reply seemed to place some
significance on the part about pending self-heals. Is there anything about
pending self-heals that would have made hashed mode worse, or is it about as
bad as any other brick-selection policy?

Thanks

On Thu, Nov 22, 2018 at 7:59 PM Ravishankar N <ravishankar at redhat.com> wrote:
>
> On 11/22/2018 07:07 PM, Anh Vo wrote:
>> Thanks Ravi, I will try that option.
>> One question:
>> Let's say there are self-heals pending, how would the default of "0" have
>> worked? I understand 0 means "first responder". What if the first
>> responder doesn't have a good copy? (And it failed in such a way that the
>> dirty attribute wasn't set on its copy - but there are index heals
>> pending from the other two sources.)
>
> 0 = first readable child of AFR, starting from the 1st child. So if the
> 1st brick doesn't have the good copy, it will try the 2nd brick and so on.
> The default value seems to be '1', not '0'. You can look at
> afr_read_subvol_select_by_policy() in the source code to understand the
> order of preference in the selection.
>
> Regards,
> Ravi
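
For anyone who wants to inspect the fallback behaviour being asked about, a
quick way to locate the selection logic in a glusterfs source tree; the
repository URL, tag, and file path below are the standard upstream ones,
assumed rather than quoted from this thread:

    # Fetch the source at the version mentioned in the thread and find the
    # policy-selection function; the path assumes a stock upstream checkout.
    git clone https://github.com/gluster/glusterfs.git
    cd glusterfs
    git checkout v3.12.15
    grep -n "afr_read_subvol_select_by_policy" xlators/cluster/afr/src/afr-common.c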