thr3ads.net - Gluster users - [Gluster-users] gluster heal performance [Sep 2020]

Gionatan Danti

2020-Sep-11 06:34 UTC

[Gluster-users] gluster heal performance

On 2020-09-11 05:27, Martin Bähr wrote:
> Excerpts from Gionatan Danti's message of 2020-09-11 00:35:52 +0200:
>> The main point was the potentially long heal time
> 
> could you (or anyone else) please elaborate on what long heal times are
> to be expected?
Hi, there are multiple factors at work here:
- healing via network (gluster) vs internal bus data transfer (RAID 
rebuild);
- gluster being a user-space application which commands a significant 
CPU load;
- healing proceeding per-file and not in LBA order (ie: it has to 
traverse all the affected files/dirs, which means scattered random IO 
for the most part);
- other things which I am surely missing.
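
A rough way to see why per-file healing is slower than an LBA-order rebuild is a back-of-the-envelope cost model. The sketch below is only illustrative: the function names and every number passed in (throughputs, per-file overhead, data size) are assumptions made up for the example, not measurements from any cluster.

```python
def sequential_rebuild_seconds(total_bytes, seq_bytes_per_sec):
    """Time for an LBA-order copy: limited only by sequential throughput."""
    return total_bytes / seq_bytes_per_sec


def per_file_heal_seconds(file_count, total_bytes, net_bytes_per_sec,
                          per_file_overhead_sec):
    """Time for a per-file heal: data transfer plus a fixed cost per file
    (lookups, locking, seeks), which dominates when files are small."""
    return total_bytes / net_bytes_per_sec + file_count * per_file_overhead_sec


TIB = 1024 ** 4
# Hypothetical inputs, chosen only to show the shape of the result.
rebuild = sequential_rebuild_seconds(1 * TIB, 150e6)
heal = per_file_heal_seconds(1_000_000, 1 * TIB, 100e6, 0.05)
print(f"rebuild: {rebuild / 3600:.1f} h, heal: {heal / 3600:.1f} h")
```

With many small files the per-file term grows with the file count and not with the data size, which is why a heal can take far longer than a rebuild of the same amount of data.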
> we have a 3-node replica cluster running version 3.12.9 (we are
> building a new cluster now) with 32TiB of space. each node has a
> single brick on top of a 7-disk raid5 (linux softraid)
3.12.9, while being the official RHEL 7 release, is very old now.
> at one point we had one node unavailable for one month (gluster failed
> to start up properly on that node and we didn't have monitoring in
> place to notice) and the accumulated changes of one month of operation
> took 4 months to heal. i would have expected this ideally to take 2
> weeks or less, one month at the worst (ie faster than or at least as
> fast as it took to create the data but not slower, and especially not
> 4 times slower)
Wow, 4 months is a lot... but at least you had internal redundancy 
(RAID5 bricks). The OP was asking about running with *no* internal 
redundancy, and this is why I advise against it: losing a disk and 
then needing weeks to heal is not good.
> the initial heal count was about 6million files for one node and
> 5.4million for the other.
> ...
> we do have a few huge directories with 250000, 88000, 60000 and 29000
> subdirectories each. in total 26TiB of small files, but no more than
> a few 1000 per directory. (it's user data, some have more, some have
> less)
> 
> could those huge directories be responsible for the slow healing?
The very high number of to-be-healed files surely has a negative impact 
on your heal speed.
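
For a sense of scale, the numbers quoted above imply a very low heal rate. A minimal sketch of the arithmetic, assuming 30-day months and that the roughly 6 million entries of one node were healed evenly over the whole 4 months (both are simplifications, and `files_per_second` is just a helper name for this example):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def files_per_second(file_count, days):
    """Average heal rate needed to process file_count entries in the given days."""
    return file_count / (days * SECONDS_PER_DAY)


# Figures from the thread: ~6 million entries, 4 months observed, 2 weeks hoped for.
observed = files_per_second(6_000_000, 4 * 30)
expected = files_per_second(6_000_000, 14)
print(f"observed: {observed:.2f} files/s, needed for 2 weeks: {expected:.2f} files/s")
# prints: observed: 0.58 files/s, needed for 2 weeks: 4.96 files/s
```

Under these assumptions the heal averaged well under one file per second, which is consistent with per-file overhead, and not raw bandwidth, being the bottleneck.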

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8

Martin Bähr

2020-Sep-11 09:12 UTC

[Gluster-users] gluster heal performance

Excerpts from Gionatan Danti's message of 2020-09-11 08:34:04 +0200:
> > we have a 3-node replica cluster running version 3.12.9 
> > with 32TiB of space. each node has a single brick on
> > top of a 7-disk raid5 (linux softraid)
> 3.12.9, while being the official RHEL 7 release, is very old now.
yes, i am aware. we didn't bother upgrading as we need to expand
capacity and it's cheaper to rent new servers than expand the old ones.
> > the accumulated changes of one month of operation took 4 months to
> > heal.
> Wow, 4 months is a lot... but you had at least internal redundancy 
> (RAID5 bricks). 
right, that, and we had 3 replicas. we could have just dropped the third
node, and would still have been ok.

for the new cluster we decided that 2 nodes is enough, because the data
is all backups anyways. even if we lose both nodes, we can at least in
theory still recover all the data. whether that's a good decision is a
risk calculation. is a third server worth the extra expense? we decided
that, for what is essentially a backup, it's not.

i considered going with 3 nodes and dropping the raid instead, but
several comments including yours convinced me that keeping the raid is
good. the new servers will each have 3 bricks, with 5 disks in a raid 5
per brick.
> > the initial heal count was about 6million files for one node and
> > 5.4million for the other.
> > ...
> > we do have a few huge directories with 250000, 88000, 60000 and 29000
> > subdirectories each. in total 26TiB of small files, but no more than
> > a few 1000 per directory. (it's user data, some have more, some have
> > less)
> > 
> > could those huge directories be responsible for the slow healing?
> 
> The very high number of to-be-healed files surely has a negative impact 
> on your heal speed.
that sounds like there is an inefficiency within the healing process
that causes the healing time to grow non-linearly with the number of
files.

greetings, martin.
--
general manager                                                    realss.com
student mentor                                                   fossasia.org
community mentor     blug.sh                                  beijinglug.club
pike programmer      pike.lysator.liu.se    caudium.net     societyserver.org
Martin Bähr          working in china        http://societyserver.org/mbaehr/
