On Sun, 21 Jun 2020 at 19:43, Gionatan Danti <g.danti at assyoma.it> wrote:

> For the RAID6/10 setup, I found no issues: simply replace the broken
> disk without involving Gluster at all. However, this also means facing
> the "iops wall" I described earlier for a single-brick node. Going
> full-Gluster with JBODs would be interesting from a performance
> standpoint, but this complicates eventual recovery from bad disks.
>
> Does someone use Gluster in JBOD mode? If so, can you share your
> experience?
> Thanks.

Hi,

we once used Gluster with disks in JBOD mode (3 servers, 4 x 10TB HDDs
each, 4 x 3 = 12 bricks), and to make it short: in our special case it
wasn't that funny. Big HDDs, lots of small files, (highly) concurrent
access through our application. It was running quite fine, until a disk
failed. The reset-disk took ~30 (!) days, as Gluster was
copying/restoring the data alongside the normal application reads and
writes. After the first reset had finished, a couple of days later
another disk died, and the fun started again :-) Maybe a bad use case.
(A sketch of the reset-brick flow this refers to follows after this
mail.)

With this experience, the next setup was: splitting the data into 2
chunks (high I/O, low I/O), 3 servers with 2 RAID10 arrays each (same
type of disk), each array used as a brick, resulting in replica 3:
1 x 3 = 3. Changing a failed disk now means a complete RAID resync, but
regarding I/O this is far better than using reset-disk on a plain HDD.
Only the regularly running RAID check was a bit of a performance issue.

The latest setup (for the high-I/O part) looks like this: 3 servers, 10
disks with 10TB each -> 5 RAID1 arrays per server, forming a
distribute-replicate volume with 5 bricks per server, 5 x 3 = 15. No
disk has failed so far (fingers crossed), but if a disk fails now,
Gluster keeps running with all bricks available, and after replacing
the failed disk there is one RAID resync running, affecting only 1/5 of
the volume. In theory that should be better ;-) The regularly running
RAID checks are no problem so far: of the 15 RAID1 arrays only 1 is
checked at a time, never in parallel.

Disclaimer: JBOD may work better with SSDs/NVMes - untested ;-)

Best regards,
Hubert
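For readers who haven't done this: the "reset-disk" mentioned above maps to
Gluster's reset-brick workflow. A minimal sketch of that flow, with volume
name, hostname, device and brick path made up for illustration (not taken
from the setup described above):

    # Take the failed brick offline (all names below are placeholders):
    gluster volume reset-brick myvol server1:/gluster/brick1/data start

    # Swap the physical disk, recreate the filesystem and mount it:
    mkfs.xfs /dev/sdX                      # sdX = the replacement disk
    mount /dev/sdX /gluster/brick1

    # Re-add the now-empty brick; the self-heal daemon then copies the
    # data back - the phase that took ~30 days in the JBOD case above:
    gluster volume reset-brick myvol server1:/gluster/brick1/data \
        server1:/gluster/brick1/data commit force

    # Healing progress can be watched with:
    gluster volume heal myvol info

The long tail comes from the last step: the whole brick has to be rebuilt
from the other replicas while normal application I/O keeps hitting the same
spindles.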
Hi Hubert,

keep in mind that RH recommends disks of 2-3 TB, not 10 TB. I guess that is
what changed the situation. For NVMe/SSD a RAID controller is pointless, so
JBOD makes the most sense.

Best Regards,
Strahil Nikolov

On 22 June 2020 7:58:56 GMT+03:00, Hu Bert <revirii at googlemail.com> wrote:
> On Sun, 21 Jun 2020 at 19:43, Gionatan Danti <g.danti at assyoma.it> wrote:
>
>> For the RAID6/10 setup, I found no issues: simply replace the broken
>> disk without involving Gluster at all. However, this also means facing
>> the "iops wall" I described earlier for a single-brick node. Going
>> full-Gluster with JBODs would be interesting from a performance
>> standpoint, but this complicates eventual recovery from bad disks.
>>
>> Does someone use Gluster in JBOD mode? If so, can you share your
>> experience?
>> Thanks.
>
> Hi,
> we once used Gluster with disks in JBOD mode (3 servers, 4 x 10TB HDDs
> each, 4 x 3 = 12 bricks), and to make it short: in our special case it
> wasn't that funny. Big HDDs, lots of small files, (highly) concurrent
> access through our application. It was running quite fine, until a disk
> failed. The reset-disk took ~30 (!) days, as Gluster was
> copying/restoring the data alongside the normal application reads and
> writes. After the first reset had finished, a couple of days later
> another disk died, and the fun started again :-) Maybe a bad use case.
>
> With this experience, the next setup was: splitting the data into 2
> chunks (high I/O, low I/O), 3 servers with 2 RAID10 arrays each (same
> type of disk), each array used as a brick, resulting in replica 3:
> 1 x 3 = 3. Changing a failed disk now means a complete RAID resync, but
> regarding I/O this is far better than using reset-disk on a plain HDD.
> Only the regularly running RAID check was a bit of a performance issue.
>
> The latest setup (for the high-I/O part) looks like this: 3 servers, 10
> disks with 10TB each -> 5 RAID1 arrays per server, forming a
> distribute-replicate volume with 5 bricks per server, 5 x 3 = 15. No
> disk has failed so far (fingers crossed), but if a disk fails now,
> Gluster keeps running with all bricks available, and after replacing
> the failed disk there is one RAID resync running, affecting only 1/5 of
> the volume. In theory that should be better ;-) The regularly running
> RAID checks are no problem so far: of the 15 RAID1 arrays only 1 is
> checked at a time, never in parallel.
>
> Disclaimer: JBOD may work better with SSDs/NVMes - untested ;-)
>
> Best regards,
> Hubert
On 2020-06-22 06:58, Hu Bert wrote:
> On Sun, 21 Jun 2020 at 19:43, Gionatan Danti <g.danti at assyoma.it> wrote:
>
>> For the RAID6/10 setup, I found no issues: simply replace the broken
>> disk without involving Gluster at all. However, this also means facing
>> the "iops wall" I described earlier for a single-brick node. Going
>> full-Gluster with JBODs would be interesting from a performance
>> standpoint, but this complicates eventual recovery from bad disks.
>>
>> Does someone use Gluster in JBOD mode? If so, can you share your
>> experience?
>> Thanks.
>
> Hi,
> we once used Gluster with disks in JBOD mode (3 servers, 4 x 10TB HDDs
> each, 4 x 3 = 12 bricks), and to make it short: in our special case it
> wasn't that funny. Big HDDs, lots of small files, (highly) concurrent
> access through our application. It was running quite fine, until a disk
> failed. The reset-disk took ~30 (!) days, as Gluster was
> copying/restoring the data alongside the normal application reads and
> writes. After the first reset had finished, a couple of days later
> another disk died, and the fun started again :-) Maybe a bad use case.

Hi Hubert,
this is the exact scenario which scares me if/when using JBOD. Maybe for
virtual machine disks (i.e. big files) it would be faster, but still...

> The latest setup (for the high-I/O part) looks like this: 3 servers, 10
> disks with 10TB each -> 5 RAID1 arrays per server, forming a
> distribute-replicate volume with 5 bricks per server, 5 x 3 = 15. No
> disk has failed so far (fingers crossed), but if a disk fails now,
> Gluster keeps running with all bricks available, and after replacing
> the failed disk there is one RAID resync running, affecting only 1/5 of
> the volume. In theory that should be better ;-) The regularly running
> RAID checks are no problem so far: of the 15 RAID1 arrays only 1 is
> checked at a time, never in parallel.

Ok, so you multiplied the number of bricks by using multiple RAID1
arrays. Good idea, it should work fine in this manner. (A command-line
sketch of such a layout follows after this mail.)

> Disclaimer: JBOD may work better with SSDs/NVMes - untested ;-)

Yeah, I think so!
Regards.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8
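For illustration only, a rough command-line sketch of the "one RAID1 array
per brick" layout discussed above (5 x 3 = 15 bricks). All device names,
hostnames, volume and brick paths are invented, not Hubert's actual
configuration:

    # One RAID1 array per pair of disks; repeat for brick2..brick5:
    mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    mkfs.xfs /dev/md10
    mkdir -p /gluster/brick1
    mount /dev/md10 /gluster/brick1

    # 5 bricks per server on 3 servers -> distribute-replicate 5 x 3.
    # Gluster groups each consecutive set of 3 bricks into one replica
    # set, so listing the bricks per index spreads every set across all
    # 3 servers:
    gluster volume create myvol replica 3 \
        server{1..3}:/gluster/brick1/data \
        server{1..3}:/gluster/brick2/data \
        server{1..3}:/gluster/brick3/data \
        server{1..3}:/gluster/brick4/data \
        server{1..3}:/gluster/brick5/data
    gluster volume start myvol

With a layout like this a single failed disk only degrades one RAID1 array,
and the subsequent resync touches just that brick's data, which matches the
"1/5 of the volume" point above.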