Alex Crow
2016-Nov-12 19:56 UTC
[Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks
> Sure, but thinking about it later we realised that it might be for the
> better. I believe when sharding is enabled the shards will be dispersed
> across all the replica sets, meaning that losing a replica set will kill
> all your VMs.
>
> Imagine a 16x3 volume for example: losing 2 bricks could bring the whole
> thing down if they happen to be in the same replica set. (I might be wrong
> about the way gluster disperses shards, it's my understanding only, never
> had the chance to test it.) With multiple small clusters we have the same
> disk space in the end but not that problem. It's a bit more annoying to
> manage, but for now that's alright.
>
>> I'm also subscribed to the moosefs and lizardfs mailing lists and I don't
>> recall a single data corruption/data loss event.
>
> Never used those, might be just because there are fewer users? Really have
> no idea, maybe you are right.

I can add to this. I've been using MooseFS for general file storage with
Samba for over a year now, for >25 million files shared to 350+ users. I've
*never* lost even a single file. We had some issues with permissions, but
that only needed a couple of lines added to our smb.conf (CTDB cluster).

On the other hand, at home I tried to use GlusterFS for VM images in a
simple replica 2 setup with Pacemaker for HA. VMs were constantly failing
en masse even without making any changes. Very often the images got
corrupted and had to be restored from backups. This was over a year ago,
but it motivated me to try the VMs on MooseFS. Since then I've not had a
single problem with unexpected downtime or corruption. It's not the fastest
FS in the world, but it's well balanced and has a focus on consistency and
reliability. The documentation clearly explains where all the chunks of
your files will be, so you can clearly define your resilience and recovery
strategies.

IMHO GlusterFS would be a great product if it tried to:

a) Add fewer features per release, and/or slow down the release cycle.
Maybe have a "Feature" branch like RozoFS, with separate Stable and
Testing/Current branches. Stable is safe, Testing is risky, and "Feature"
is for those that need to try new, well, features.
b) Concentrate on issues like split-brain, healing, and scaling online
without data loss. It seems to be a common theme on the list that healing
doesn't work without tinkering. It should really "just work".
c) Have a peek at BeeGFS. It's a very well-performing FS that has its
focus on HPC. You can't stand to lose many thousands of CPU-hours of
work if your FS goes down, and it has to be fast.

The biggest question for me is: what is the target market for GlusterFS?
Is it:

HPC (performance, reliability at large scale, i.e. loss of one file is
OK, loss of everything is not, no funky features)
VM storage (much the same as HPC, but large-file performance required,
no loss or corruption of blocks within a file)
General file storage (medium performance OK, small-file and random access
paramount, resilience and consistency need to be 99.999%, features such
as ACLs, xattrs and snapshots required)

I think if the documentation/wiki addressed these questions it would make
it easier for newcomers to evaluate the product.

>> If you change the shard size on a populated cluster, you break all
>> existing data.

This needs to be a warning or clearly documented. If you lose a couple of
PB of data in a professional role, I'd not fancy your employment prospects.
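For context, this is roughly what the 16x3 setup being discussed looks like
on the command line. A minimal sketch only: the volume name, hostnames and
brick paths are made-up placeholders, and 64MB is just an example shard
size, not a recommendation from this thread:

    # 48 bricks with replica 3 -> 16 replica sets; gluster groups bricks
    # into replica sets in the order they appear on the command line
    gluster volume create vmstore replica 3 \
        server{1..48}:/data/brick1/vmstore

    # enable sharding and choose the shard size up front; per the warning
    # quoted above, treat the shard size as fixed once the volume holds data
    gluster volume set vmstore features.shard on
    gluster volume set vmstore features.shard-block-size 64MB
    gluster volume start vmstore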
I've always had the feeling that GlusterFS is a bit of a playground for new
features and that the only way to really have a stable storage system is to
stump up the cash to Red Hat (and we've purchased a lot of RHEL/RHEV
licences), but having so many problems in the community version really puts
me off buying the full package!

Cheers

Alex
Kevin Lemonnier
2016-Nov-12 20:11 UTC
[Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks
> On the other hand, at home I tried to use GlusterFS for VM images in a
> simple replica 2 setup with Pacemaker for HA. VMs were constantly failing
> en masse even without making any changes. Very often the images got
> corrupted and had to be restored from backups. This was over a year ago,
> but it motivated me to try the VMs on MooseFS. Since then I've not had a
> single problem with unexpected downtime or corruption.

Yeah, there have been a lot of bad versions for VMs; anything < 3.7.12 has
either huge heal problems or random corruption at runtime. That's why I
keep 3.7.12 everywhere: I know it works well, at least with my config, and
since I have no use for the new features, why take the risk of updating?

Interesting comments on MooseFS. I've seen it but never tried it, mostly
because of the single server managing the cluster, which seems like a huge
risk. I guess there must be a way to have that role fail over or something.

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
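For what it's worth, staying on a known-good release like 3.7.12 can also be
enforced at the package level so a routine upgrade doesn't pull in a newer
GlusterFS by accident. A minimal sketch for a Debian-based host; the package
names and version string are assumptions to adapt to your own distribution
and repository:

    # /etc/apt/preferences.d/glusterfs
    Package: glusterfs-server glusterfs-client glusterfs-common
    Pin: version 3.7.12*
    Pin-Priority: 1001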
Gandalf Corvotempesta
2016-Nov-12 20:14 UTC
[Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks
On 12 Nov 2016 9:04 PM, "Alex Crow" <acrow at integrafin.co.uk> wrote:

> IMHO GlusterFS would be a great product if it tried to:
>
> a) Add fewer features per release, and/or slow down the release cycle.
> Maybe have a "Feature" branch like RozoFS, with separate Stable and
> Testing/Current branches. Stable is safe, Testing is risky, and "Feature"
> is for those that need to try new, well, features.
> b) Concentrate on issues like split-brain, healing, and scaling online
> without data loss. It seems to be a common theme on the list that healing
> doesn't work without tinkering. It should really "just work".
> c) Have a peek at BeeGFS. It's a very well-performing FS that has its
> focus on HPC. You can't stand to lose many thousands of CPU-hours of
> work if your FS goes down, and it has to be fast.
>
> The biggest question for me is: what is the target market for GlusterFS?
> Is it:
>
> HPC (performance, reliability at large scale, i.e. loss of one file is
> OK, loss of everything is not, no funky features)
> VM storage (much the same as HPC, but large-file performance required,
> no loss or corruption of blocks within a file)
> General file storage (medium performance OK, small-file and random access
> paramount, resilience and consistency need to be 99.999%, features such
> as ACLs, xattrs and snapshots required)
>
> I think if the documentation/wiki addressed these questions it would make
> it easier for newcomers to evaluate the product.

Totally agree.

> This needs to be a warning or clearly documented. If you lose a couple
> of PB of data in a professional role, I'd not fancy your employment
> prospects. I've always had the feeling that GlusterFS is a bit of a
> playground for new features and that the only way to really have a
> stable storage system is to stump up the cash to Red Hat (and we've
> purchased a lot of RHEL/RHEV licences), but having so many problems in
> the community version really puts me off buying the full package!

Again, totally agree.