Shawn Heisey
2013-Nov-06 05:05 UTC
[Gluster-users] Failed rebalance resulting in major problems
We recently added storage servers to our gluster install, running 3.3.1 on CentOS 6. It went from 40TB usable (8x2 distribute-replicate) to 80TB usable (16x2). There was a little bit over 20TB of used space on the volume.

The add-brick went through without incident, but the rebalance failed after moving 1.5TB of the approximately 10TB that needed to be moved. A side issue is that it took four days for that 1.5TB to move. I'm aware that gluster has overhead, and that there's only so much speed you can get out of gigabit, but a 100Mb/s half-duplex link could have copied the data faster if it had been a straight copy.

After I discovered that the rebalance had failed, I noticed that there were other problems. There are a small number of completely lost files (91 that I know about so far), a huge number of permission issues (over 800,000 files changed to 000), and about 32000 files that are throwing read errors via the fuse/nfs mount but seem to be available directly on bricks. That last category of problem file has the sticky bit set, with almost all of them having ---------T permissions. The good files on bricks typically have the same permissions, but are readable by root. I haven't yet worked out the scripting necessary to automate all the fixing that needs to happen.

We really need to know what happened. We do plan to upgrade to 3.4.1, but there were some reasons that we didn't want to upgrade before adding storage:

* Upgrading will result in a service interruption for our clients, which mount via NFS. It would likely be just a hiccup, with quick failover, but it's still a service interruption.
* We have a pacemaker cluster providing the shared IP address for NFS mounting. It's running CentOS 6.3. A "yum upgrade" to upgrade gluster will also upgrade to CentOS 6.4. The pacemaker in 6.4 is incompatible with the pacemaker in 6.3, which will likely result in longer-than-expected downtime for the shared IP address.
* We didn't want to risk potential problems with running gluster 3.3.1 on the existing servers and 3.4.1 on the new servers.
* We needed the new storage added right away, before we could schedule maintenance to deal with the upgrade issues.

Something that would be extremely helpful would be obtaining the services of an expert-level gluster consultant who can look over everything we've done, tell us if we've done anything wrong, and advise us on how to avoid problems in the future. I don't know how much the company can authorize for this, but we obviously want it to be as cheap as possible. We are in Salt Lake City, UT, USA. It would be preferable to have the consultant be physically present at our location.

I'm working on redacting one bit of identifying info from our rebalance log, then I can put it up on dropbox for everyone to examine.

Thanks,
Shawn
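P.S. For anyone who wants to check their own bricks for the same symptoms: below is a rough sketch of the kind of audit script I have in mind, not the actual fix. It is untested as posted, assumes Python 3.3+ (for os.getxattr, Linux only), and the brick path is a placeholder you'd adjust for your layout. It walks a brick and reports regular files that are either mode 000 or look like DHT link files (zero-byte, sticky bit set), printing the linkto xattr when one is present.

#!/usr/bin/env python3
# Sketch (untested): walk one brick and flag files hit by the rebalance.
# Assumes Python 3.3+ for os.getxattr; run as root so xattrs are readable.
import os
import stat
import sys

BRICK = sys.argv[1] if len(sys.argv) > 1 else "/bricks/brick1"  # placeholder path

for root, dirs, files in os.walk(BRICK):
    # Skip gluster's internal bookkeeping directory.
    if ".glusterfs" in dirs:
        dirs.remove(".glusterfs")
    for name in files:
        path = os.path.join(root, name)
        st = os.lstat(path)
        if not stat.S_ISREG(st.st_mode):
            continue
        perms = stat.S_IMODE(st.st_mode)
        if perms == 0:
            print("MODE000", path)
        elif perms & stat.S_ISVTX and st.st_size == 0:
            # Zero-byte sticky files are normally DHT link files; the
            # linkto xattr names the subvolume DHT thinks holds the data.
            try:
                linkto = os.getxattr(path, "trusted.glusterfs.dht.linkto")
                print("LINKFILE", path, linkto.decode(errors="replace"))
            except OSError:
                print("STICKY-NO-LINKTO", path)

Output from something like this, compared across replica pairs, is what I'd expect to feed into whatever repair scripting comes next.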
Justin Dossey
2013-Nov-06 19:52 UTC
[Gluster-users] Failed rebalance resulting in major problems
Shawn,

I had a very similar experience with a rebalance on 3.3.1, and it took weeks to get everything straightened out. I would be happy to share the scripts I wrote to correct the permissions issues if you wish, though I'm not sure it would be appropriate to share them directly on this list. Perhaps I should just create a project on GitHub devoted to collecting scripts people use to fix their GlusterFS environments!

After that (awful) experience, I am loath to run further rebalances. I've even spent days evaluating alternatives to GlusterFS, as my experience with this list over the last six months indicates that support for community users is minimal, even in the face of major bugs such as the one with rebalancing and the continuing "gfid different on subvolume" bugs with 3.3.2.

Let me know what you think of the GitHub thing and I'll proceed appropriately.

On Tue, Nov 5, 2013 at 9:05 PM, Shawn Heisey <gluster at elyograg.org> wrote:
> [original message quoted in full; snipped]

--
Justin Dossey
CTO, PodOmatic
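P.S. To give a flavor of what those scripts do without dumping them all on the list: the permission-repair pass is roughly the shape below. This is a sketch only, untested as posted; the brick and mount paths are placeholders, and it assumes at least one replica of each file kept its correct mode. It reads relative paths on stdin, takes the mode from a known-good brick copy, and re-applies it through the fuse mount so gluster propagates the change to every replica.

#!/usr/bin/env python3
# Sketch (untested): restore modes that a failed rebalance zeroed out.
# Paths below are placeholders; run as root against a fuse mount.
import os
import stat
import sys

GOOD_BRICK = "/bricks/brick1"  # placeholder: brick whose copies kept sane modes
MOUNT = "/mnt/glustervol"      # placeholder: fuse mount of the volume

for line in sys.stdin:
    rel = line.rstrip("\n")
    brick_copy = os.path.join(GOOD_BRICK, rel)
    mount_copy = os.path.join(MOUNT, rel)
    try:
        good_mode = stat.S_IMODE(os.lstat(brick_copy).st_mode)
        cur_mode = stat.S_IMODE(os.lstat(mount_copy).st_mode)
    except FileNotFoundError:
        print("missing:", rel, file=sys.stderr)
        continue
    # Only touch files the rebalance zeroed out; leave everything else alone.
    if cur_mode == 0:
        os.chmod(mount_copy, good_mode)
        print("fixed", rel, "->", oct(good_mode))

The important design point is chmod'ing through the mount rather than on the bricks, so you aren't fighting gluster's own metadata.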
Shawn Heisey
2013-Nov-07 21:04 UTC
[Gluster-users] Failed rebalance resulting in major problems
(resending because my reply only went to Lukáš)

On 11/7/2013 3:20 AM, Lukáš Bezdička wrote:
> I strongly suggest not using 3.3.1 or the whole 3.3 branch. I would only
> go for 3.4.1 on something close to production, and even there I wouldn't
> yet use rebalance/shrinking. We give gluster heavy testing before it
> goes to production. As for updating, why don't you build your own
> packages? We have been maintaining our own builds for several years now,
> with patches that gladly end up in gluster upstream sooner or later.

When I built the system, version 3.3.1 (and CentOS 6.3) was the latest available. Before I added the new storage last week, I got onto the IRC channel and asked whether I should install the same version on the new servers, install the new version on the new servers, or upgrade the entire cluster before adding anything. I got no actual answers to that question, and there wasn't really a lot of discussion that I noticed. If someone did answer my question at that time, I missed it.

I decided to play it safe by installing the 3.3.1 version on the new servers. It was a slightly newer revision, but I was told that there were only packaging differences and that the code itself was unchanged. I installed CentOS 6.4, which I figured would be safe because Gluster is user-space and it's typically safe to upgrade RHEL/CentOS minor versions.

Before we deployed, I did do tests on my testbed where I added new storage bricks, did rebalances, removed bricks, etc. There were no problems with adding bricks or rebalancing, but I had nowhere near as many files or as much used space as we have in production. I did encounter a bug with removing bricks, which I filed:

https://bugzilla.redhat.com/show_bug.cgi?id=862347

Except for the 91 files that appear to be simply gone and unrecoverable, I am pretty much done dealing with the fallout ... but I still have nearly 9TB of data that needs to migrate before the bricks will be evenly filled, and I can't be sure that this won't happen again when I request another rebalance, or the next time we need to increase the volume size by adding bricks.

I really need an expert to evaluate our setup and make recommendations. I sent a request off to Red Hat Consulting for help on this, but I haven't heard anything back from them.

Thanks,
Shawn
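P.S. For the "evenly filled" question, here's roughly how I've been eyeballing how far out of balance the bricks are, in case it's useful to anyone. A sketch only, untested as posted: the host names and brick paths are placeholders, and it assumes passwordless ssh from the admin box. It just runs POSIX df against each brick path and prints the spread between the fullest and emptiest brick.

#!/usr/bin/env python3
# Sketch (untested): report how unevenly the bricks are filled by
# running df -P over ssh against each brick path. Hosts/paths below
# are placeholders; adjust for your own volume layout.
import subprocess

# Placeholder (host, brick-path) pairs; one entry per brick.
BRICKS = [
    ("server1", "/bricks/brick1"),
    ("server2", "/bricks/brick1"),
    # ...
]

usages = []
for host, path in BRICKS:
    out = subprocess.check_output(["ssh", host, "df", "-P", path], text=True)
    # Last line of POSIX df output: fs, blocks, used, avail, use%, mount
    fields = out.strip().splitlines()[-1].split()
    pct = int(fields[4].rstrip("%"))
    usages.append(pct)
    print(f"{host}:{path}  {pct}% used")

print("spread: {}% (min {}%, max {}%)".format(
    max(usages) - min(usages), min(usages), max(usages)))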