thr3ads.net - Gluster users - [Gluster-users] Problems after upgrade/volume expansion [Feb 2014]

Branden Timm

2014-Feb-03 21:35 UTC

[Gluster-users] Problems after upgrade/volume expansion

Hello,
   I'm experiencing some major problems with my GlusterFS filesystem 
after an upgrade/expansion, and I'm hoping I can get pointed in the 
right direction for troubleshooting it.

I had a 5 server, 5 brick distributed volume on 3.3.1.  I brought the 
volume offline, stopped glusterd and glusterfsd on all servers, then 
upgraded to 3.4.2 and brought glusterd and glusterfsd back online.  So 
far so good.

Once the volume was back online and healthy, I added a new server to the 
trusted storage pool and added two bricks attached to that server to the 
volume.  Everything looked fine at that point: gluster volume status 
showed all six servers and seven bricks as online.
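
For reference, the expansion steps above correspond roughly to the 
following gluster CLI calls.  This is only a sketch: the Python helper 
just builds the argument lists and does not run anything, and the volume 
name, host name and brick paths are placeholders, not the real ones.

```python
import shlex


def expansion_commands(volname, new_host, brick_paths):
    """Build the gluster CLI calls for adding one server and its bricks.

    All arguments are hypothetical placeholders chosen by the caller.
    """
    bricks = [f"{new_host}:{path}" for path in brick_paths]
    return [
        # Add the new server to the trusted storage pool.
        ["gluster", "peer", "probe", new_host],
        # Add the new server's bricks to the existing volume.
        ["gluster", "volume", "add-brick", volname, *bricks],
        # Check that every brick reports as online.
        ["gluster", "volume", "status", volname],
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for argv in expansion_commands(
        "myvol", "server6", ["/export/brick1", "/export/brick2"]
    ):
        print(shlex.join(argv))
```

Each list could be handed to subprocess.run() on a node in the pool to 
actually execute it; printing first makes it easy to review the commands.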

The problem came next when I tried to rebalance.  I ran "gluster volume 
rebalance <volname> start force", then once it returned ran "status" and 
saw that the rebalance had failed on all but one node, which showed 
"in progress".  The node that it was running successfully on was a 
pre-existing server, not the new server/brick(s).  The other five 
servers report "1 subvolume(s) are down. Skipping fix layout."  Somebody 
in the IRC channel suggested this means that one of my bricks is down, 
but "gluster volume status <volname>" reports all servers and bricks as 
being online.  Full pastebin of the rebalance log (essentially the same 
on all five failing servers) here: http://fpaste.org/74082/14615971/
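
To compare the failing servers, something like the following can pull 
the message quoted above out of each rebalance log.  It is a minimal 
sketch that matches only the exact wording I quoted; the rest of the log 
line format is not assumed, and the function name is made up for 
illustration.

```python
import re

# Matches only the message text quoted in this post.
SUBVOL_DOWN = re.compile(
    r"(\d+) subvolume\(s\) are down\. Skipping fix layout\."
)


def find_skipped_layouts(lines):
    """Return (line_number, down_count) for each line with the message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SUBVOL_DOWN.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits


if __name__ == "__main__":
    sample = [
        "rebalance started",
        "1 subvolume(s) are down. Skipping fix layout.",
    ]
    for number, count in find_skipped_layouts(sample):
        print(f"line {number}: {count} subvolume(s) reported down")
```

Passing an open log file instead of the sample list works the same way, 
since the function only iterates over lines.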

Currently, I have both missing files and files that report "Transport 
endpoint not connected" when they are accessed.  This really seems to be 
related to the rebalance failures, and the layout seems incorrect as 
well.  I'm really hoping somebody can point me in the right direction of 
where to look next.  Thanks in advance for any help.

-Branden

Branden Timm

2014-Feb-03 22:15 UTC

[Gluster-users] Problems after upgrade/volume expansion

I should mention that the following line from the log is also worrying, 
as each trusted server is running Gluster v. 3.4.2, as verified by 
running /usr/sbin/glusterd -V:

Using Program GlusterFS 3.3, Num (1298437), Version (330)
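
To check whether every server logs the same values, the line can be 
split into its fields with a small parser.  This is a sketch that 
assumes only the wording shown above; the function name is hypothetical.

```python
import re

# Pattern built from the single log line quoted in this message.
PROGRAM_LINE = re.compile(
    r"Using Program (?P<name>.+?), "
    r"Num \((?P<num>\d+)\), Version \((?P<version>\d+)\)"
)


def parse_program_line(line):
    """Return (name, num, version) from the log line, or None if absent."""
    match = PROGRAM_LINE.search(line)
    if match is None:
        return None
    return (
        match.group("name"),
        int(match.group("num")),
        int(match.group("version")),
    )


if __name__ == "__main__":
    line = "Using Program GlusterFS 3.3, Num (1298437), Version (330)"
    print(parse_program_line(line))
```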

Branden
