Kingsley
2015-Aug-10 10:05 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
Sorry for the blind panic - restarting the volume seems to have fixed it.

But then my next question - why is this necessary? Surely it undermines
the whole point of a high availability system?

Cheers,
Kingsley.

On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:
> Hi,
>
> We have a 4 way replicated volume using gluster 3.6.3 on CentOS 7.
>
> Over the weekend I did a yum update on each of the bricks in turn, but
> now when clients (using fuse mounts) try to access the volume, it hangs.
> Gluster itself wasn't updated (we've disabled that repo so that we keep
> to 3.6.3 for now).
>
> This was what I did:
>
>   * on first brick, "yum update"
>   * reboot brick
>   * watch "gluster volume status" on another brick and wait for it
>     to say all 4 bricks are online before proceeding to update the
>     next brick
>
> I was expecting the clients might pause 30 seconds while they notice a
> brick is offline, but then recover.
>
> I've tried re-mounting clients, but that hasn't helped.
>
> I can't see much data in any of the log files.
>
> I've tried "gluster volume heal callrec" but it doesn't seem to have
> helped.
>
> What shall I do next?
>
> I've pasted some stuff below in case any of it helps.
>
> Cheers,
> Kingsley.
>
> [root@gluster1b-1 ~]# gluster volume info callrec
>
> Volume Name: callrec
> Type: Replicate
> Volume ID: a39830b7-eddb-4061-b381-39411274131a
> Status: Started
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gluster1a-1:/data/brick/callrec
> Brick2: gluster1b-1:/data/brick/callrec
> Brick3: gluster2a-1:/data/brick/callrec
> Brick4: gluster2b-1:/data/brick/callrec
> Options Reconfigured:
> performance.flush-behind: off
> [root@gluster1b-1 ~]#
>
> [root@gluster1b-1 ~]# gluster volume status callrec
> Status of volume: callrec
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster1a-1:/data/brick/callrec           49153   Y       6803
> Brick gluster1b-1:/data/brick/callrec           49153   Y       2614
> Brick gluster2a-1:/data/brick/callrec           49153   Y       2645
> Brick gluster2b-1:/data/brick/callrec           49153   Y       4325
> NFS Server on localhost                         2049    Y       2769
> Self-heal Daemon on localhost                   N/A     Y       2789
> NFS Server on gluster2a-1                       2049    Y       2857
> Self-heal Daemon on gluster2a-1                 N/A     Y       2814
> NFS Server on 88.151.41.100                     2049    Y       6833
> Self-heal Daemon on 88.151.41.100               N/A     Y       6824
> NFS Server on gluster2b-1                       2049    Y       4428
> Self-heal Daemon on gluster2b-1                 N/A     Y       4387
>
> Task Status of Volume callrec
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> [root@gluster1b-1 ~]#
>
> [root@gluster1b-1 ~]# gluster volume heal callrec info
> Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
> /to_process - Possibly undergoing heal
>
> Number of entries: 1
>
> Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
> Number of entries: 0
>
> Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
> /to_process - Possibly undergoing heal
>
> Number of entries: 1
>
> Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
> Number of entries: 0
>
> [root@gluster1b-1 ~]#
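For what it's worth, a minimal sketch (untested, and assuming the 3.6.x
"gluster volume heal <vol> info" output format shown above) of the extra
check that could sit between bricks in a rolling update - waiting for
pending heal entries to drain to zero, rather than only waiting for all
bricks to show online:

#!/bin/bash
# Block until "gluster volume heal callrec info" reports zero pending
# entries across all bricks before updating/rebooting the next brick.
until gluster volume heal callrec info | awk -F: '
    /^Number of entries/ { total += $2 }
    END { exit (total > 0) }'
do
    echo "waiting for self-heal to finish..."
    sleep 10
done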
Atin Mukherjee
2015-Aug-10 10:21 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
On 08/10/2015 03:35 PM, Kingsley wrote:
> Sorry for the blind panic - restarting the volume seems to have fixed
> it.
>
> But then my next question - why is this necessary? Surely it undermines
> the whole point of a high availability system?
>
> Cheers,
> Kingsley.
>
> On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:
>> Hi,
>>
>> We have a 4 way replicated volume using gluster 3.6.3 on CentOS 7.
>>
>> Over the weekend I did a yum update on each of the bricks in turn, but
>> now when clients (using fuse mounts) try to access the volume, it hangs.

What does the mount log file say when you tried to access the volume? Can
you attach the mount log file?

>> [rest of the original message and command output trimmed; quoted in
>> full above]

--
~Atin
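For reference, with a FUSE mount the client log normally lives under
/var/log/glusterfs/, named after the mount point with slashes replaced by
dashes - for example, for a hypothetical mount at /mnt/callrec:

# client-side log for a volume fuse-mounted at /mnt/callrec
tail -n 100 /var/log/glusterfs/mnt-callrec.log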
Kingsley
2015-Aug-10 13:49 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
Further to this, the volume doesn't seem overly healthy. Any idea how I
can get it back into a working state?

Trying to access one particular directory on the clients just hangs. If I
query heal info, that directory appears in the output as possibly
undergoing heal (actual directory name changed as it's private info):

[root@gluster1b-1 ~]# gluster volume heal callrec info
Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
<gfid:164f888f-2049-49e6-ad26-c758ee091863>
/recordings/834723/14391 - Possibly undergoing heal
<gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
<gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
<gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
<gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
<gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
Number of entries: 7

Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0

Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
<gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
<gfid:164f888f-2049-49e6-ad26-c758ee091863>
<gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
<gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
/recordings/834723/14391 - Possibly undergoing heal
<gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
<gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
Number of entries: 7

Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0

If I query each brick directly for the number of files/directories within
that directory, I get 1731 on gluster1a-1 and gluster2a-1, but 1737 on the
other two, using this command:

# find /data/brick/callrec/recordings/834723/14391 -print | wc -l

Cheers,
Kingsley.

On Mon, 2015-08-10 at 11:05 +0100, Kingsley wrote:
> Sorry for the blind panic - restarting the volume seems to have fixed
> it.
>
> But then my next question - why is this necessary? Surely it undermines
> the whole point of a high availability system?
>
> [earlier messages and command output trimmed; quoted in full above]
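In case it helps anyone reading along: on a brick, a <gfid:...> entry from
heal info can usually be mapped back to a real path via the .glusterfs
index. A sketch, assuming the standard brick layout (regular files there
are hard links to the real file; directories are symlinks):

# Resolve one of the gfids above on a brick (example gfid from the output).
# For regular files, -samefile follows the hard link back to the real path;
# for directories, the .glusterfs entry is a symlink, so use readlink on it
# instead.
G=164f888f-2049-49e6-ad26-c758ee091863
B=/data/brick/callrec
find "$B" -samefile "$B/.glusterfs/${G:0:2}/${G:2:2}/$G" 2>/dev/null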