Atin Mukherjee
2015-Aug-11 02:18 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
-Atin
Sent from one plus one

On Aug 10, 2015 11:58 PM, "Kingsley" <gluster at gluster.dogwind.com> wrote:
>
> On Mon, 2015-08-10 at 22:53 +0530, Atin Mukherjee wrote:
> [snip]
> >
> > > stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> > > brk(0) = 0x8db000
> > > brk(0x8fc000) = 0x8fc000
> > > mkdir("test", 0777
> > Can you also collect the statedump of all the brick processes when the
> > command is hung?
> >
> > + Ravi, could you check this?
>
> I ran the command but I could not find where it put the output:
>
> [root at gluster1a-1 ~]# gluster volume statedump callrec all
> volume statedump: success
> [root at gluster1a-1 ~]# gluster volume info callrec
>
> Volume Name: callrec
> Type: Replicate
> Volume ID: a39830b7-eddb-4061-b381-39411274131a
> Status: Started
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gluster1a-1:/data/brick/callrec
> Brick2: gluster1b-1:/data/brick/callrec
> Brick3: gluster2a-1:/data/brick/callrec
> Brick4: gluster2b-1:/data/brick/callrec
> Options Reconfigured:
> performance.flush-behind: off
> [root at gluster1a-1 ~]# gluster volume status callrec
> Status of volume: callrec
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster1a-1:/data/brick/callrec           49153   Y       29041
> Brick gluster1b-1:/data/brick/callrec           49153   Y       31260
> Brick gluster2a-1:/data/brick/callrec           49153   Y       31585
> Brick gluster2b-1:/data/brick/callrec           49153   Y       12153
> NFS Server on localhost                         2049    Y       29733
> Self-heal Daemon on localhost                   N/A     Y       29741
> NFS Server on gluster1b-1                       2049    Y       31872
> Self-heal Daemon on gluster1b-1                 N/A     Y       31882
> NFS Server on gluster2a-1                       2049    Y       32216
> Self-heal Daemon on gluster2a-1                 N/A     Y       32226
> NFS Server on gluster2b-1                       2049    Y       12752
> Self-heal Daemon on gluster2b-1                 N/A     Y       12762
>
> Task Status of Volume callrec
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> [root at gluster1a-1 ~]# ls -l /tmp
> total 144
> drwx------. 3 root root    16 Aug  8 22:20 systemd-private-Dp10Pz
> -rw-------. 1 root root  5818 Jul 31 06:39 yum_save_tx.2015-07-31.06-39.JCvHd5.yumtx
> -rw-------. 1 root root  5818 Aug  1 06:58 yum_save_tx.2015-08-01.06-58.wBytr2.yumtx
> -rw-------. 1 root root  5818 Aug  2 05:18 yum_save_tx.2015-08-02.05-18.AXIFSe.yumtx
> -rw-------. 1 root root  5818 Aug  3 07:15 yum_save_tx.2015-08-03.07-15.EDd8rg.yumtx
> -rw-------. 1 root root  5818 Aug  4 03:48 yum_save_tx.2015-08-04.03-48.XE513B.yumtx
> -rw-------. 1 root root  5818 Aug  5 09:03 yum_save_tx.2015-08-05.09-03.mX8xXF.yumtx
> -rw-------. 1 root root 28869 Aug  6 06:39 yum_save_tx.2015-08-06.06-39.166wJX.yumtx
> -rw-------. 1 root root 28869 Aug  7 07:20 yum_save_tx.2015-08-07.07-20.rLqJnT.yumtx
> -rw-------. 1 root root 28869 Aug  8 08:29 yum_save_tx.2015-08-08.08-29.KKaite.yumtx
> [root at gluster1a-1 ~]#
>
> Where should I find the output of the statedump command?

It should be in the /var/run/gluster folder.

> Cheers,
> Kingsley.
>
> > > Then ... do I need to run something on one of the bricks while strace is
> > > running?
> > >
> > > Cheers,
> > > Kingsley.
> > > > [root at gluster1b-1 ~]# gluster volume heal callrec info
> > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
> > > > <gfid:164f888f-2049-49e6-ad26-c758ee091863>
> > > > /recordings/834723/14391 - Possibly undergoing heal
> > > >
> > > > <gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
> > > > <gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
> > > > <gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
> > > > <gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
> > > > <gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
> > > > Number of entries: 7
> > > >
> > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
> > > > Number of entries: 0
> > > >
> > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
> > > > <gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
> > > > <gfid:164f888f-2049-49e6-ad26-c758ee091863>
> > > > <gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
> > > > <gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
> > > > /recordings/834723/14391 - Possibly undergoing heal
> > > >
> > > > <gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
> > > > <gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
> > > > Number of entries: 7
> > > >
> > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
> > > > Number of entries: 0
> > > >
> > > > If I query each brick directly for the number of files/directories
> > > > within that, I get 1731 on gluster1a-1 and gluster2a-1, but 1737 on
> > > > the other two, using this command:
> > > >
> > > > # find /data/brick/callrec/recordings/834723/14391 -print | wc -l
> > > >
> > > > Cheers,
> > > > Kingsley.
> > > >
> > > > On Mon, 2015-08-10 at 11:05 +0100, Kingsley wrote:
> > > > > Sorry for the blind panic - restarting the volume seems to have
> > > > > fixed it.
> > > > >
> > > > > But then my next question - why is this necessary? Surely it
> > > > > undermines the whole point of a high availability system?
> > > > >
> > > > > Cheers,
> > > > > Kingsley.
> > > > >
> > > > > On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We have a 4 way replicated volume using gluster 3.6.3 on CentOS 7.
> > > > > >
> > > > > > Over the weekend I did a yum update on each of the bricks in turn,
> > > > > > but now when clients (using fuse mounts) try to access the volume,
> > > > > > it hangs. Gluster itself wasn't updated (we've disabled that repo
> > > > > > so that we keep to 3.6.3 for now).
> > > > > >
> > > > > > This was what I did:
> > > > > >
> > > > > >       * on first brick, "yum update"
> > > > > >       * reboot brick
> > > > > >       * watch "gluster volume status" on another brick and wait
> > > > > >         for it to say all 4 bricks are online before proceeding
> > > > > >         to update the next brick
> > > > > >
> > > > > > I was expecting the clients might pause 30 seconds while they
> > > > > > notice a brick is offline, but then recover.
> > > > > >
> > > > > > I've tried re-mounting clients, but that hasn't helped.
> > > > > > I can't see much data in any of the log files.
> > > > > >
> > > > > > I've tried "gluster volume heal callrec" but it doesn't seem to
> > > > > > have helped.
> > > > > >
> > > > > > What shall I do next?
> > > > > >
> > > > > > I've pasted some stuff below in case any of it helps.
> > > > > >
> > > > > > Cheers,
> > > > > > Kingsley.
> > > > > >
> > > > > > [root at gluster1b-1 ~]# gluster volume info callrec
> > > > > >
> > > > > > Volume Name: callrec
> > > > > > Type: Replicate
> > > > > > Volume ID: a39830b7-eddb-4061-b381-39411274131a
> > > > > > Status: Started
> > > > > > Number of Bricks: 1 x 4 = 4
> > > > > > Transport-type: tcp
> > > > > > Bricks:
> > > > > > Brick1: gluster1a-1:/data/brick/callrec
> > > > > > Brick2: gluster1b-1:/data/brick/callrec
> > > > > > Brick3: gluster2a-1:/data/brick/callrec
> > > > > > Brick4: gluster2b-1:/data/brick/callrec
> > > > > > Options Reconfigured:
> > > > > > performance.flush-behind: off
> > > > > > [root at gluster1b-1 ~]#
> > > > > >
> > > > > > [root at gluster1b-1 ~]# gluster volume status callrec
> > > > > > Status of volume: callrec
> > > > > > Gluster process                              Port   Online  Pid
> > > > > > ------------------------------------------------------------------------------
> > > > > > Brick gluster1a-1:/data/brick/callrec        49153  Y       6803
> > > > > > Brick gluster1b-1:/data/brick/callrec        49153  Y       2614
> > > > > > Brick gluster2a-1:/data/brick/callrec        49153  Y       2645
> > > > > > Brick gluster2b-1:/data/brick/callrec        49153  Y       4325
> > > > > > NFS Server on localhost                      2049   Y       2769
> > > > > > Self-heal Daemon on localhost                N/A    Y       2789
> > > > > > NFS Server on gluster2a-1                    2049   Y       2857
> > > > > > Self-heal Daemon on gluster2a-1              N/A    Y       2814
> > > > > > NFS Server on 88.151.41.100                  2049   Y       6833
> > > > > > Self-heal Daemon on 88.151.41.100            N/A    Y       6824
> > > > > > NFS Server on gluster2b-1                    2049   Y       4428
> > > > > > Self-heal Daemon on gluster2b-1              N/A    Y       4387
> > > > > >
> > > > > > Task Status of Volume callrec
> > > > > > ------------------------------------------------------------------------------
> > > > > > There are no active volume tasks
> > > > > >
> > > > > > [root at gluster1b-1 ~]#
> > > > > >
> > > > > > [root at gluster1b-1 ~]# gluster volume heal callrec info
> > > > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
> > > > > > /to_process - Possibly undergoing heal
> > > > > >
> > > > > > Number of entries: 1
> > > > > >
> > > > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
> > > > > > Number of entries: 0
> > > > > >
> > > > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
> > > > > > /to_process - Possibly undergoing heal
> > > > > >
> > > > > > Number of entries: 1
> > > > > >
> > > > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
> > > > > > Number of entries: 0
> > > > > >
> > > > > > [root at gluster1b-1 ~]#
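For reference, a minimal sketch of how the statedump requested above can be collected and read. This assumes GlusterFS 3.6.x defaults; the dump file naming and the exact dump format vary between versions, so treat the grep pattern below as a starting point rather than a definitive recipe.

    # Trigger a statedump of every brick process in the volume (run on any peer):
    gluster volume statedump callrec all

    # On each brick host the dumps land in /var/run/gluster/, one file per brick
    # process, named after the hyphenated brick path plus the brick PID
    # (naming differs slightly between gluster versions):
    ls -lt /var/run/gluster/

    # For a hung client operation such as mkdir, the lock sections are the
    # interesting part; blocked locks are normally listed with a BLOCKED state:
    grep -i -B2 -A4 blocked /var/run/gluster/*.dump* | less

Cross-checking any blocked entries against the "Possibly undergoing heal" paths reported above can help narrow down which brick process is holding things up.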
Kingsley
2015-Aug-11 05:14 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
On Tue, 2015-08-11 at 07:48 +0530, Atin Mukherjee wrote:
> -Atin
> Sent from one plus one
>
> On Aug 10, 2015 11:58 PM, "Kingsley" <gluster at gluster.dogwind.com>
> wrote:
> >
> > On Mon, 2015-08-10 at 22:53 +0530, Atin Mukherjee wrote:
> > [snip]
> > >
> > > > stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> > > > brk(0) = 0x8db000
> > > > brk(0x8fc000) = 0x8fc000
> > > > mkdir("test", 0777
> > > Can you also collect the statedump of all the brick processes when
> > > the command is hung?
> > >
> > > + Ravi, could you check this?
> >
> > I ran the command but I could not find where it put the output:
[snip]
> > Where should I find the output of the statedump command?
> It should be in the /var/run/gluster folder.

Thanks - replied offlist.

Cheers,
Kingsley.
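On the rolling-update procedure quoted earlier: "gluster volume status" showing all four bricks online only confirms that the brick processes have started; it does not mean self-heal has finished catching the rebooted brick up with the writes it missed. A rough sketch of a stricter per-brick check, assuming the "Number of entries:" wording used by the heal info output in this release (the wait loop is a hypothetical helper, not part of gluster itself):

    # After a brick comes back, confirm it is online, then wait until every
    # brick reports zero pending heal entries before touching the next one:
    gluster volume status callrec
    gluster volume heal callrec info | grep 'Number of entries:'

    # Simple wait loop built on the same output:
    while gluster volume heal callrec info | grep -q 'Number of entries: [1-9]'; do
        echo "self-heal still has pending entries; waiting..."
        sleep 30
    done

Whether that alone would have avoided the hang reported here is not certain, but it does rule out moving on to the next brick while the previous one is still healing.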