Atin Mukherjee
2015-Aug-11 02:18 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
-Atin
Sent from one plus one

On Aug 10, 2015 11:58 PM, "Kingsley" <gluster at gluster.dogwind.com> wrote:
>
> On Mon, 2015-08-10 at 22:53 +0530, Atin Mukherjee wrote:
> [snip]
> >
> > > stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> > > brk(0) = 0x8db000
> > > brk(0x8fc000) = 0x8fc000
> > > mkdir("test", 0777
> > Can you also collect the statedump of all the brick processes when the
> > command is hung?
> >
> > + Ravi, could you check this?
>
> I ran the command but I could not find where it put the output:
>
> [root at gluster1a-1 ~]# gluster volume statedump callrec all
> volume statedump: success
> [root at gluster1a-1 ~]# gluster volume info callrec
>
> Volume Name: callrec
> Type: Replicate
> Volume ID: a39830b7-eddb-4061-b381-39411274131a
> Status: Started
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gluster1a-1:/data/brick/callrec
> Brick2: gluster1b-1:/data/brick/callrec
> Brick3: gluster2a-1:/data/brick/callrec
> Brick4: gluster2b-1:/data/brick/callrec
> Options Reconfigured:
> performance.flush-behind: off
> [root at gluster1a-1 ~]# gluster volume status callrec
> Status of volume: callrec
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster1a-1:/data/brick/callrec           49153   Y       29041
> Brick gluster1b-1:/data/brick/callrec           49153   Y       31260
> Brick gluster2a-1:/data/brick/callrec           49153   Y       31585
> Brick gluster2b-1:/data/brick/callrec           49153   Y       12153
> NFS Server on localhost                         2049    Y       29733
> Self-heal Daemon on localhost                   N/A     Y       29741
> NFS Server on gluster1b-1                       2049    Y       31872
> Self-heal Daemon on gluster1b-1                 N/A     Y       31882
> NFS Server on gluster2a-1                       2049    Y       32216
> Self-heal Daemon on gluster2a-1                 N/A     Y       32226
> NFS Server on gluster2b-1                       2049    Y       12752
> Self-heal Daemon on gluster2b-1                 N/A     Y       12762
>
> Task Status of Volume callrec
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> [root at gluster1a-1 ~]# ls -l /tmp
> total 144
> drwx------. 3 root root    16 Aug  8 22:20 systemd-private-Dp10Pz
> -rw-------. 1 root root  5818 Jul 31 06:39 yum_save_tx.2015-07-31.06-39.JCvHd5.yumtx
> -rw-------. 1 root root  5818 Aug  1 06:58 yum_save_tx.2015-08-01.06-58.wBytr2.yumtx
> -rw-------. 1 root root  5818 Aug  2 05:18 yum_save_tx.2015-08-02.05-18.AXIFSe.yumtx
> -rw-------. 1 root root  5818 Aug  3 07:15 yum_save_tx.2015-08-03.07-15.EDd8rg.yumtx
> -rw-------. 1 root root  5818 Aug  4 03:48 yum_save_tx.2015-08-04.03-48.XE513B.yumtx
> -rw-------. 1 root root  5818 Aug  5 09:03 yum_save_tx.2015-08-05.09-03.mX8xXF.yumtx
> -rw-------. 1 root root 28869 Aug  6 06:39 yum_save_tx.2015-08-06.06-39.166wJX.yumtx
> -rw-------. 1 root root 28869 Aug  7 07:20 yum_save_tx.2015-08-07.07-20.rLqJnT.yumtx
> -rw-------. 1 root root 28869 Aug  8 08:29 yum_save_tx.2015-08-08.08-29.KKaite.yumtx
> [root at gluster1a-1 ~]#
>
> Where should I find the output of the statedump command?

It should be in the /var/run/gluster folder.

> Cheers,
> Kingsley.
>
> > > Then ... do I need to run something on one of the bricks while strace is
> > > running?
> > >
> > > Cheers,
> > > Kingsley.
> > > > [root at gluster1b-1 ~]# gluster volume heal callrec info
> > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
> > > > <gfid:164f888f-2049-49e6-ad26-c758ee091863>
> > > > /recordings/834723/14391 - Possibly undergoing heal
> > > >
> > > > <gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
> > > > <gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
> > > > <gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
> > > > <gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
> > > > <gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
> > > > Number of entries: 7
> > > >
> > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
> > > > Number of entries: 0
> > > >
> > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
> > > > <gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
> > > > <gfid:164f888f-2049-49e6-ad26-c758ee091863>
> > > > <gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
> > > > <gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
> > > > /recordings/834723/14391 - Possibly undergoing heal
> > > >
> > > > <gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
> > > > <gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
> > > > Number of entries: 7
> > > >
> > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
> > > > Number of entries: 0
> > > >
> > > > If I query each brick directly for the number of files/directories
> > > > within that, I get 1731 on gluster1a-1 and gluster2a-1, but 1737 on
> > > > the other two, using this command:
> > > >
> > > > # find /data/brick/callrec/recordings/834723/14391 -print | wc -l
> > > >
> > > > Cheers,
> > > > Kingsley.
> > > >
> > > > On Mon, 2015-08-10 at 11:05 +0100, Kingsley wrote:
> > > > > Sorry for the blind panic - restarting the volume seems to have
> > > > > fixed it.
> > > > >
> > > > > But then my next question - why is this necessary? Surely it
> > > > > undermines the whole point of a high availability system?
> > > > >
> > > > > Cheers,
> > > > > Kingsley.
> > > > >
> > > > > On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We have a 4 way replicated volume using gluster 3.6.3 on CentOS 7.
> > > > > >
> > > > > > Over the weekend I did a yum update on each of the bricks in turn,
> > > > > > but now when clients (using fuse mounts) try to access the volume,
> > > > > > it hangs. Gluster itself wasn't updated (we've disabled that repo
> > > > > > so that we keep to 3.6.3 for now).
> > > > > >
> > > > > > This was what I did:
> > > > > >
> > > > > >       * on first brick, "yum update"
> > > > > >       * reboot brick
> > > > > >       * watch "gluster volume status" on another brick and wait
> > > > > >         for it to say all 4 bricks are online before proceeding
> > > > > >         to update the next brick
> > > > > >
> > > > > > I was expecting the clients might pause 30 seconds while they
> > > > > > notice a brick is offline, but then recover.
> > > > > >
> > > > > > I've tried re-mounting clients, but that hasn't helped.
> > > > > > I can't see much data in any of the log files.
> > > > > >
> > > > > > I've tried "gluster volume heal callrec" but it doesn't seem to
> > > > > > have helped.
> > > > > >
> > > > > > What shall I do next?
> > > > > >
> > > > > > I've pasted some stuff below in case any of it helps.
> > > > > >
> > > > > > Cheers,
> > > > > > Kingsley.
> > > > > >
> > > > > > [root at gluster1b-1 ~]# gluster volume info callrec
> > > > > >
> > > > > > Volume Name: callrec
> > > > > > Type: Replicate
> > > > > > Volume ID: a39830b7-eddb-4061-b381-39411274131a
> > > > > > Status: Started
> > > > > > Number of Bricks: 1 x 4 = 4
> > > > > > Transport-type: tcp
> > > > > > Bricks:
> > > > > > Brick1: gluster1a-1:/data/brick/callrec
> > > > > > Brick2: gluster1b-1:/data/brick/callrec
> > > > > > Brick3: gluster2a-1:/data/brick/callrec
> > > > > > Brick4: gluster2b-1:/data/brick/callrec
> > > > > > Options Reconfigured:
> > > > > > performance.flush-behind: off
> > > > > > [root at gluster1b-1 ~]#
> > > > > >
> > > > > > [root at gluster1b-1 ~]# gluster volume status callrec
> > > > > > Status of volume: callrec
> > > > > > Gluster process                              Port   Online  Pid
> > > > > > ------------------------------------------------------------------------------
> > > > > > Brick gluster1a-1:/data/brick/callrec        49153  Y       6803
> > > > > > Brick gluster1b-1:/data/brick/callrec        49153  Y       2614
> > > > > > Brick gluster2a-1:/data/brick/callrec        49153  Y       2645
> > > > > > Brick gluster2b-1:/data/brick/callrec        49153  Y       4325
> > > > > > NFS Server on localhost                      2049   Y       2769
> > > > > > Self-heal Daemon on localhost                N/A    Y       2789
> > > > > > NFS Server on gluster2a-1                    2049   Y       2857
> > > > > > Self-heal Daemon on gluster2a-1              N/A    Y       2814
> > > > > > NFS Server on 88.151.41.100                  2049   Y       6833
> > > > > > Self-heal Daemon on 88.151.41.100            N/A    Y       6824
> > > > > > NFS Server on gluster2b-1                    2049   Y       4428
> > > > > > Self-heal Daemon on gluster2b-1              N/A    Y       4387
> > > > > >
> > > > > > Task Status of Volume callrec
> > > > > > ------------------------------------------------------------------------------
> > > > > > There are no active volume tasks
> > > > > >
> > > > > > [root at gluster1b-1 ~]#
> > > > > >
> > > > > > [root at gluster1b-1 ~]# gluster volume heal callrec info
> > > > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
> > > > > > /to_process - Possibly undergoing heal
> > > > > >
> > > > > > Number of entries: 1
> > > > > >
> > > > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
> > > > > > Number of entries: 0
> > > > > >
> > > > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
> > > > > > /to_process - Possibly undergoing heal
> > > > > >
> > > > > > Number of entries: 1
> > > > > >
> > > > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
> > > > > > Number of entries: 0
> > > > > >
> > > > > > [root at gluster1b-1 ~]#
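For reference, a minimal sketch of how the statedump requested above can be collected and read. This assumes GlusterFS 3.6.x defaults; the dump file naming and the exact dump format vary between versions, so treat the grep pattern below as a starting point rather than a definitive recipe.

    # Trigger a statedump of every brick process in the volume (run on any peer):
    gluster volume statedump callrec all

    # On each brick host the dumps land in /var/run/gluster/, one file per brick
    # process, named after the hyphenated brick path plus the brick PID
    # (naming differs slightly between gluster versions):
    ls -lt /var/run/gluster/

    # For a hung client operation such as mkdir, the lock sections are the
    # interesting part; blocked locks are normally listed with a BLOCKED state:
    grep -i -B2 -A4 blocked /var/run/gluster/*.dump* | less

Cross-checking any blocked entries against the "Possibly undergoing heal" paths reported above can help narrow down which brick process is holding things up.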
Kingsley
2015-Aug-11 05:14 UTC
[Gluster-users] volume not working after yum update - gluster 3.6.3
On Tue, 2015-08-11 at 07:48 +0530, Atin Mukherjee wrote:
> -Atin
> Sent from one plus one
>
> On Aug 10, 2015 11:58 PM, "Kingsley" <gluster at gluster.dogwind.com>
> wrote:
> >
> > On Mon, 2015-08-10 at 22:53 +0530, Atin Mukherjee wrote:
> > [snip]
> > >
> > > > stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> > > > brk(0) = 0x8db000
> > > > brk(0x8fc000) = 0x8fc000
> > > > mkdir("test", 0777
> > > Can you also collect the statedump of all the brick processes when
> > > the command is hung?
> > >
> > > + Ravi, could you check this?
> >
> > I ran the command but I could not find where it put the output:
[snip]
> > Where should I find the output of the statedump command?
> It should be in the /var/run/gluster folder.

Thanks - replied offlist.

Cheers,
Kingsley.
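On the rolling-update procedure quoted earlier: "gluster volume status" showing all four bricks online only confirms that the brick processes have started; it does not mean self-heal has finished catching the rebooted brick up with the writes it missed. A rough sketch of a stricter per-brick check, assuming the "Number of entries:" wording used by the heal info output in this release (the wait loop is a hypothetical helper, not part of gluster itself):

    # After a brick comes back, confirm it is online, then wait until every
    # brick reports zero pending heal entries before touching the next one:
    gluster volume status callrec
    gluster volume heal callrec info | grep 'Number of entries:'

    # Simple wait loop built on the same output:
    while gluster volume heal callrec info | grep -q 'Number of entries: [1-9]'; do
        echo "self-heal still has pending entries; waiting..."
        sleep 30
    done

Whether that alone would have avoided the hang reported here is not certain, but it does rule out moving on to the next brick while the previous one is still healing.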