Hi,

I found the coredump file, but it's a 15 MB file (zipped), so I can't post
it on this mailing list. Here are some parts of the report:

> ProblemType: Crash
> Architecture: amd64
> Date: Sun Jun 26 11:27:44 2016
> DistroRelease: Ubuntu 14.04
> ExecutablePath: /usr/sbin/glusterfsd
> ExecutableTimestamp: 1460982898
> ProcCmdline: /usr/sbin/glusterfsd -s nfs05 --volfile-id
> cdn.nfs05.srv-cdn -p /var/lib/glusterd/vols/cdn/run/nfs05-srv-cdn.pid
> -S /var/run/gluster/d52ac3e6c0a3fa316a9e8360976f3af5.socket
> --brick-name /srv/cdn -l /var/log/glusterfs/bricks/srv-cdn.log
> --xlator-option
> *-posix.glusterd-uuid=6af63b78-a3da-459d-a909-c010e6c9072c
> --brick-port 49155 --xlator-option cdn-server.listen-port=49155
> ProcCwd: /
> ProcEnviron:
>  PATH=(custom, no user)
>  TERM=linux
> ProcMaps:
>  7f25f18d9000-7f25f18da000 ---p 00000000 00:00 0
>  7f25f18da000-7f25f19da000 rw-p 00000000 00:00 0    [stack:849]
>  7f25f19da000-7f25f19db000 ---p 00000000 00:00 0
> ...
> ProcStatus:
>  Name: glusterfsd
>  State: D (disk sleep)
>  Tgid: 7879
>  Ngid: 0
>  Pid: 7879
>  PPid: 1
>  TracerPid: 0
>  Uid: 0 0 0 0
>  Gid: 0 0 0 0
>  FDSize: 64
>  Groups: 0
>  VmPeak: 878404 kB
>  VmSize: 878404 kB
>  VmLck: 0 kB
>  VmPin: 0 kB
>  VmHWM: 96104 kB
>  VmRSS: 90652 kB
>  VmData: 792012 kB
>  VmStk: 276 kB
>  VmExe: 84 kB
>  VmLib: 7716 kB
>  VmPTE: 700 kB
>  VmSwap: 20688 kB
>  Threads: 22
>  SigQ: 0/30034
>  SigPnd: 0000000000000000
>  ShdPnd: 0000000000000000
>  SigBlk: 0000000000004a01
>  SigIgn: 0000000000001000
>  SigCgt: 00000001800000fa
>  CapInh: 0000000000000000
>  CapPrm: 0000001fffffffff
>  CapEff: 0000001fffffffff
>  CapBnd: 0000001fffffffff
>  Seccomp: 0
>  Cpus_allowed: 7fff
>  Cpus_allowed_list: 0-14
>  Mems_allowed: 00000000,00000001
>  Mems_allowed_list: 0
>  voluntary_ctxt_switches: 3
>  nonvoluntary_ctxt_switches: 1
> Signal: 11
> Uname: Linux 3.13.0-44-generic x86_64
> UserGroups:
> CoreDump: base64 ...

Yann

On 28/06/2016 09:31, Anoop C S wrote:
> On Mon, 2016-06-27 at 15:05 +0200, Yann LEMARIE wrote:
>> @Anoop,
>>
>> Where can I find the coredump file?
>>
> You will get hints about the crash from entries inside
> /var/log/messages (for example the pid of the process, the location
> of the coredump, etc.).
>
>> The crash occurred twice in the last 7 days, each time on a Sunday
>> morning for no apparent reason, no increase of traffic or anything
>> like that; the volume had been mounted for 15 days.
>>
>> The bricks are used as a kind of CDN, distributing small images and
>> CSS files through an nginx HTTPS service (with a load balancer and
>> 2 EC2 instances); on a Sunday morning there is not a lot of
>> activity ...
>>
> From the very minimal backtrace that we have from the brick logs I
> would assume that a truncate operation was being handled by the trash
> translator and it crashed.
>
>> Volume info:
>>> root at nfs05 /var/log/glusterfs # gluster volume info cdn
>>>
>>> Volume Name: cdn
>>> Type: Replicate
>>> Volume ID: c53b9bae-5e12-4f13-8217-53d8c96c302c
>>> Status: Started
>>> Number of Bricks: 1 x 2 = 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: nfs05:/srv/cdn
>>> Brick2: nfs06:/srv/cdn
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> features.trash: on
>>> features.trash-max-filesize: 20MB
>>
>> I don't know if there is a link with this crash problem, but I have
>> another problem with my 2 servers that makes GlusterFS clients
>> disconnect (from another volume):
>>> Jun 24 02:28:04 nfs05 kernel: [2039468.818617] xen_netfront:
>>> xennet: skb rides the rocket: 19 slots
>>> Jun 24 02:28:11 nfs05 kernel: [2039475.744086] net_ratelimit: 66
>>> callbacks suppressed
>> It seems to be a network interface problem:
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317811
>>
>> Yann
>>
>> On 27/06/2016 12:59, Anoop C S wrote:
>>> On Mon, 2016-06-27 at 09:47 +0200, Yann LEMARIE wrote:
>>>> Hi,
>>>>
>>>> I have been using GlusterFS for many years and have never seen
>>>> this problem, but this is the second time in one week ...
>>>>
>>>> I have 3 volumes with 2 bricks each, and 1 volume crashed for no
>>>> reason;
>>> Did you observe the crash while mounting the volume? Or can you be
>>> more specific on what you were doing just before you saw the
>>> crash? Can you please share the output of
>>> `gluster volume info <VOLNAME>`?
>>>
>>>> I just have to stop/start the volume to bring it up again.
>>>> The only logs I can find are in syslog:
>>>>
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: pending frames:
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: frame : type(0) op(10)
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: patchset:
>>>>> git://git.gluster.com/glusterfs.git
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: signal received: 11
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: time of crash:
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: 2016-06-26 09:27:44
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: configuration details:
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: argp 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: backtrace 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: dlfcn 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: libpthread 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: llistxattr 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: setfsid 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: spinlock 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: epoll.h 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: xattr.h 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: st_atim.tv_nsec 1
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: package-string: glusterfs
>>>>> 3.7.11
>>>>> Jun 26 11:27:44 nfs05 srv-cdn[7879]: ---------
>>>>>
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: pending frames:
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: frame : type(0) op(10)
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: patchset:
>>>>> git://git.gluster.com/glusterfs.git
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: signal received: 11
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: time of crash:
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: 2016-06-26 09:27:44
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: configuration details:
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: argp 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: backtrace 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: dlfcn 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: libpthread 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: llistxattr 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: setfsid 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: spinlock 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: epoll.h 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: xattr.h 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: st_atim.tv_nsec 1
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: package-string: glusterfs
>>>>> 3.7.11
>>>>> Jun 26 11:27:44 nfs06 srv-cdn[1787]: ---------
>>>>
>>>> Thanks for your help
>>>>
>>>> Regards
>>>> --
>>>> Yann Lemarié
>>>> iRaiser - Support Technique
>>>>
>>>> ylemarie at iraiser.eu
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>> --
>> Yann Lemarié
>> iRaiser - Support Technique
>>
>> ylemarie at iraiser.eu
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users

--
Yann Lemarié
iRaiser - Support Technique
<http://www.iraiser.eu>
ylemarie at iraiser.eu
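[Editor's note: the report quoted above is an Ubuntu apport crash record
("ProblemType: Crash" with a base64-encoded "CoreDump:" field), so the raw
core can be recovered from the .crash file under /var/crash instead of being
posted to the list. A minimal sketch, assuming the default apport location
on Ubuntu 14.04; the exact .crash filename below is an assumption and should
be replaced with whatever actually exists on the server:

# ls /var/crash
# apport-unpack /var/crash/_usr_sbin_glusterfsd.0.crash /tmp/glusterfsd-crash

apport-unpack decodes each field of the report into its own file, so the
usable core ends up at /tmp/glusterfsd-crash/CoreDump.]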
On Tue, 2016-06-28 at 10:49 +0200, Yann LEMARIE wrote:
> Hi,
>
> I found the coredump file, but it's a 15 MB file (zipped), so I can't
> post it on this mailing list.
>
Great. In order to pinpoint the exact crash location, can you please
attach gdb to the extracted coredump file and share the complete
backtrace with us by executing the `bt` command in the gdb shell?
Apart from gdb, you may be instructed to install some debug-info
packages to extract a useful backtrace when attaching gdb as follows:

# gdb /usr/sbin/glusterfsd <path-to-coredump-file>

If prompted, install the required packages and reattach the coredump
file. When you are inside the (gdb) prompt, type 'bt' and paste the
backtrace.

> Here are some parts of the report:
>
>> ProblemType: Crash
>> Architecture: amd64
>> Date: Sun Jun 26 11:27:44 2016
>> DistroRelease: Ubuntu 14.04
>> ExecutablePath: /usr/sbin/glusterfsd
>> [...]
>> Signal: 11
>> Uname: Linux 3.13.0-44-generic x86_64
>> UserGroups:
>> CoreDump: base64 ...
>
> Yann
>
> On 28/06/2016 09:31, Anoop C S wrote:
>> [...]
>
> --
> Yann Lemarié
> iRaiser - Support Technique
>
> ylemarie at iraiser.eu
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
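[Editor's note: a minimal sketch of the backtrace workflow described above,
for the Ubuntu 14.04 setup in the report. The glusterfs-dbg package name is
an assumption (gdb itself will name the exact debug packages it wants when
symbols are missing), and the core path follows from the hypothetical
apport-unpack step sketched earlier:

# apt-get install gdb glusterfs-dbg
# gdb /usr/sbin/glusterfsd /tmp/glusterfsd-crash/CoreDump \
      -batch -ex 'bt' -ex 'thread apply all bt' > /tmp/backtrace.txt 2>&1

Running gdb with -batch and -ex avoids the interactive prompt entirely and
captures the backtrace of the crashing thread plus all other threads into
/tmp/backtrace.txt, which is small enough to paste on the list.]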
Hi,

Another crash this morning, in another volume (srv-payment), again a
problem with a file in the ".trashcan" directory, but this time I don't
have any dump file in /var/crash, and something very strange: the volume
seems to come back up by itself 2 or 3 minutes later. How is that
possible? Did it not really crash?

> Jul 1 06:36:19 nfs05 srv-payment[15744]: pending frames:
> Jul 1 06:36:19 nfs05 srv-payment[15744]: frame : type(0) op(10)
> Jul 1 06:36:19 nfs05 srv-payment[15744]: patchset:
> git://git.gluster.com/glusterfs.git
> Jul 1 06:36:19 nfs05 srv-payment[15744]: signal received: 11
> Jul 1 06:36:19 nfs05 srv-payment[15744]: time of crash:
> Jul 1 06:36:19 nfs05 srv-payment[15744]: 2016-07-01 04:36:19
> Jul 1 06:36:19 nfs05 srv-payment[15744]: configuration details:
> Jul 1 06:36:19 nfs05 srv-payment[15744]: argp 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: backtrace 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: dlfcn 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: libpthread 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: llistxattr 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: setfsid 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: spinlock 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: epoll.h 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: xattr.h 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: st_atim.tv_nsec 1
> Jul 1 06:36:19 nfs05 srv-payment[15744]: package-string: glusterfs 3.7.11
> Jul 1 06:36:19 nfs05 srv-payment[15744]: ---------

And from the brick log:

> [2016-07-01 04:20:01.896593] E [MSGID: 113020]
> [posix.c:2651:posix_create] 0-payment-posix: setting gfid on
> /srv/payment/.trashcan//sites/dons.fondationdefrance.org/log/synchroNetful.log_2016-07-01_042001
> failed
> [2016-07-01 04:20:01.896715] E [posix.c:2996:_fill_writev_xdata]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/trash.so(trash_truncate_readv_cbk+0x16c)
> [0x7f95a35a1c4c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/storage/posix.so(posix_writev+0x1dc)
> [0x7f95a3de037c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/storage/posix.so(_fill_writev_xdata+0x1ff)
> [0x7f95a3de016f] ) 0-payment-posix: fd: 0x7f9598005a14 inode:
> 0x7f952ae0a6acgfid:00000000-0000-0000-0000-000000000000 [Invalid argument]
> [2016-07-01 04:20:01.896756] E [posix.c:2996:_fill_writev_xdata]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/trash.so(trash_truncate_readv_cbk+0x16c)
> [0x7f95a35a1c4c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/storage/posix.so(posix_writev+0x1dc)
> [0x7f95a3de037c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/storage/posix.so(_fill_writev_xdata+0x1ff)
> [0x7f95a3de016f] ) 0-payment-posix: fd: 0x7f9598005a14 inode:
> 0x7f952ae0a6acgfid:00000000-0000-0000-0000-000000000000 [Invalid argument]
> [2016-07-01 04:25:53.012016] I [dict.c:473:dict_get]
> (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac)
> [0x7f95a97b417c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7)
> [0x7f95a2028877]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac)
> [0x7f95a97a491c] ) 0-dict: !this || key=() [Invalid argument]
> [2016-07-01 04:25:53.013409] E [MSGID: 113091]
> [posix.c:178:posix_lookup] 0-payment-posix: null gfid for path (null)
> [2016-07-01 04:25:53.013424] E [MSGID: 113018]
> [posix.c:196:posix_lookup] 0-payment-posix: lstat on null failed
> [Invalid argument]
> The message "E [MSGID: 113091] [posix.c:178:posix_lookup]
> 0-payment-posix: null gfid for path (null)" repeated 3 times between
> [2016-07-01 04:25:53.013409] and [2016-07-01 04:25:53.025339]
> The message "E [MSGID: 113018] [posix.c:196:posix_lookup]
> 0-payment-posix: lstat on null failed [Invalid argument]" repeated 3
> times between [2016-07-01 04:25:53.013424] and [2016-07-01
> 04:25:53.025340]
> [2016-07-01 04:35:54.017530] I [dict.c:473:dict_get]
> (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac)
> [0x7f95a97b417c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7)
> [0x7f95a2028877]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac)
> [0x7f95a97a491c] ) 0-dict: !this || key=() [Invalid argument]
> [2016-07-01 04:35:54.019695] E [MSGID: 113091]
> [posix.c:178:posix_lookup] 0-payment-posix: null gfid for path (null)
> [2016-07-01 04:35:54.019710] E [MSGID: 113018]
> [posix.c:196:posix_lookup] 0-payment-posix: lstat on null failed
> [Invalid argument]
> The message "E [MSGID: 113091] [posix.c:178:posix_lookup]
> 0-payment-posix: null gfid for path (null)" repeated 3 times between
> [2016-07-01 04:35:54.019695] and [2016-07-01 04:35:54.027856]
> The message "E [MSGID: 113018] [posix.c:196:posix_lookup]
> 0-payment-posix: lstat on null failed [Invalid argument]" repeated 3
> times between [2016-07-01 04:35:54.019710] and [2016-07-01
> 04:35:54.027857]
> pending frames:
> frame : type(0) op(10)
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash:
> 2016-07-01 04:36:19

On 29/06/2016 08:39, Anoop C S wrote:
> On Tue, 2016-06-28 at 10:49 +0200, Yann LEMARIE wrote:
>> Hi,
>>
>> I found the coredump file, but it's a 15 MB file (zipped), so I can't
>> post it on this mailing list.
>>
> Great. In order to pinpoint the exact crash location, can you please
> attach gdb to the extracted coredump file and share the complete
> backtrace with us by executing the `bt` command in the gdb shell?
> Apart from gdb, you may be instructed to install some debug-info
> packages to extract a useful backtrace when attaching gdb as follows:
>
> # gdb /usr/sbin/glusterfsd <path-to-coredump-file>
>
> If prompted, install the required packages and reattach the coredump
> file. When you are inside the (gdb) prompt, type 'bt' and paste the
> backtrace.
>
> [...]

--
Yann Lemarié
iRaiser - Support Technique
<http://www.iraiser.eu>
ylemarie at iraiser.eu
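[Editor's note: both crashes implicate the trash translator handling a
truncate (features.trash is enabled on the affected volumes, and the brick
log above points straight at trash_truncate_readv_cbk). Until the segfault
is root-caused, a possible mitigation sketch, assuming losing the
trash/undelete protection is acceptable; features.trash is the same volume
option shown in the volume info earlier in the thread:

# gluster volume set payment features.trash off
# gluster volume info payment

The second command only confirms the option change; the same set command
would apply to the cdn volume, which has the same trash configuration.]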