Without the glusterd log file and the core file or the backtrace I can't
comment on anything.

On Wed, Dec 6, 2017 at 3:09 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
> Any suggestion....
>
> On Dec 6, 2017 11:51, "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com> wrote:
>
>> Hi Team,
>>
>> We are getting a crash in glusterd after it starts. When I tried to
>> debug it, the brick logs show the errors below:
>>
>> [2017-12-01 14:10:14.684122] E [MSGID: 100018] [glusterfsd.c:1960:glusterfs_pidfile_update] 0-glusterfsd: pidfile /system/glusterd/vols/c_glusterfs/run/10.32.1.144-opt-lvmdir-c2-brick.pid lock failed [Resource temporarily unavailable]
>> :
>> :
>> :
>> [2017-12-01 14:10:16.862903] E [MSGID: 113001] [posix-helpers.c:1228:posix_fhandle_pair] 0-c_glusterfs-posix: fd=18: key:trusted.bit-rot.version [No space left on device]
>> [2017-12-01 14:10:16.862985] I [MSGID: 115063] [server-rpc-fops.c:1317:server_ftruncate_cbk] 0-c_glusterfs-server: 92: FTRUNCATE 1 (934f08b7-e3b5-4690-84fc-742a4b1fb78b)==> (No space left on device) [No space left on device]
>> [2017-12-01 14:10:16.907037] E [MSGID: 113001] [posix-helpers.c:1228:posix_fhandle_pair] 0-c_glusterfs-posix: fd=17: key:trusted.bit-rot.version [No space left on device]
>> [2017-12-01 14:10:16.907108] I [MSGID: 115063] [server-rpc-fops.c:1317:server_ftruncate_cbk] 0-c_glusterfs-server: 35: FTRUNCATE 0 (109d6537-a1ec-4556-8ce1-04c365c451eb)==> (No space left on device) [No space left on device]
>> [2017-12-01 14:10:16.947541] E [MSGID: 113001] [posix-helpers.c:1228:posix_fhandle_pair] 0-c_glusterfs-posix: fd=17: key:trusted.bit-rot.version [No space left on device]
>> [2017-12-01 14:10:16.947623] I [MSGID: 115063] [server-rpc-fops.c:1317:server_ftruncate_cbk] 0-c_glusterfs-server: 70: FTRUNCATE 0 (8f9c8054-b0d7-4b93-a95b-cd3ab249c56d)==> (No space left on device) [No space left on device]
>> [2017-12-01 14:10:16.968515] E [MSGID: 113001] [posix.c:4616:_posix_remove_xattr] 0-c_glusterfs-posix: removexattr failed on /opt/lvmdir/c2/brick/.glusterfs/00/00/00000000-0000-0000-0000-000000000001/configuration (for trusted.glusterfs.dht) [No space left on device]
>> [2017-12-01 14:10:16.968589] I [MSGID: 115058] [server-rpc-fops.c:740:server_removexattr_cbk] 0-c_glusterfs-server: 90: REMOVEXATTR <gfid:a240d2fd-869c-408d-9b95-62ee1bff074e> (a240d2fd-869c-408d-9b95-62ee1bff074e) of key ==> (No space left on device) [No space left on device]
>> [2017-12-01 14:10:17.039815] E [MSGID: 113001] [posix-helpers.c:1228:posix_fhandle_pair] 0-c_glusterfs-posix: fd=17: key:trusted.bit-rot.version [No space left on device]
>> [2017-12-01 14:10:17.039900] I [MSGID: 115063] [server-rpc-fops.c:1317:server_ftruncate_cbk] 0-c_glusterfs-server: 152: FTRUNCATE 0 (d67bcfcd-ff19-4b58-9823-46d6cce9ace3)==> (No space left on device) [No space left on device]
>> [2017-12-01 14:10:17.048767] E [MSGID: 113001] [posix-helpers.c:1228:posix_fhandle_pair] 0-c_glusterfs-posix: fd=17: key:trusted.bit-rot.version [No space left on device]
>> [2017-12-01 14:10:17.048874] I [MSGID: 115063] [server-rpc-fops.c:1317:server_ftruncate_cbk] 0-c_glusterfs-server: 163: FTRUNCATE 0 (0e3ee6ad-408b-4fcf-a1a7-4262ec113316)==> (No space left on device) [No space left on device]
>> [2017-12-01 14:10:17.075007] E [MSGID: 113001] [posix.c:4616:_posix_remove_xattr] 0-c_glusterfs-posix: removexattr failed on /opt/lvmdir/c2/brick/.glusterfs/00/00/00000000-0000-0000-0000-000000000001/java (for trusted.glusterfs.dht) [No space left on device]
>>
>> Also, we are running low on disk space.
>>
>> Could anyone please explain what glusterd is doing in the brick that
>> could be causing it to crash?
>>
>> Please find the brick logs in the attachment.
>>
>> Thanks in advance!!!
>> --
>> Regards
>> Abhishek Paliwal
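Every operation failing in the brick log above (the trusted.bit-rot.version xattr updates, the trusted.glusterfs.dht removexattr and the FTRUNCATE calls) returns ENOSPC. Since ENOSPC can come either from exhausted data blocks or, on some filesystems, from exhausted inodes, it is worth checking both for the brick filesystem. Below is a minimal illustrative sketch of that check (not GlusterFS code); the path is an assumption taken from the log messages.

/*
 * Illustrative sketch only: report free data blocks and free inodes for
 * the brick filesystem. The brick path is an assumption based on the
 * paths seen in the log above.
 */
#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    const char *brick = "/opt/lvmdir/c2/brick";
    struct statvfs vfs;

    if (statvfs(brick, &vfs) != 0) {
        perror("statvfs");
        return 1;
    }

    printf("free data blocks: %llu of %llu\n",
           (unsigned long long)vfs.f_bavail,
           (unsigned long long)vfs.f_blocks);
    printf("free inodes:      %llu of %llu\n",
           (unsigned long long)vfs.f_favail,
           (unsigned long long)vfs.f_files);
    return 0;
}

The same numbers are reported by df -h and df -i on the brick mount.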
ABHISHEK PALIWAL
2017-Dec-06 09:56 UTC
[Gluster-users] [Gluster-devel] Crash in glusterd!!!
Hi Atin,

Please find the backtrace and log files attached. Also, below is the backtrace (bt) from the core.

(gdb) bt
#0  0x00003fff8834b898 in __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00003fff88350fd0 in __GI_abort () at abort.c:89

[**ALERT: The abort() might not be exactly invoked from the following function line.
          If the trail function contains multiple abort() calls, then you should cross check
          by other means to get correct abort() call location. This is due to the optimized
          compilation which hides the debug info for multiple abort() calls in a given function.
          Refer TR HU16995 for more information]

#2  0x00003fff8838be04 in __libc_message (do_abort=<optimized out>, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:175
#3  0x00003fff8839aba8 in malloc_printerr (action=<optimized out>, str=0x3fff8847e498 "double free or corruption (!prev)", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5007
#4  0x00003fff8839ba40 in _int_free (av=0x3fff6c000020, p=<optimized out>, have_lock=<optimized out>) at malloc.c:3868
#5  0x00003fff885e0814 in __gf_free (free_ptr=0x3fff6c045da0) at mem-pool.c:336
#6  0x00003fff849093c4 in glusterd_friend_sm () at glusterd-sm.c:1295
#7  0x00003fff84901a58 in __glusterd_handle_incoming_unfriend_req (req=0x3fff8481c06c) at glusterd-handler.c:2606
#8  0x00003fff848fb870 in glusterd_big_locked_handler (req=0x3fff8481c06c, actor_fn=@0x3fff84a43e70: 0x3fff84901830 <__glusterd_handle_incoming_unfriend_req>) at glusterd-handler.c:83
#9  0x00003fff848fbd08 in glusterd_handle_incoming_unfriend_req (req=<optimized out>) at glusterd-handler.c:2615
#10 0x00003fff8854e87c in rpcsvc_handle_rpc_call (svc=0x10062fd0 <_GLOBAL__sub_I__ZN27UehChSwitchFachToDchC_ActorC2EP12RTControllerP10RTActorRef()+1148>, trans=<optimized out>, msg=0x3fff6c000920) at rpcsvc.c:705
#11 0x00003fff8854eb7c in rpcsvc_notify (trans=0x3fff74002210, mydata=<optimized out>, event=<optimized out>, data=<optimized out>) at rpcsvc.c:799
#12 0x00003fff885514fc in rpc_transport_notify (this=<optimized out>, event=<optimized out>, data=<optimized out>) at rpc-transport.c:546
#13 0x00003fff847fcd44 in socket_event_poll_in (this=this@entry=0x3fff74002210) at socket.c:2236
#14 0x00003fff847ff89c in socket_event_handler (fd=<optimized out>, idx=<optimized out>, data=0x3fff74002210, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=<optimized out>) at socket.c:2349
#15 0x00003fff88616874 in event_dispatch_epoll_handler (event=0x3fff83d9d6a0, event_pool=0x10045bc0 <_GLOBAL__sub_I__ZN29DrhIfRhControlPdrProxyC_ActorC2EP12RTControllerP10RTActorRef()+116>) at event-epoll.c:575
#16 event_dispatch_epoll_worker (data=0x100bb4a0 <main_thread_func__()+1756>) at event-epoll.c:678
#17 0x00003fff884cfb10 in start_thread (arg=0x3fff83d9e160) at pthread_create.c:339
#18 0x00003fff88419c0c in .__clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:96

(gdb) bt full
#0  0x00003fff8834b898 in __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:55
        r4 = 1560
        r7 = 16
        arg2 = 1560
        r5 = 6
        r8 = 0
        arg3 = 6
        r0 = 250
        r3 = 0
        r6 = 8
        arg1 = 0
        sc_err = <optimized out>
        sc_ret = <optimized out>
        pd = 0x3fff83d9e160
        pid = 0
        selftid = 1560
#1  0x00003fff88350fd0 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
        sigs = {__val = {32, 0 <repeats 15 times>}}

[**ALERT: The abort() might not be exactly invoked from the following function line.
          If the trail function contains multiple abort() calls, then you should cross check
          by other means to get correct abort() call location. This is due to the optimized
          compilation which hides the debug info for multiple abort() calls in a given function.
          Refer TR HU16995 for more information]

#2  0x00003fff8838be04 in __libc_message (do_abort=<optimized out>, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:175
        ap = <optimized out>
        fd = <optimized out>
        on_2 = <optimized out>
        list = <optimized out>
        nlist = <optimized out>
        cp = <optimized out>
        written = <optimized out>
#3  0x00003fff8839aba8 in malloc_printerr (action=<optimized out>, str=0x3fff8847e498 "double free or corruption (!prev)", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5007
        buf = "00003fff6c045d60"
        cp = <optimized out>
        ar_ptr = <optimized out>
        ptr = <optimized out>
        str = 0x3fff8847e498 "double free or corruption (!prev)"
        action = 3
#4  0x00003fff8839ba40 in _int_free (av=0x3fff6c000020, p=<optimized out>, have_lock=<optimized out>) at malloc.c:3868
        size = <optimized out>
        fb = <optimized out>
        nextchunk = <optimized out>
        nextsize = <optimized out>
        nextinuse = <optimized out>
        prevsize = <optimized out>
        bck = <optimized out>
        fwd = <optimized out>
        errstr = <optimized out>
        locked = <optimized out>
        __func__ = "_int_free"
#5  0x00003fff885e0814 in __gf_free (free_ptr=0x3fff6c045da0) at mem-pool.c:336
        ptr = 0x3fff6c045d60
        mem_acct = <optimized out>
        header = 0x3fff6c045d60
        free_ptr = 0x3fff6c045da0
#6  0x00003fff849093c4 in glusterd_friend_sm () at glusterd-sm.c:1295
        event = 0x3fff6c045da0
        tmp = 0x3fff6c045da0
        ret = <optimized out>
        handler = @0x3fff84a44038: 0x3fff84906750 <glusterd_ac_friend_remove>
        state = 0x3fff84a390c0 <glusterd_state_befriended>
        peerinfo = <optimized out>
        event_type = GD_FRIEND_EVENT_REMOVE_FRIEND
        is_await_conn = <optimized out>
        quorum_action = <optimized out>
        old_state = GD_FRIEND_STATE_BEFRIENDED
        this = <optimized out>
        priv = 0x3fff84748050
        __FUNCTION__ = "glusterd_friend_sm"
#7  0x00003fff84901a58 in __glusterd_handle_incoming_unfriend_req (req=0x3fff8481c06c) at glusterd-handler.c:2606
        ret = 0
        friend_req = {uuid = "\231\214R?\177\223I\216\236?\214d??y?", hostname = 0x3fff6c028ef0 "", port = 0, vols = {vols_len = 0, vols_val = 0x0}}
        remote_hostname = "10.32.0.48", '\000' <repeats 98 times>
        __FUNCTION__ = "__glusterd_handle_incoming_unfriend_req"
#8  0x00003fff848fb870 in glusterd_big_locked_handler (req=0x3fff8481c06c, actor_fn=@0x3fff84a43e70: 0x3fff84901830 <__glusterd_handle_incoming_unfriend_req>) at glusterd-handler.c:83
        priv = 0x3fff84748050
        ret = -1
#9  0x00003fff848fbd08 in glusterd_handle_incoming_unfriend_req (req=<optimized out>) at glusterd-handler.c:2615
No locals.
#10 0x00003fff8854e87c in rpcsvc_handle_rpc_call (svc=0x10062fd0 <_GLOBAL__sub_I__ZN27UehChSwitchFachToDchC_ActorC2EP12RTControllerP10RTActorRef()+1148>, trans=<optimized out>, msg=0x3fff6c000920) at rpcsvc.c:705
        actor = 0x3fff84a38860 <gd_svc_peer_actors+192>
        actor_fn = @0x3fff84a43ab0: 0x3fff848fbcf0 <glusterd_handle_incoming_unfriend_req>
        req = 0x3fff8481c06c
        ret = -1
        port = <optimized out>
        unprivileged = <optimized out>
        reply = <optimized out>
        drc = <optimized out>
        __FUNCTION__ = "rpcsvc_handle_rpc_call"
#11 0x00003fff8854eb7c in rpcsvc_notify (trans=0x3fff74002210, mydata=<optimized out>, event=<optimized out>, data=<optimized out>) at rpcsvc.c:799
        ret = -1
        msg = <optimized out>
        new_trans = 0x0
        svc = <optimized out>
        listener = 0x0
        __FUNCTION__ = "rpcsvc_notify"
#12 0x00003fff885514fc in rpc_transport_notify (this=<optimized out>, event=<optimized out>, data=<optimized out>) at rpc-transport.c:546
        ret = -1
        __FUNCTION__ = "rpc_transport_notify"
#13 0x00003fff847fcd44 in socket_event_poll_in (this=this@entry=0x3fff74002210) at socket.c:2236
        ret = <optimized out>
        pollin = 0x3fff6c000920
        priv = 0x3fff74002d50
#14 0x00003fff847ff89c in socket_event_handler (fd=<optimized out>, idx=<optimized out>, data=0x3fff74002210, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=<optimized out>) at socket.c:2349
        this = 0x3fff74002210
        priv = 0x3fff74002d50
        ret = <optimized out>
        __FUNCTION__ = "socket_event_handler"
#15 0x00003fff88616874 in event_dispatch_epoll_handler (event=0x3fff83d9d6a0, event_pool=0x10045bc0 <_GLOBAL__sub_I__ZN29DrhIfRhControlPdrProxyC_ActorC2EP12RTControllerP10RTActorRef()+116>) at event-epoll.c:575
        handler = @0x3fff8481a620: 0x3fff847ff6f0 <socket_event_handler>
        gen = 1
        slot = 0x100803f0 <_GLOBAL__sub_I__ZN24RoamIfFroRrcRoExtAttribDC2Ev()+232>
        data = <optimized out>
        ret = -1
        fd = 8
        ev_data = 0x3fff83d9d6a8
        idx = 7
#16 event_dispatch_epoll_worker (data=0x100bb4a0 <main_thread_func__()+1756>) at event-epoll.c:678
        event = {events = 1, data = {ptr = 0x700000001, fd = 7, u32 = 7, u64 = 30064771073}}
        ret = <optimized out>
        ev_data = 0x100bb4a0 <main_thread_func__()+1756>
        event_pool = 0x10045bc0 <_GLOBAL__sub_I__ZN29DrhIfRhControlPdrProxyC_ActorC2EP12RTControllerP10RTActorRef()+116>
        myindex = <optimized out>
        timetodie = 0
        __FUNCTION__ = "event_dispatch_epoll_worker"
#17 0x00003fff884cfb10 in start_thread (arg=0x3fff83d9e160) at pthread_create.c:339
        pd = 0x3fff83d9e160
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-6868946778599096053, 70366736145408, -6868946778421678961, 0, 0, 70366652919808, 70366661304864, 8388608, 70366736105504, 269202592, 70367897957648, 70366736131032, 70366737568040, 3, 0, 70366736131048, 70367897957296, 70367897957352, 4001536, 70366736106520, 70366661302080, -3187654076, 0 <repeats 42 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#18 0x00003fff88419c0c in .__clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:96
No locals.

Regards,
Abhishek

On Wed, Dec 6, 2017 at 3:21 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> Without the glusterd log file and the core file or the backtrace I can't
> comment on anything.
--
Regards
Abhishek Paliwal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glusterfs.7z
Type: application/x-7z-compressed
Size: 148198 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20171206/15a9aed6/attachment-0001.bin>
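A note on what the backtrace shows: frames #1-#4 are glibc aborting with "double free or corruption (!prev)", and frame #5 is __gf_free() releasing the friend event that glusterd_friend_sm() (glusterd-sm.c:1295) had just run through glusterd_ac_friend_remove for a GD_FRIEND_EVENT_REMOVE_FRIEND (see the frame #6 locals). In other words, by the time the state machine freed that event, the same heap block had already been freed, or its chunk header had been corrupted, elsewhere. The snippet below is a minimal, self-contained illustration of that failure mode; it is not GlusterFS code, and every name in it is hypothetical.

/*
 * Hypothetical illustration of the failure mode reported in the backtrace:
 * the same heap block is passed to free() twice, and glibc's allocator
 * detects the inconsistent chunk metadata and abort()s. The exact message
 * ("double free or corruption (!prev)", "double free detected", ...)
 * depends on chunk size and glibc version. None of these names come from
 * GlusterFS.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct friend_event {
    char peer[128];
};

/* Handler that takes ownership of the event and frees it itself. */
static void handle_remove_friend(struct friend_event *ev)
{
    printf("removing friend %s\n", ev->peer);
    free(ev);                          /* first free */
}

int main(void)
{
    struct friend_event *ev = calloc(1, sizeof(*ev));
    if (ev == NULL)
        return 1;
    strncpy(ev->peer, "10.32.0.48", sizeof(ev->peer) - 1);

    handle_remove_friend(ev);          /* event is freed here ... */

    /* ... and the caller's cleanup path frees it again, believing it
     * still owns the event. glibc detects the already-freed chunk inside
     * _int_free() and aborts, as in frames #1-#4 of the core. */
    free(ev);

    return 0;
}

Whether the duplicate free in glusterd comes from the handler path, from a concurrent thread, or from the same event being queued twice cannot be determined from the backtrace alone, which is why the glusterd log file and the core were requested.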
ABHISHEK PALIWAL
2017-Dec-06 12:09 UTC
[Gluster-users] [Gluster-devel] Crash in glusterd!!!
I hope these logs are sufficient... please let me know if you require more logs.

On Wed, Dec 6, 2017 at 3:26 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
> Hi Atin,
>
> Please find the backtrace and log files attached.
--
Regards
Abhishek Paliwal