Here are the requested logs:
https://www.dropbox.com/s/vt187h0gtu5doip/gluster_logs_20_40_80_servers.zip?dl=0

On Tue, Aug 29, 2017 at 7:48 AM, Gaurav Yadav <gyadav at redhat.com> wrote:
> Till now I haven't found anything significant.
>
> Can you send me gluster logs along with command-history-logs for these
> scenarios:
> Scenario1 : 20 servers
> Scenario2 : 40 servers
> Scenario3: 80 Servers
>
> Thanks
> Gaurav
>
> On Mon, Aug 28, 2017 at 11:22 AM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>
>> Hi Gaurav,
>> Any progress about the problem?
>>
>> On Thursday, August 24, 2017, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>>
>>> Thank you Gaurav,
>>> Here is more findings:
>>> Problem does not happen using only 20 servers each has 68 bricks.
>>> (peer probe only 20 servers)
>>> If we use 40 servers with single volume, glusterd cpu %100 state
>>> continues for 5 minutes and it goes to normal state.
>>> with 80 servers we have no working state yet...
>>>
>>> On Thu, Aug 24, 2017 at 1:33 PM, Gaurav Yadav <gyadav at redhat.com> wrote:
>>> >
>>> > I am working on it and will share my findings as soon as possible.
>>> >
>>> > Thanks
>>> > Gaurav
>>> >
>>> > On Thu, Aug 24, 2017 at 3:58 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >>
>>> >> Restarting glusterd causes the same thing. I tried with 3.12.rc0,
>>> >> 3.10.5. 3.8.15, 3.7.20 all same behavior.
>>> >> My OS is centos 6.9, I tried with centos 6.8 problem remains...
>>> >> Only way to a healthy state is destroy gluster config/rpms, reinstall
>>> >> and recreate volumes.
>>> >>
>>> >> On Thu, Aug 24, 2017 at 8:49 AM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >> > Here you can find 10 stack trace samples from glusterd. I wait 10
>>> >> > seconds between each trace.
>>> >> > https://www.dropbox.com/s/9f36goq5xn3p1yt/glusterd_pstack.zip?dl=0
>>> >> >
>>> >> > Content of the first stack trace is here:
>>> >> >
>>> >> > Thread 8 (Thread 0x7f7a8cd4e700 (LWP 43069)):
>>> >> > #0 0x0000003aa5c0f00d in nanosleep () from /lib64/libpthread.so.0
>>> >> > #1 0x000000303f837d57 in ?? () from /usr/lib64/libglusterfs.so.0
>>> >> > #2 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 7 (Thread 0x7f7a8c34d700 (LWP 43070)):
>>> >> > #0 0x0000003aa5c0f585 in sigwait () from /lib64/libpthread.so.0
>>> >> > #1 0x000000000040643b in glusterfs_sigwaiter ()
>>> >> > #2 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 6 (Thread 0x7f7a8b94c700 (LWP 43071)):
>>> >> > #0 0x0000003aa58acc4d in nanosleep () from /lib64/libc.so.6
>>> >> > #1 0x0000003aa58acac0 in sleep () from /lib64/libc.so.6
>>> >> > #2 0x000000303f8528fb in pool_sweeper () from /usr/lib64/libglusterfs.so.0
>>> >> > #3 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 5 (Thread 0x7f7a8af4b700 (LWP 43072)):
>>> >> > #0 0x0000003aa5c0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>>> >> > #1 0x000000303f864afc in syncenv_task () from /usr/lib64/libglusterfs.so.0
>>> >> > #2 0x000000303f8729f0 in syncenv_processor () from /usr/lib64/libglusterfs.so.0
>>> >> > #3 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 4 (Thread 0x7f7a8a54a700 (LWP 43073)):
>>> >> > #0 0x0000003aa5c0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>>> >> > #1 0x000000303f864afc in syncenv_task () from /usr/lib64/libglusterfs.so.0
>>> >> > #2 0x000000303f8729f0 in syncenv_processor () from /usr/lib64/libglusterfs.so.0
>>> >> > #3 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 3 (Thread 0x7f7a886ac700 (LWP 43075)):
>>> >> > #0 0x0000003aa5c0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>>> >> > #1 0x00007f7a898a099b in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #2 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 2 (Thread 0x7f7a87cab700 (LWP 43076)):
>>> >> > #0 0x0000003aa5928692 in __strcmp_sse42 () from /lib64/libc.so.6
>>> >> > #1 0x000000303f82244a in ?? () from /usr/lib64/libglusterfs.so.0
>>> >> > #2 0x000000303f82433d in ?? () from /usr/lib64/libglusterfs.so.0
>>> >> > #3 0x000000303f8245f5 in dict_set () from /usr/lib64/libglusterfs.so.0
>>> >> > #4 0x000000303f82524c in dict_set_str () from /usr/lib64/libglusterfs.so.0
>>> >> > #5 0x00007f7a898da7fd in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #6 0x00007f7a8981b0df in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #7 0x00007f7a8981b47c in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #8 0x00007f7a89831edf in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #9 0x00007f7a897f28f7 in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #10 0x00007f7a897f0bb9 in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #11 0x00007f7a8984c89a in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #12 0x00007f7a898323ee in ?? () from /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so
>>> >> > #13 0x000000303f40fad5 in rpc_clnt_handle_reply () from /usr/lib64/libgfrpc.so.0
>>> >> > #14 0x000000303f410c85 in rpc_clnt_notify () from /usr/lib64/libgfrpc.so.0
>>> >> > #15 0x000000303f40bd68 in rpc_transport_notify () from /usr/lib64/libgfrpc.so.0
>>> >> > #16 0x00007f7a88a6fccd in ?? () from /usr/lib64/glusterfs/3.10.5/rpc-transport/socket.so
>>> >> > #17 0x00007f7a88a70ffe in ?? () from /usr/lib64/glusterfs/3.10.5/rpc-transport/socket.so
>>> >> > #18 0x000000303f887806 in ?? () from /usr/lib64/libglusterfs.so.0
>>> >> > #19 0x0000003aa5c07aa1 in start_thread () from /lib64/libpthread.so.0
>>> >> > #20 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6
>>> >> > Thread 1 (Thread 0x7f7a93844740 (LWP 43068)):
>>> >> > #0 0x0000003aa5c082fd in pthread_join () from /lib64/libpthread.so.0
>>> >> > #1 0x000000303f8872d5 in ?? () from /usr/lib64/libglusterfs.so.0
>>> >> > #2 0x0000000000409020 in main ()
>>> >> >
>>> >> > On Wed, Aug 23, 2017 at 8:46 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>> >> >> Could you be able to provide the pstack dump of the glusterd
>>> >> >> process?
>>> >> >>
>>> >> >> On Wed, 23 Aug 2017 at 20:22, Atin Mukherjee <amukherj at redhat.com> wrote:
>>> >> >>>
>>> >> >>> Not yet. Gaurav will be taking a look at it tomorrow.
>>> >> >>>
>>> >> >>> On Wed, 23 Aug 2017 at 20:14, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >> >>>>
>>> >> >>>> Hi Atin,
>>> >> >>>>
>>> >> >>>> Do you have time to check the logs?
>>> >> >>>>
>>> >> >>>> On Wed, Aug 23, 2017 at 10:02 AM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >> >>>> > Same thing happens with 3.12.rc0.
>>> >> >>>> > This time perf top shows hanging in
>>> >> >>>> > libglusterfs.so and below is the glusterd logs, which are different
>>> >> >>>> > from 3.10.
>>> >> >>>> > With 3.10.5, after 60-70 minutes CPU usage becomes normal and we see
>>> >> >>>> > brick processes come online and system starts to answer commands like
>>> >> >>>> > "gluster peer status"..
>>> >> >>>> >
>>> >> >>>> > [2017-08-23 06:46:02.150472] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.152181] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.152287] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.153503] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.153647] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.153866] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.153948] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154018] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154108] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154162] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154250] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154322] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154425] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154494] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154575] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154649] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154705] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154774] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154852] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154903] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.154995] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.155052] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:02.155141] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:27.074052] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> > [2017-08-23 06:46:27.077034] E [client_t.c:324:gf_client_ref] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1] -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c] -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
>>> >> >>>> >
>>> >> >>>> > On Tue, Aug 22, 2017 at 7:00 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >> >>>> >> I reboot multiple times, also I destroyed the gluster configuration
>>> >> >>>> >> and recreate multiple times. The behavior is same.
>>> >> >>>> >>
>>> >> >>>> >> On Tue, Aug 22, 2017 at 6:47 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>> >> >>>> >>> My guess is there is a corruption in vol list or peer list which has lead
>>> >> >>>> >>> glusterd to get into a infinite loop of traversing a peer/volume list and
>>> >> >>>> >>> CPU to hog up. Again this is a guess and I've not got a chance to take a
>>> >> >>>> >>> detail look at the logs and the strace output.
>>> >> >>>> >>>
>>> >> >>>> >>> I believe if you get to reboot the node again the problem will disappear.
>>> >> >>>> >>>
>>> >> >>>> >>> On Tue, 22 Aug 2017 at 20:07, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >> >>>> >>>>
>>> >> >>>> >>>> As an addition perf top shows %80 libc-2.12.so __strcmp_sse42 during
>>> >> >>>> >>>> glusterd %100 cpu usage
>>> >> >>>> >>>> Hope this helps...
>>> >> >>>> >>>>
>>> >> >>>> >>>> On Tue, Aug 22, 2017 at 2:41 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> >> >>>> >>>> > Hi there,
>>> >> >>>> >>>> >
>>> >> >>>> >>>> > I have a strange problem.
>>> >> >>>> >>>> > Gluster version in 3.10.5, I am testing new servers. Gluster
>>> >> >>>> >>>> > configuration is 16+4 EC, I have three volumes, each have 1600 bricks.
>>> >> >>>> >>>> > I can successfully create the cluster and volumes without any
>>> >> >>>> >>>> > problems. I write data to cluster from 100 clients for 12 hours again
>>> >> >>>> >>>> > no problem. But when I try to reboot a node, glusterd process hangs on
>>> >> >>>> >>>> > %100 CPU usage and seems to do nothing, no brick processes come
>>> >> >>>> >>>> > online. You can find strace of glusterd process for 1 minutes here:
>>> >> >>>> >>>> >
>>> >> >>>> >>>> > https://www.dropbox.com/s/c7bxfnbqxze1yus/gluster_strace.out?dl=0
>>> >> >>>> >>>> >
>>> >> >>>> >>>> > Here is the glusterd logs:
>>> >> >>>> >>>> >
>>> >> >>>> >>>> > https://www.dropbox.com/s/hkstb3mdeil9a5u/glusterd.log?dl=0
>>> >> >>>> >>>> >
>>> >> >>>> >>>> > By the way, reboot of one server completes without problem if I reboot
>>> >> >>>> >>>> > the servers before creating any volumes.
>>> >> >>>> >>>> _______________________________________________
>>> >> >>>> >>>> Gluster-users mailing list
>>> >> >>>> >>>> Gluster-users at gluster.org
>>> >> >>>> >>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>> >> >>>> >>>
>>> >> >>>> >>> --
>>> >> >>>> >>> - Atin (atinm)
>>> >> >>>
>>> >> >>> --
>>> >> >>> - Atin (atinm)
>>> >> >>
>>> >> >> --
>>> >> >> - Atin (atinm)
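The glusterd log excerpt quoted in this thread repeats one `gf_client_ref` error many times with only the timestamp changing. A minimal sketch, assuming the log file holds one entry per line in the layout shown in that excerpt, of how identical entries could be counted so that the distinct messages stand out. The helper name `summarize_log` and the shortened sample entries are illustrative assumptions and are not part of Gluster.

```python
import re
from collections import Counter

# Assumed entry layout, taken from the excerpt quoted in this thread:
#   [YYYY-MM-DD HH:MM:SS.ffffff] LEVEL [file:line:function] message
ENTRY_RE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<origin>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def summarize_log(lines):
    """Count entries that are identical apart from their timestamp.

    Hypothetical helper for illustration; it is not part of Gluster.
    """
    counts = Counter()
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # ignore anything that does not look like a log entry
        counts[(match["level"], match["origin"], match["message"])] += 1
    return counts


if __name__ == "__main__":
    # Sample entries shortened for readability: the backtrace part is omitted.
    sample = [
        "[2017-08-23 06:46:02.150472] E [client_t.c:324:gf_client_ref] "
        "0-client_t: null client [Invalid argument]",
        "[2017-08-23 06:46:02.152181] E [client_t.c:324:gf_client_ref] "
        "0-client_t: null client [Invalid argument]",
    ]
    for (level, origin, message), count in summarize_log(sample).most_common():
        print(f"{count:6d}  {level} [{origin}] {message}")
```

To run it against a real file, pass an open file object, for example `summarize_log(open(path))`, where `path` is wherever the glusterd log lives on the node in question.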
I believe logs you have shared logs which consist of create volume followed by starting the volume. However, you have mentioned that when a node from 80 server cluster gets rebooted, glusterd process hangs. Could you please provide the logs which led glusterd to hang for all the cases along with gusterd process utilization. Thanks Gaurav On Tue, Aug 29, 2017 at 2:44 PM, Serkan ?oban <cobanserkan at gmail.com> wrote:> Here is the requested logs: > https://www.dropbox.com/s/vt187h0gtu5doip/gluster_logs_ > 20_40_80_servers.zip?dl=0 > > > On Tue, Aug 29, 2017 at 7:48 AM, Gaurav Yadav <gyadav at redhat.com> wrote: > > Till now I haven't found anything significant. > > > > Can you send me gluster logs along with command-history-logs for these > > scenarios: > > Scenario1 : 20 servers > > Scenario2 : 40 servers > > Scenario3: 80 Servers > > > > > > Thanks > > Gaurav > > > > > > > > On Mon, Aug 28, 2017 at 11:22 AM, Serkan ?oban <cobanserkan at gmail.com> > > wrote: > >> > >> Hi Gaurav, > >> Any progress about the problem? > >> > >> On Thursday, August 24, 2017, Serkan ?oban <cobanserkan at gmail.com> > wrote: > >>> > >>> Thank you Gaurav, > >>> Here is more findings: > >>> Problem does not happen using only 20 servers each has 68 bricks. > >>> (peer probe only 20 servers) > >>> If we use 40 servers with single volume, glusterd cpu %100 state > >>> continues for 5 minutes and it goes to normal state. > >>> with 80 servers we have no working state yet... > >>> > >>> On Thu, Aug 24, 2017 at 1:33 PM, Gaurav Yadav <gyadav at redhat.com> > wrote: > >>> > > >>> > I am working on it and will share my findings as soon as possible. > >>> > > >>> > > >>> > Thanks > >>> > Gaurav > >>> > > >>> > On Thu, Aug 24, 2017 at 3:58 PM, Serkan ?oban <cobanserkan at gmail.com > > > >>> > wrote: > >>> >> > >>> >> Restarting glusterd causes the same thing. I tried with 3.12.rc0, > >>> >> 3.10.5. 3.8.15, 3.7.20 all same behavior. 
> >>> >> My OS is centos 6.9, I tried with centos 6.8 problem remains... > >>> >> Only way to a healthy state is destroy gluster config/rpms, > reinstall > >>> >> and recreate volumes. > >>> >> > >>> >> On Thu, Aug 24, 2017 at 8:49 AM, Serkan ?oban < > cobanserkan at gmail.com> > >>> >> wrote: > >>> >> > Here you can find 10 stack trace samples from glusterd. I wait 10 > >>> >> > seconds between each trace. > >>> >> > https://www.dropbox.com/s/9f36goq5xn3p1yt/glusterd_ > pstack.zip?dl=0 > >>> >> > > >>> >> > Content of the first stack trace is here: > >>> >> > > >>> >> > Thread 8 (Thread 0x7f7a8cd4e700 (LWP 43069)): > >>> >> > #0 0x0000003aa5c0f00d in nanosleep () from /lib64/libpthread.so.0 > >>> >> > #1 0x000000303f837d57 in ?? () from /usr/lib64/libglusterfs.so.0 > >>> >> > #2 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 7 (Thread 0x7f7a8c34d700 (LWP 43070)): > >>> >> > #0 0x0000003aa5c0f585 in sigwait () from /lib64/libpthread.so.0 > >>> >> > #1 0x000000000040643b in glusterfs_sigwaiter () > >>> >> > #2 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 6 (Thread 0x7f7a8b94c700 (LWP 43071)): > >>> >> > #0 0x0000003aa58acc4d in nanosleep () from /lib64/libc.so.6 > >>> >> > #1 0x0000003aa58acac0 in sleep () from /lib64/libc.so.6 > >>> >> > #2 0x000000303f8528fb in pool_sweeper () from > >>> >> > /usr/lib64/libglusterfs.so.0 > >>> >> > #3 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 5 (Thread 0x7f7a8af4b700 (LWP 43072)): > >>> >> > #0 0x0000003aa5c0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () > >>> >> > from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #1 0x000000303f864afc in syncenv_task () from > >>> >> > 
/usr/lib64/libglusterfs.so.0 > >>> >> > #2 0x000000303f8729f0 in syncenv_processor () from > >>> >> > /usr/lib64/libglusterfs.so.0 > >>> >> > #3 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 4 (Thread 0x7f7a8a54a700 (LWP 43073)): > >>> >> > #0 0x0000003aa5c0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () > >>> >> > from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #1 0x000000303f864afc in syncenv_task () from > >>> >> > /usr/lib64/libglusterfs.so.0 > >>> >> > #2 0x000000303f8729f0 in syncenv_processor () from > >>> >> > /usr/lib64/libglusterfs.so.0 > >>> >> > #3 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 3 (Thread 0x7f7a886ac700 (LWP 43075)): > >>> >> > #0 0x0000003aa5c0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #1 0x00007f7a898a099b in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #2 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 2 (Thread 0x7f7a87cab700 (LWP 43076)): > >>> >> > #0 0x0000003aa5928692 in __strcmp_sse42 () from /lib64/libc.so.6 > >>> >> > #1 0x000000303f82244a in ?? () from /usr/lib64/libglusterfs.so.0 > >>> >> > #2 0x000000303f82433d in ?? () from /usr/lib64/libglusterfs.so.0 > >>> >> > #3 0x000000303f8245f5 in dict_set () from > >>> >> > /usr/lib64/libglusterfs.so.0 > >>> >> > #4 0x000000303f82524c in dict_set_str () from > >>> >> > /usr/lib64/libglusterfs.so.0 > >>> >> > #5 0x00007f7a898da7fd in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #6 0x00007f7a8981b0df in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #7 0x00007f7a8981b47c in ?? 
() from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #8 0x00007f7a89831edf in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #9 0x00007f7a897f28f7 in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #10 0x00007f7a897f0bb9 in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #11 0x00007f7a8984c89a in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #12 0x00007f7a898323ee in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so > >>> >> > #13 0x000000303f40fad5 in rpc_clnt_handle_reply () from > >>> >> > /usr/lib64/libgfrpc.so.0 > >>> >> > #14 0x000000303f410c85 in rpc_clnt_notify () from > >>> >> > /usr/lib64/libgfrpc.so.0 > >>> >> > #15 0x000000303f40bd68 in rpc_transport_notify () from > >>> >> > /usr/lib64/libgfrpc.so.0 > >>> >> > #16 0x00007f7a88a6fccd in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/rpc-transport/socket.so > >>> >> > #17 0x00007f7a88a70ffe in ?? () from > >>> >> > /usr/lib64/glusterfs/3.10.5/rpc-transport/socket.so > >>> >> > #18 0x000000303f887806 in ?? () from /usr/lib64/libglusterfs.so.0 > >>> >> > #19 0x0000003aa5c07aa1 in start_thread () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #20 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 > >>> >> > Thread 1 (Thread 0x7f7a93844740 (LWP 43068)): > >>> >> > #0 0x0000003aa5c082fd in pthread_join () from > >>> >> > /lib64/libpthread.so.0 > >>> >> > #1 0x000000303f8872d5 in ?? () from /usr/lib64/libglusterfs.so.0 > >>> >> > #2 0x0000000000409020 in main () > >>> >> > > >>> >> > On Wed, Aug 23, 2017 at 8:46 PM, Atin Mukherjee > >>> >> > <amukherj at redhat.com> > >>> >> > wrote: > >>> >> >> Could you be able to provide the pstack dump of the glusterd > >>> >> >> process? 
> >>> >> >> > >>> >> >> On Wed, 23 Aug 2017 at 20:22, Atin Mukherjee < > amukherj at redhat.com> > >>> >> >> wrote: > >>> >> >>> > >>> >> >>> Not yet. Gaurav will be taking a look at it tomorrow. > >>> >> >>> > >>> >> >>> On Wed, 23 Aug 2017 at 20:14, Serkan ?oban < > cobanserkan at gmail.com> > >>> >> >>> wrote: > >>> >> >>>> > >>> >> >>>> Hi Atin, > >>> >> >>>> > >>> >> >>>> Do you have time to check the logs? > >>> >> >>>> > >>> >> >>>> On Wed, Aug 23, 2017 at 10:02 AM, Serkan ?oban > >>> >> >>>> <cobanserkan at gmail.com> > >>> >> >>>> wrote: > >>> >> >>>> > Same thing happens with 3.12.rc0. This time perf top shows > >>> >> >>>> > hanging > >>> >> >>>> > in > >>> >> >>>> > libglusterfs.so and below is the glusterd logs, which are > >>> >> >>>> > different > >>> >> >>>> > from 3.10. > >>> >> >>>> > With 3.10.5, after 60-70 minutes CPU usage becomes normal and > >>> >> >>>> > we > >>> >> >>>> > see > >>> >> >>>> > brick processes come online and system starts to answer > >>> >> >>>> > commands > >>> >> >>>> > like > >>> >> >>>> > "gluster peer status".. 
> >>> >> >>>> >
> >>> >> >>>> > [2017-08-23 06:46:02.150472] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.152181] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.152287] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.153503] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.153647] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.153866] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.153948] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154018] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154108] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154162] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154250] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154322] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154425] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154494] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154575] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154649] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154705] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154774] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154852] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154903] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.154995] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.155052] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:02.155141] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:27.074052] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> > [2017-08-23 06:46:27.077034] E [client_t.c:324:gf_client_ref]
> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) [0x7f5ae2c091b1]
> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) [0x7f5ae2c0851c]
> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid argument]
> >>> >> >>>> >
> >>> >> >>>> > On Tue, Aug 22, 2017 at 7:00 PM, Serkan Çoban
> >>> >> >>>> > <cobanserkan at gmail.com> wrote:
> >>> >> >>>> >> I rebooted multiple times; I also destroyed the gluster
> >>> >> >>>> >> configuration and recreated it multiple times. The behavior is
> >>> >> >>>> >> the same.
> >>> >> >>>> >>
> >>> >> >>>> >> On Tue, Aug 22, 2017 at 6:47 PM, Atin Mukherjee
> >>> >> >>>> >> <amukherj at redhat.com> wrote:
> >>> >> >>>> >>> My guess is that there is corruption in the vol list or peer
> >>> >> >>>> >>> list, which has led glusterd into an infinite loop of traversing
> >>> >> >>>> >>> a peer/volume list and hogging the CPU. Again, this is a guess;
> >>> >> >>>> >>> I have not yet had a chance to take a detailed look at the logs
> >>> >> >>>> >>> and the strace output.
> >>> >> >>>> >>>
> >>> >> >>>> >>> I believe that if you reboot the node again the problem will
> >>> >> >>>> >>> disappear.
> >>> >> >>>> >>>
> >>> >> >>>> >>> On Tue, 22 Aug 2017 at 20:07, Serkan Çoban
> >>> >> >>>> >>> <cobanserkan at gmail.com> wrote:
> >>> >> >>>> >>>>
> >>> >> >>>> >>>> In addition, perf top shows 80% in libc-2.12.so __strcmp_sse42
> >>> >> >>>> >>>> while glusterd is at 100% CPU usage.
> >>> >> >>>> >>>> Hope this helps...
> >>> >> >>>> >>>>
> >>> >> >>>> >>>> On Tue, Aug 22, 2017 at 2:41 PM, Serkan Çoban
> >>> >> >>>> >>>> <cobanserkan at gmail.com> wrote:
> >>> >> >>>> >>>> > Hi there,
> >>> >> >>>> >>>> >
> >>> >> >>>> >>>> > I have a strange problem.
> >>> >> >>>> >>>> > Gluster version is 3.10.5; I am testing new servers. The Gluster
> >>> >> >>>> >>>> > configuration is 16+4 EC. I have three volumes, each with 1600
> >>> >> >>>> >>>> > bricks.
> >>> >> >>>> >>>> > I can successfully create the cluster and volumes without any
> >>> >> >>>> >>>> > problems. I write data to the cluster from 100 clients for 12
> >>> >> >>>> >>>> > hours, again with no problem. But when I try to reboot a node,
> >>> >> >>>> >>>> > the glusterd process hangs at 100% CPU usage and seems to do
> >>> >> >>>> >>>> > nothing; no brick processes come online. You can find an strace
> >>> >> >>>> >>>> > of the glusterd process for 1 minute here:
> >>> >> >>>> >>>> >
> >>> >> >>>> >>>> > https://www.dropbox.com/s/c7bxfnbqxze1yus/gluster_strace.out?dl=0
> >>> >> >>>> >>>> >
> >>> >> >>>> >>>> > Here are the glusterd logs:
> >>> >> >>>> >>>> >
> >>> >> >>>> >>>> > https://www.dropbox.com/s/hkstb3mdeil9a5u/glusterd.log?dl=0
> >>> >> >>>> >>>> >
> >>> >> >>>> >>>> > By the way, rebooting one server completes without problems if I
> >>> >> >>>> >>>> > reboot the servers before creating any volumes.
> >>> >> >>>> >>>> _______________________________________________
> >>> >> >>>> >>>> Gluster-users mailing list
> >>> >> >>>> >>>> Gluster-users at gluster.org
> >>> >> >>>> >>>> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>> >> >>>> >>>
> >>> >> >>>> >>> --
> >>> >> >>>> >>> - Atin (atinm)
> >>> >> >>>
> >>> >> >>> --
> >>> >> >>> - Atin (atinm)
> >>> >> >>
> >>> >> >> --
> >>> >> >> - Atin (atinm)
Here are the logs after stopping all three volumes and restarting glusterd on all nodes. I waited 70 minutes after the glusterd restart, but it is still consuming 100% CPU.

https://www.dropbox.com/s/pzl0f198v03twx3/80servers_after_glusterd_restart.zip?dl=0

On Tue, Aug 29, 2017 at 12:37 PM, Gaurav Yadav <gyadav at redhat.com> wrote:
>
> I believe the logs you have shared consist of creating the volume followed
> by starting the volume.
> However, you have mentioned that when a node from the 80-server cluster gets
> rebooted, the glusterd process hangs.
>
> Could you please provide the logs which led glusterd to hang for all the
> cases, along with glusterd process utilization?
>
> Thanks
> Gaurav
>
> On Tue, Aug 29, 2017 at 2:44 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>
>> Here are the requested logs:
>>
>> https://www.dropbox.com/s/vt187h0gtu5doip/gluster_logs_20_40_80_servers.zip?dl=0
>>
>> On Tue, Aug 29, 2017 at 7:48 AM, Gaurav Yadav <gyadav at redhat.com> wrote:
>> > Till now I haven't found anything significant.
>> >
>> > Can you send me gluster logs along with command-history-logs for these
>> > scenarios:
>> > Scenario 1: 20 servers
>> > Scenario 2: 40 servers
>> > Scenario 3: 80 servers
>> >
>> > Thanks
>> > Gaurav
>> >
>> > On Mon, Aug 28, 2017 at 11:22 AM, Serkan Çoban <cobanserkan at gmail.com>
>> > wrote:
>> >>
>> >> Hi Gaurav,
>> >> Any progress on the problem?
>> >>
>> >> On Thursday, August 24, 2017, Serkan Çoban <cobanserkan at gmail.com>
>> >> wrote:
>> >>>
>> >>> Thank you Gaurav,
>> >>> Here are more findings:
>> >>> The problem does not happen when using only 20 servers, each with 68
>> >>> bricks (peer probe only 20 servers).
>> >>> If we use 40 servers with a single volume, the glusterd 100% CPU state
>> >>> continues for 5 minutes and then returns to normal.
>> >>> With 80 servers we have no working state yet...
>> >>> >> >>> On Thu, Aug 24, 2017 at 1:33 PM, Gaurav Yadav <gyadav at redhat.com> >> >>> wrote: >> >>> > >> >>> > I am working on it and will share my findings as soon as possible. >> >>> > >> >>> > >> >>> > Thanks >> >>> > Gaurav >> >>> > >> >>> > On Thu, Aug 24, 2017 at 3:58 PM, Serkan ?oban >> >>> > <cobanserkan at gmail.com> >> >>> > wrote: >> >>> >> >> >>> >> Restarting glusterd causes the same thing. I tried with 3.12.rc0, >> >>> >> 3.10.5. 3.8.15, 3.7.20 all same behavior. >> >>> >> My OS is centos 6.9, I tried with centos 6.8 problem remains... >> >>> >> Only way to a healthy state is destroy gluster config/rpms, >> >>> >> reinstall >> >>> >> and recreate volumes. >> >>> >> >> >>> >> On Thu, Aug 24, 2017 at 8:49 AM, Serkan ?oban >> >>> >> <cobanserkan at gmail.com> >> >>> >> wrote: >> >>> >> > Here you can find 10 stack trace samples from glusterd. I wait 10 >> >>> >> > seconds between each trace. >> >>> >> > >> >>> >> > https://www.dropbox.com/s/9f36goq5xn3p1yt/glusterd_pstack.zip?dl=0 >> >>> >> > >> >>> >> > Content of the first stack trace is here: >> >>> >> > >> >>> >> > Thread 8 (Thread 0x7f7a8cd4e700 (LWP 43069)): >> >>> >> > #0 0x0000003aa5c0f00d in nanosleep () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #1 0x000000303f837d57 in ?? 
() from /usr/lib64/libglusterfs.so.0 >> >>> >> > #2 0x0000003aa5c07aa1 in start_thread () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 7 (Thread 0x7f7a8c34d700 (LWP 43070)): >> >>> >> > #0 0x0000003aa5c0f585 in sigwait () from /lib64/libpthread.so.0 >> >>> >> > #1 0x000000000040643b in glusterfs_sigwaiter () >> >>> >> > #2 0x0000003aa5c07aa1 in start_thread () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 6 (Thread 0x7f7a8b94c700 (LWP 43071)): >> >>> >> > #0 0x0000003aa58acc4d in nanosleep () from /lib64/libc.so.6 >> >>> >> > #1 0x0000003aa58acac0 in sleep () from /lib64/libc.so.6 >> >>> >> > #2 0x000000303f8528fb in pool_sweeper () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #3 0x0000003aa5c07aa1 in start_thread () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 5 (Thread 0x7f7a8af4b700 (LWP 43072)): >> >>> >> > #0 0x0000003aa5c0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () >> >>> >> > from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #1 0x000000303f864afc in syncenv_task () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #2 0x000000303f8729f0 in syncenv_processor () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #3 0x0000003aa5c07aa1 in start_thread () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 4 (Thread 0x7f7a8a54a700 (LWP 43073)): >> >>> >> > #0 0x0000003aa5c0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () >> >>> >> > from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #1 0x000000303f864afc in syncenv_task () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #2 0x000000303f8729f0 in syncenv_processor () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #3 0x0000003aa5c07aa1 in start_thread 
() from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #4 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 3 (Thread 0x7f7a886ac700 (LWP 43075)): >> >>> >> > #0 0x0000003aa5c0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #1 0x00007f7a898a099b in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #2 0x0000003aa5c07aa1 in start_thread () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #3 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 2 (Thread 0x7f7a87cab700 (LWP 43076)): >> >>> >> > #0 0x0000003aa5928692 in __strcmp_sse42 () from /lib64/libc.so.6 >> >>> >> > #1 0x000000303f82244a in ?? () from /usr/lib64/libglusterfs.so.0 >> >>> >> > #2 0x000000303f82433d in ?? () from /usr/lib64/libglusterfs.so.0 >> >>> >> > #3 0x000000303f8245f5 in dict_set () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #4 0x000000303f82524c in dict_set_str () from >> >>> >> > /usr/lib64/libglusterfs.so.0 >> >>> >> > #5 0x00007f7a898da7fd in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #6 0x00007f7a8981b0df in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #7 0x00007f7a8981b47c in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #8 0x00007f7a89831edf in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #9 0x00007f7a897f28f7 in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #10 0x00007f7a897f0bb9 in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #11 0x00007f7a8984c89a in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #12 0x00007f7a898323ee in ?? 
() from >> >>> >> > /usr/lib64/glusterfs/3.10.5/xlator/mgmt/glusterd.so >> >>> >> > #13 0x000000303f40fad5 in rpc_clnt_handle_reply () from >> >>> >> > /usr/lib64/libgfrpc.so.0 >> >>> >> > #14 0x000000303f410c85 in rpc_clnt_notify () from >> >>> >> > /usr/lib64/libgfrpc.so.0 >> >>> >> > #15 0x000000303f40bd68 in rpc_transport_notify () from >> >>> >> > /usr/lib64/libgfrpc.so.0 >> >>> >> > #16 0x00007f7a88a6fccd in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/rpc-transport/socket.so >> >>> >> > #17 0x00007f7a88a70ffe in ?? () from >> >>> >> > /usr/lib64/glusterfs/3.10.5/rpc-transport/socket.so >> >>> >> > #18 0x000000303f887806 in ?? () from /usr/lib64/libglusterfs.so.0 >> >>> >> > #19 0x0000003aa5c07aa1 in start_thread () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #20 0x0000003aa58e8bbd in clone () from /lib64/libc.so.6 >> >>> >> > Thread 1 (Thread 0x7f7a93844740 (LWP 43068)): >> >>> >> > #0 0x0000003aa5c082fd in pthread_join () from >> >>> >> > /lib64/libpthread.so.0 >> >>> >> > #1 0x000000303f8872d5 in ?? () from /usr/lib64/libglusterfs.so.0 >> >>> >> > #2 0x0000000000409020 in main () >> >>> >> > >> >>> >> > On Wed, Aug 23, 2017 at 8:46 PM, Atin Mukherjee >> >>> >> > <amukherj at redhat.com> >> >>> >> > wrote: >> >>> >> >> Could you be able to provide the pstack dump of the glusterd >> >>> >> >> process? >> >>> >> >> >> >>> >> >> On Wed, 23 Aug 2017 at 20:22, Atin Mukherjee >> >>> >> >> <amukherj at redhat.com> >> >>> >> >> wrote: >> >>> >> >>> >> >>> >> >>> Not yet. Gaurav will be taking a look at it tomorrow. >> >>> >> >>> >> >>> >> >>> On Wed, 23 Aug 2017 at 20:14, Serkan ?oban >> >>> >> >>> <cobanserkan at gmail.com> >> >>> >> >>> wrote: >> >>> >> >>>> >> >>> >> >>>> Hi Atin, >> >>> >> >>>> >> >>> >> >>>> Do you have time to check the logs? >> >>> >> >>>> >> >>> >> >>>> On Wed, Aug 23, 2017 at 10:02 AM, Serkan ?oban >> >>> >> >>>> <cobanserkan at gmail.com> >> >>> >> >>>> wrote: >> >>> >> >>>> > Same thing happens with 3.12.rc0. 
This time perf top shows >> >>> >> >>>> > hanging >> >>> >> >>>> > in >> >>> >> >>>> > libglusterfs.so and below is the glusterd logs, which are >> >>> >> >>>> > different >> >>> >> >>>> > from 3.10. >> >>> >> >>>> > With 3.10.5, after 60-70 minutes CPU usage becomes normal >> >>> >> >>>> > and >> >>> >> >>>> > we >> >>> >> >>>> > see >> >>> >> >>>> > brick processes come online and system starts to answer >> >>> >> >>>> > commands >> >>> >> >>>> > like >> >>> >> >>>> > "gluster peer status".. >> >>> >> >>>> > >> >>> >> >>>> > [2017-08-23 06:46:02.150472] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.152181] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.152287] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.153503] E >> >>> >> 
>>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.153647] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.153866] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.153948] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154018] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > 
(-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154108] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154162] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154250] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154322] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> 
>> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154425] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154494] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154575] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154649] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] 
>> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154705] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154774] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154852] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154903] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > 
[0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.154995] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.155052] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:02.155141] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > [2017-08-23 06:46:27.074052] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > 
[2017-08-23 06:46:27.077034] E >> >>> >> >>>> > [client_t.c:324:gf_client_ref] >> >>> >> >>>> > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_request_create+0xf1) >> >>> >> >>>> > [0x7f5ae2c091b1] >> >>> >> >>>> > -->/usr/lib64/libgfrpc.so.0(rpcsvc_request_init+0x9c) >> >>> >> >>>> > [0x7f5ae2c0851c] >> >>> >> >>>> > -->/usr/lib64/libglusterfs.so.0(gf_client_ref+0x1a9) >> >>> >> >>>> > [0x7f5ae2ea3949] ) 0-client_t: null client [Invalid >> >>> >> >>>> > argument] >> >>> >> >>>> > >> >>> >> >>>> > On Tue, Aug 22, 2017 at 7:00 PM, Serkan Çoban >> >>> >> >>>> > <cobanserkan at gmail.com> >> >>> >> >>>> > wrote: >> >>> >> >>>> >> I rebooted multiple times; I also destroyed the gluster >> >>> >> >>>> >> configuration >> >>> >> >>>> >> and recreated it multiple times. The behavior is the same. >> >>> >> >>>> >> >> >>> >> >>>> >> On Tue, Aug 22, 2017 at 6:47 PM, Atin Mukherjee >> >>> >> >>>> >> <amukherj at redhat.com> >> >>> >> >>>> >> wrote: >> >>> >> >>>> >>> My guess is there is a corruption in vol list or peer list >> >>> >> >>>> >>> which >> >>> >> >>>> >>> has >> >>> >> >>>> >>> led >> >>> >> >>>> >>> glusterd to get into an infinite loop of traversing a >> >>> >> >>>> >>> peer/volume >> >>> >> >>>> >>> list >> >>> >> >>>> >>> and >> >>> >> >>>> >>> hog the CPU. Again this is a guess and I've not got a >> >>> >> >>>> >>> chance to >> >>> >> >>>> >>> take a >> >>> >> >>>> >>> detailed look at the logs and the strace output. >> >>> >> >>>> >>> >> >>> >> >>>> >>> I believe if you get to reboot the node again the problem >> >>> >> >>>> >>> will >> >>> >> >>>> >>> disappear. >> >>> >> >>>> >>> >> >>> >> >>>> >>> On Tue, 22 Aug 2017 at 20:07, Serkan Çoban >> >>> >> >>>> >>> <cobanserkan at gmail.com> >> >>> >> >>>> >>> wrote: >> >>> >> >>>> >>>> >> >>> >> >>>> >>>> In addition, perf top shows 80% libc-2.12.so >> >>> >> >>>> >>>> __strcmp_sse42 >> >>> >> >>>> >>>> during >> >>> >> >>>> >>>> glusterd 100% CPU usage. >> >>> >> >>>> >>>> Hope this helps... 
>> >>> >> >>>> >>>> >> >>> >> >>>> >>>> On Tue, Aug 22, 2017 at 2:41 PM, Serkan Çoban >> >>> >> >>>> >>>> <cobanserkan at gmail.com> >> >>> >> >>>> >>>> wrote: >> >>> >> >>>> >>>> > Hi there, >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > I have a strange problem. >> >>> >> >>>> >>>> > Gluster version is 3.10.5; I am testing new servers. >> >>> >> >>>> >>>> > Gluster >> >>> >> >>>> >>>> > configuration is 16+4 EC, I have three volumes, each >> >>> >> >>>> >>>> > has >> >>> >> >>>> >>>> > 1600 >> >>> >> >>>> >>>> > bricks. >> >>> >> >>>> >>>> > I can successfully create the cluster and volumes >> >>> >> >>>> >>>> > without >> >>> >> >>>> >>>> > any >> >>> >> >>>> >>>> > problems. I wrote data to the cluster from 100 clients for >> >>> >> >>>> >>>> > 12 >> >>> >> >>>> >>>> > hours, >> >>> >> >>>> >>>> > again with >> >>> >> >>>> >>>> > no problem. But when I try to reboot a node, the glusterd >> >>> >> >>>> >>>> > process >> >>> >> >>>> >>>> > hangs at >> >>> >> >>>> >>>> > 100% CPU usage and seems to do nothing; no brick >> >>> >> >>>> >>>> > processes >> >>> >> >>>> >>>> > come >> >>> >> >>>> >>>> > online. You can find an strace of the glusterd process for 1 >> >>> >> >>>> >>>> > minute >> >>> >> >>>> >>>> > here: >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > https://www.dropbox.com/s/c7bxfnbqxze1yus/gluster_strace.out?dl=0 >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > Here is the glusterd log: >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > https://www.dropbox.com/s/hkstb3mdeil9a5u/glusterd.log?dl=0 >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > >> >>> >> >>>> >>>> > By the way, reboot of one server completes without >> >>> >> >>>> >>>> > problem >> >>> >> >>>> >>>> > if >> >>> >> >>>> >>>> > I >> >>> >> >>>> >>>> > reboot >> >>> >> >>>> >>>> > the servers before creating any volumes. 
>> >>> >> >>>> >>>> _______________________________________________ >> >>> >> >>>> >>>> Gluster-users mailing list >> >>> >> >>>> >>>> Gluster-users at gluster.org >> >>> >> >>>> >>>> http://lists.gluster.org/mailman/listinfo/gluster-users >> >>> >> >>>> >>> >> >>> >> >>>> >>> -- >> >>> >> >>>> >>> - Atin (atinm)