Ning Adam
2018-Sep-14 07:45 UTC
[Gluster-users] gluster volume heal info hang on 3.10.6 & 3.12.11
(My last mail seemed to be dropped by the list, reposting in plain-text format)

Hello:

I have recently done a rolling upgrade of some of my 3.7.6 gluster (AFR + DHT) deployments to 3.10.6 and hit two problems afterwards:
1) occasional IO errors
2) "gluster volume heal vol0 info" hangs after a decent amount of files have been written to the cluster (this varies from hours to days)

All 7 of my upgraded 3.10 clusters reproduce these problems, and I then tried upgrading them to 3.12.11, but nothing improved. I have submitted a bug at https://bugzilla.redhat.com/show_bug.cgi?id=1618932 , but so far no one seems to have looked into it.

The heal hang is the most annoying problem because it makes it impossible for me to monitor my clusters. So I tried to look into the code and analyze the DEBUG & TRACE logs, and found the following:

1) I have "network.ping-timeout: 120" in the volume options. If I change network.ping-timeout to 0, "gluster volume heal vol0 info" will eventually finish after 10~20 minutes (on an 8x3 brick setup), and it says there are no files to heal. It takes even longer on a larger cluster (36x3 brick setup). It does not matter whether I stop all my IO or restart all glusterd & glusterfsd processes; heal info still hangs. "gluster volume status" is all normal, and all my brick servers are in a low-latency LAN environment.

2) One of the clusters having this problem is a testing setup, so I can do some debugging there. I looked into heal/src/glfs-heal.c, compiled the code and added some logging; the initialization call "ret = glfs_init (fs)" hangs. I traced it to "ret = glfs_chdir (fs, "/");" and then dht_lookup(path='/'); it never gets to dht_readdirp_cbk().
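For reference, the sequence I am poking at can be exercised outside glfsheal with a minimal gfapi client along the lines below. The volume name, volfile server host, log path and file name (glfs-init-test.c) are placeholders from my test cluster, and this is only a sketch of the call sequence, not a copy of how glfsheal sets up its volfile source; glfs_init()/glfs_chdir("/") is where glfsheal hangs for me. Build it with something like "gcc glfs-init-test.c -lgfapi".

/* Minimal gfapi client following roughly the same initialization path
 * as glfsheal: create the handle, fetch the volfile, glfs_init(), then
 * glfs_chdir("/"), which triggers the lookup on "/".
 * "vol0" and "storage0-cluster0" are placeholders; adjust as needed. */
#include <stdio.h>
#include <glusterfs/api/glfs.h>

int main (void)
{
        glfs_t *fs = glfs_new ("vol0");
        if (!fs) {
                fprintf (stderr, "glfs_new failed\n");
                return 1;
        }

        /* fetch the volfile from a glusterd over TCP (port 24007) */
        glfs_set_volfile_server (fs, "tcp", "storage0-cluster0", 24007);
        /* log to a separate file; higher level = more verbose */
        glfs_set_logging (fs, "/tmp/glfs-init-test.log", 7);

        fprintf (stderr, "calling glfs_init()...\n");
        if (glfs_init (fs) != 0) {
                fprintf (stderr, "glfs_init failed\n");
                return 1;
        }

        fprintf (stderr, "calling glfs_chdir(\"/\")...\n");
        if (glfs_chdir (fs, "/") != 0)
                fprintf (stderr, "glfs_chdir failed\n");
        else
                fprintf (stderr, "lookup on / completed\n");

        glfs_fini (fs);
        return 0;
}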
I stopped all client IO operations on my gluster volume and then ran "glfsheal vol0" again. Looking into /var/log/glusterfs/glfsheal-vol0.log, there is massive looping of reconnect attempts on all bricks, e.g.:

[2018-09-14 05:56:09.395869] T [rpc-clnt.c:407:rpc_clnt_reconnect] 0-vol0-client-12: attempting reconnect
[2018-09-14 05:56:09.396160] T [socket.c:3125:socket_connect] (--> /lib/libglusterfs.so.0(_gf_log_callingfn+0x1ee)[0x7f10d4e79778] (--> /usr/lib/glusterfs//rpc-transport/socket.so(+0xd442)[0x7f10c647c442] (--> /lib/libgfrpc.so.0(rpc_transport_connect+0x93)[0x7f10d437de43] (--> /lib/libgfrpc.so.0(rpc_clnt_reconnect+0x136)[0x7f10d438121b] (--> /lib/libglusterfs.so.0(+0x3e4e9)[0x7f10d4e894e9] ))))) 0-vol0-client-12: connect () called on transport already connected
[2018-09-14 05:56:09.396194] T [rpc-clnt.c:407:rpc_clnt_reconnect] 0-vol0-client-10: attempting reconnect
[2018-09-14 05:56:09.396424] T [socket.c:3125:socket_connect] (--> /lib/libglusterfs.so.0(_gf_log_callingfn+0x1ee)[0x7f10d4e79778] (--> /usr/lib/glusterfs//rpc-transport/socket.so(+0xd442)[0x7f10c647c442] (--> /lib/libgfrpc.so.0(rpc_transport_connect+0x93)[0x7f10d437de43] (--> /lib/libgfrpc.so.0(rpc_clnt_reconnect+0x136)[0x7f10d438121b] (--> /lib/libglusterfs.so.0(+0x3e4e9)[0x7f10d4e894e9] ))))) 0-vol0-client-10: connect () called on transport already connected

It is also weird that 0-vol0-client-5 shows a huge RPC ping latency (29 seconds should not be possible, because a simple ICMP ping to that server takes less than 0.1 seconds):

root@storage0-cluster0:~# grep ping /var/log/glusterfs/glfsheal-vol0.log | grep laten
[2018-09-14 05:13:16.008866] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-0: Ping latency is 0ms
[2018-09-14 05:13:16.016669] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-1: Ping latency is 0ms
[2018-09-14 05:13:16.021769] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-2: Ping latency is 0ms
[2018-09-14 05:13:16.033761] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-3: Ping latency is 4ms
[2018-09-14 05:13:16.101962] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-4: Ping latency is 62ms
[2018-09-14 05:13:46.105502] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-5: Ping latency is 29996ms
[2018-09-14 05:20:13.895285] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-0: Ping latency is 0ms
[2018-09-14 05:20:13.901476] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-1: Ping latency is 0ms
[2018-09-14 05:20:13.908373] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-2: Ping latency is 0ms
[2018-09-14 05:20:13.918782] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-3: Ping latency is 4ms
[2018-09-14 05:20:13.987162] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-4: Ping latency is 62ms
[2018-09-14 05:20:43.992528] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-5: Ping latency is 29998ms
[2018-09-14 05:25:11.983259] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-0: Ping latency is 0ms
[2018-09-14 05:25:11.991199] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-1: Ping latency is 0ms
[2018-09-14 05:25:11.996933] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-2: Ping latency is 0ms
[2018-09-14 05:25:12.007867] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-3: Ping latency is 4ms
[2018-09-14 05:25:12.076057] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-4: Ping latency is 62ms

I checked vol0-client-5; there is nothing special compared to the other bricks. CPU is not high and the disk has no problem.
There are no warnings or errors in /var/log/gluster/brick/data-brk01-src.log either. After I turned on TRACE-level logging on the brick, the output was too much to analyze and showed no apparent difference from the other bricks.

I hope someone has some idea about this; even pointing me in a direction to look would help.

Thanks very much
Ning Adam
2018-Sep-14 15:44 UTC
[Gluster-users] gluster volume heal info hang on 3.10.6 & 3.12.11
I have made some progress.

Using gluster 3.12.1, "glfsheal vol0" is fine. It tells me some files need to be healed:

Brick storage0-cluster0:/data/brk02/src
<gfid:88dfb19a-92dd-4abd-bb15-4931e6e47cd9>
<gfid:88dfb19a-92dd-4abd-bb15-4931e6e47cd9>/0001
Status: Connected
Number of entries: 2

The hanging problem was introduced in this commit:

commit 94faf8c4a1f5a020593ca65f44d68e1ba2f632eb
Author: karthik-us <ksubrahm at redhat.com>
Date:   Wed Sep 20 18:30:21 2017 +0530

    cluster/afr: Sending subvol up/down events when subvol comes up or goes down

    > BUG: 1493539
    (cherry picked from commit 3bbb4fe4b33dc3a3ed068ed2284077f2a4d8265a)
    Change-Id: I6580351b245d5f868e9ddc6a4eb4dd6afa3bb6ec
    BUG: 1492066
    Signed-off-by: karthik-us <ksubrahm at redhat.com>

But when I use the HEAD of the release-3.12 branch and do "git revert 94faf8c4", "glfsheal vol0" still hangs, so this is not a simple problem.

I found the bug is related to "client.event-threads". Because I previously had "client.event-threads: 6", there were ping latency log entries only for vol0-client-0 through vol0-client-5, ending with

[rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-5: Ping latency is 29998ms

and no ping latency was logged for the other bricks.

After I ran "gluster volume set vol0 client.event-threads 24" and installed 3.12.12 (since I am debugging on the 8x3 cluster), "glfsheal vol0" finished in 30 seconds (that time is still acceptable), and there was:

[2018-09-14 21:10:17.817764] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-23: Ping latency is 29876ms

So this leads to the conclusion that a connection to a brick might be holding a lock in a client event thread.

My gluster volume info:

Options Reconfigured:
cluster.locking-scheme: full
performance.open-behind: on
cluster.lookup-unhashed: auto
transport.address-family: inet
cluster.lookup-optimize: on
diagnostics.brick-log-level: TRACE
performance.readdir-ahead: True
performance.md-cache-timeout: 0
cluster.server-quorum-type: none
performance.client-io-threads: True
performance.read-ahead: False
features.cache-invalidation: False
network.ping-timeout: 120
cluster.quorum-type: auto
nfs.disable: True
performance.io-cache: False
server.event-threads: 4
cluster.rebal-throttle: lazy
transport.keepalive: True
performance.write-behind: True
cluster.self-heal-daemon: enable
client.event-threads: 6
cluster.eager-lock: on
diagnostics.client-log-level: INFO
features.trash: off
features.selinux: off

cc ksubrahm at redhat.com, in case you have any clue about this problem.
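One more thought: for a standalone gfapi test client (like the sketch in my first mail), it should be possible to raise the client event threads for just that process via glfs_set_xlator_option() instead of changing the volume-wide option. The "*-client-*" pattern and the "event-threads" key below are my assumption of how client.event-threads maps onto the per-brick client xlators, so treat this as an untested sketch:

/* Untested sketch: request more event threads on the protocol/client
 * xlators of this one gfapi process, assuming "client.event-threads"
 * corresponds to the "event-threads" option of the xlators named like
 * "vol0-client-<N>" and that the xlator argument is matched as a pattern.
 * Must be set before glfs_init(). */
#include <stdio.h>
#include <glusterfs/api/glfs.h>

int main (void)
{
        glfs_t *fs = glfs_new ("vol0");          /* placeholder volume name */
        if (!fs)
                return 1;

        glfs_set_volfile_server (fs, "tcp", "storage0-cluster0", 24007);

        /* ask for 24 event threads on every matching client xlator */
        glfs_set_xlator_option (fs, "*-client-*", "event-threads", "24");

        if (glfs_init (fs) != 0) {
                fprintf (stderr, "glfs_init failed\n");
                return 1;
        }

        /* ... run the test, e.g. glfs_chdir (fs, "/") ... */

        glfs_fini (fs);
        return 0;
}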