Lindsay Mathieson
2016-Apr-20 06:07 UTC
[Gluster-users] 3.7.11 - Brick died, can't restart
A brick has died on node vnb of my cluster. Unfortunately it has left a zombie glusterfsd process which is holding the brick socket, so I can't restart it. Any advice on how to work round that asap would be appreciated.

Tail of brick logging:

[2016-04-20 05:41:37.325846] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]
[2016-04-20 05:41:37.328255] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]
[2016-04-20 05:41:37.599402] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]
[2016-04-20 05:41:37.601843] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]
[2016-04-20 05:41:37.604164] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]
[2016-04-20 05:41:37.682886] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]
[2016-04-20 05:55:16.203806] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4) [0x7f5ff6a4f0a4] -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xe5) [0x5629ffed26f5] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x59) [0x5629ffed2569] ) 0-: received signum (15), shutting down
[2016-04-20 05:55:35.536514] I [MSGID: 100030] [glusterfsd.c:2332:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.11 (args: /usr/sbin/glusterfsd -s vnb.proxmox.softlog --volfile-id datastore4.vnb.proxmox.softlog.tank-vmdata-datastore4 -p /var/lib/glusterd/vols/datastore4/run/vnb.proxmox.softlog-tank-vmdata-datastore4.pid -S /var/run/gluster/5ca23018ece7b94960f0580687e60650.socket --brick-name /tank/vmdata/datastore4 -l /var/log/glusterfs/bricks/tank-vmdata-datastore4.log --xlator-option *-posix.glusterd-uuid=43a1bf8c-3e69-4581-8e16-f2e1462cfc36 --brick-port 49156 --xlator-option datastore4-server.listen-port=49156)
[2016-04-20 05:55:35.541739] E [socket.c:770:__socket_server_bind] 0-socket.glusterfsd: binding to failed: Address already in use
[2016-04-20 05:55:35.541777] E [socket.c:773:__socket_server_bind] 0-socket.glusterfsd: Port is already in use
[2016-04-20 05:55:35.541794] W [rpcsvc.c:1604:rpcsvc_transport_create] 0-rpc-service: listening on transport failed
[2016-04-20 05:55:35.547990] I [MSGID: 100030] [glusterfsd.c:2332:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.11 (args: /usr/sbin/glusterfsd -s vnb.proxmox.softlog --volfile-id datastore4.vnb.proxmox.softlog.tank-vmdata-datastore4 -p /var/lib/glusterd/vols/datastore4/run/vnb.proxmox.softlog-tank-vmdata-datastore4.pid -S /var/run/gluster/5ca23018ece7b94960f0580687e60650.socket --brick-name /tank/vmdata/datastore4 -l /var/log/glusterfs/bricks/tank-vmdata-datastore4.log --xlator-option *-posix.glusterd-uuid=43a1bf8c-3e69-4581-8e16-f2e1462cfc36 --brick-port 49156 --xlator-option datastore4-server.listen-port=49156)

I did a quick check of the other bricks, they are all filled with "I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xab) [0x7f5ff77d239b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) [0x7f5feb9c88e7] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0x93) [0x7f5ff77c30f3] ) 0-dict: !this || key=() [Invalid argument]"

Thanks,
--
Lindsay
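The "Address already in use" errors in the log tail concern the brick port passed as --brick-port (49156 here). As a rough sketch of how one might confirm which process still holds that port: the helper below pulls the most recent --brick-port value out of a brick log. The function name brick_port_from_log is made up for this illustration and is not part of Gluster; it assumes each log entry sits on a single line, as in the real log file.

```shell
#!/bin/sh
# Hypothetical helper (not a Gluster tool): read a brick log on stdin and
# print the value of the last "--brick-port N" argument found in it.
# Prints nothing if no such argument appears.
brick_port_from_log() {
    sed -n 's/.*--brick-port \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Usage, as root on the affected node (log path taken from the -l argument
# shown in the log tail):
#   port=$(brick_port_from_log < /var/log/glusterfs/bricks/tank-vmdata-datastore4.log)
#   ss -ltnp | grep ":$port "       # or: lsof -i ":$port"
```

Either listing should name the PID that owns the listening socket, which is the process to look at before attempting another restart.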
Try restarting glusterd:

# service glusterd restart

If that doesn't work, try killing the glusterfsd PID(s):

# kill $(ps -ef | grep glusterfsd | awk '{print $2}')

Then restart glusterd:

# service glusterd restart

PS: killing glusterfsd that way will kill all the bricks on that node, but restarting glusterd should bring the bricks back online.

-Bishoy
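One caveat with the ps | grep pipeline suggested above: it also matches the grep process itself, so kill may be handed a PID that has already gone away. A minimal sketch of a narrower filter follows. The function name glusterfsd_pids is invented for this example, and it assumes the usual ps -ef layout (PID in column 2, command in column 8).

```shell
#!/bin/sh
# Hypothetical helper (not a Gluster tool): read `ps -ef` output on stdin
# and print the PID of every line whose command (column 8) is glusterfsd,
# with or without a leading path. Matching on the command column skips the
# `grep glusterfsd` process and the glusterd management daemon.
glusterfsd_pids() {
    awk '$8 ~ /(^|\/)glusterfsd$/ { print $2 }'
}

# Usage, as root on the affected node:
#   kill $(ps -ef | glusterfsd_pids)
#   service glusterd restart
```

Where procps is installed, pgrep -x glusterfsd should give a similar list without any parsing. As with the original command, this stops every brick on the node.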
A zombied glusterfsd means it's stuck in a kernel operation, most likely an I/O wait that hung in the kernel. Since there's no way to clear that from the kernel, the only option is to reboot.
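Before rebooting, it may be worth checking what state the process is really in. In ps(1) on Linux, a STAT value starting with Z is a defunct (zombie) process that has exited but not been reaped by its parent, while D is uninterruptible sleep, usually I/O, which is the "stuck in the kernel" case described above. The sketch below only labels the state letter; the function name describe_stat is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: map the first letter of a ps STAT value to a short
# label. State letters are the ones documented in ps(1) on Linux.
describe_stat() {
    case "$1" in
        D*) echo "uninterruptible sleep" ;;
        Z*) echo "zombie" ;;
        *)  echo "other" ;;
    esac
}

# Usage on the affected node (-L lists each thread, since glusterfsd is
# multithreaded and a single thread may be the one that is stuck):
#   ps -L -C glusterfsd -o pid=,lwp=,stat= | while read -r pid lwp stat; do
#       echo "$pid/$lwp $stat: $(describe_stat "$stat")"
#   done
```

A thread that stays in state D will not respond to kill, including SIGKILL, until the kernel call it is blocked in returns.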