likun
2017-Jan-16 04:49 UTC
[Gluster-users] some bricks didn't come up after a server reboot
Hi, everybody. I have glusterfs 3.8.5 set up in a Kubernetes environment.

Recently, one of my glusterfs servers rebooted and the glusterfs service didn't come back up successfully. Out of 6 bricks in total, 3 came up and 3 didn't.

Here is the log of one of the bricks that failed:

[2016-12-29 03:46:50.240032] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.8.5 (args: /usr/sbin/glusterfsd -s 10.32.3.23 --volfile-id gvol0.10.32.3.23.mnt-brick1-vol -p /var/lib/glusterd/vols/gvol0/run/10.32.3.23-mnt-brick1-vol.pid -S /var/run/gluster/08da045f3e66eefc50c0ff9a035c6794.socket --brick-name /mnt/brick1/vol -l /var/log/glusterfs/bricks/mnt-brick1-vol.log --xlator-option *-posix.glusterd-uuid=58c3b462-a4b6-4655-b2ac-d0502e278e03 --brick-port 49152 --xlator-option gvol0-server.listen-port=49152)
[2016-12-29 03:46:50.258772] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-12-29 03:47:08.153575] I [MSGID: 101173] [graph.c:269:gf_add_cmdline_options] 0-gvol0-server: adding option 'listen-port' for volume 'gvol0-server' with value '49152'
[2016-12-29 03:47:08.153613] I [MSGID: 101173] [graph.c:269:gf_add_cmdline_options] 0-gvol0-posix: adding option 'glusterd-uuid' for volume 'gvol0-posix' with value '58c3b462-a4b6-4655-b2ac-d0502e278e03'
[2016-12-29 03:47:08.153777] I [MSGID: 115034] [server.c:398:_check_for_auth_option] 0-gvol0-decompounder: skip format check for non-addr auth option auth.login./mnt/brick1/vol.allow
[2016-12-29 03:47:08.153785] I [MSGID: 115034] [server.c:398:_check_for_auth_option] 0-gvol0-decompounder: skip format check for non-addr auth option auth.login.94bedfd1-619d-402a-9826-67dab7600f43.password
[2016-12-29 03:47:08.153895] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2016-12-29 03:47:08.159776] I [rpcsvc.c:2214:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2016-12-29 03:47:08.159829] W [MSGID: 101002] [options.c:954:xl_opt_validate] 0-gvol0-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2016-12-29 03:47:08.159893] E [socket.c:793:__socket_server_bind] 0-tcp.gvol0-server: binding to  failed: Address already in use
[2016-12-29 03:47:08.159899] E [socket.c:796:__socket_server_bind] 0-tcp.gvol0-server: Port is already in use
[2016-12-29 03:47:08.159907] W [rpcsvc.c:1645:rpcsvc_create_listener] 0-rpc-service: listening on transport failed
[2016-12-29 03:47:08.159913] W [MSGID: 115045] [server.c:1061:init] 0-gvol0-server: creation of listener failed
[2016-12-29 03:47:08.159919] E [MSGID: 101019] [xlator.c:433:xlator_init] 0-gvol0-server: Initialization of volume 'gvol0-server' failed, review your volfile again
[2016-12-29 03:47:08.159924] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-gvol0-server: initializing translator failed
[2016-12-29 03:47:08.159929] E [MSGID: 101176] [graph.c:673:glusterfs_graph_activate] 0-graph: init failed
[2016-12-29 03:47:08.160764] W [glusterfsd.c:1327:cleanup_and_exit] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x3c1) [0x55ead22bee51] -->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x172) [0x55ead22b95d2] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x55ead22b8b4b] ) 0-: received signum (98), shutting down

What's wrong with it? "Port is already in use"? The system had just rebooted, so clearly no other daemon was using this port.
After restarting the glusterfs container, all six bricks came up. Sometimes two or more restarts are needed to bring all the bricks back up. Any ideas?

Likun
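(As an aside, a minimal recovery sketch that may avoid full container restarts. It assumes shell access on the affected node; the volume name gvol0 comes from the log above, and 'start ... force' is generally expected to start only the bricks that are currently offline:)

    # List brick status: the Online column and the port each brick was assigned
    gluster volume status gvol0

    # Ask glusterd to (re)start just the offline bricks;
    # bricks that are already running are left untouched
    gluster volume start gvol0 force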
Atin Mukherjee
2017-Jan-16 04:54 UTC
[Gluster-users] some bricks didn't come up after a server reboot
On Mon, Jan 16, 2017 at 10:19 AM, likun <kun.li at ucarinc.com> wrote:

> Hi, everybody. I have glusterfs 3.8.5 set up in a Kubernetes environment.
>
> Recently, one of my glusterfs servers rebooted and the glusterfs service
> didn't come back up successfully.
>
> Out of 6 bricks in total, 3 came up and 3 didn't.
>
> Here is the log of one of the bricks that failed:
>
> [2016-12-29 03:46:50.240032] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.8.5 (args: /usr/sbin/glusterfsd -s 10.32.3.23 --volfile-id gvol0.10.32.3.23.mnt-brick1-vol -p /var/lib/glusterd/vols/gvol0/run/10.32.3.23-mnt-brick1-vol.pid -S /var/run/gluster/08da045f3e66eefc50c0ff9a035c6794.socket --brick-name /mnt/brick1/vol -l /var/log/glusterfs/bricks/mnt-brick1-vol.log --xlator-option *-posix.glusterd-uuid=58c3b462-a4b6-4655-b2ac-d0502e278e03 --brick-port 49152 --xlator-option gvol0-server.listen-port=49152)

So port 49152 was allocated for this brick.

> [2016-12-29 03:46:50.258772] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2016-12-29 03:47:08.153575] I [MSGID: 101173] [graph.c:269:gf_add_cmdline_options] 0-gvol0-server: adding option 'listen-port' for volume 'gvol0-server' with value '49152'
> [2016-12-29 03:47:08.153613] I [MSGID: 101173] [graph.c:269:gf_add_cmdline_options] 0-gvol0-posix: adding option 'glusterd-uuid' for volume 'gvol0-posix' with value '58c3b462-a4b6-4655-b2ac-d0502e278e03'
> [2016-12-29 03:47:08.153777] I [MSGID: 115034] [server.c:398:_check_for_auth_option] 0-gvol0-decompounder: skip format check for non-addr auth option auth.login./mnt/brick1/vol.allow
> [2016-12-29 03:47:08.153785] I [MSGID: 115034] [server.c:398:_check_for_auth_option] 0-gvol0-decompounder: skip format check for non-addr auth option auth.login.94bedfd1-619d-402a-9826-67dab7600f43.password
> [2016-12-29 03:47:08.153895] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
> [2016-12-29 03:47:08.159776] I [rpcsvc.c:2214:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
> [2016-12-29 03:47:08.159829] W [MSGID: 101002] [options.c:954:xl_opt_validate] 0-gvol0-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
> [2016-12-29 03:47:08.159893] E [socket.c:793:__socket_server_bind] 0-tcp.gvol0-server: binding to  failed: Address already in use
> [2016-12-29 03:47:08.159899] E [socket.c:796:__socket_server_bind] 0-tcp.gvol0-server: Port is already in use

What does 'netstat -nap | grep 49152' say at that time? Are we sure no other application has accidentally raced to consume the same port?

> [2016-12-29 03:47:08.159907] W [rpcsvc.c:1645:rpcsvc_create_listener] 0-rpc-service: listening on transport failed
> [2016-12-29 03:47:08.159913] W [MSGID: 115045] [server.c:1061:init] 0-gvol0-server: creation of listener failed
> [2016-12-29 03:47:08.159919] E [MSGID: 101019] [xlator.c:433:xlator_init] 0-gvol0-server: Initialization of volume 'gvol0-server' failed, review your volfile again
> [2016-12-29 03:47:08.159924] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-gvol0-server: initializing translator failed
> [2016-12-29 03:47:08.159929] E [MSGID: 101176] [graph.c:673:glusterfs_graph_activate] 0-graph: init failed
> [2016-12-29 03:47:08.160764] W [glusterfsd.c:1327:cleanup_and_exit] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x3c1) [0x55ead22bee51] -->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x172) [0x55ead22b95d2] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x55ead22b8b4b] ) 0-: received signum (98), shutting down
>
> What's wrong with it? "Port is already in use"? The system had just
> rebooted, so clearly no other daemon was using this port.
>
> After restarting the glusterfs container, all six bricks came up. Sometimes
> two or more restarts are needed to bring all the bricks back up. Any ideas?
>
> Likun
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
~ Atin (atinm)
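(For reference, a small diagnostic sketch along the lines of the netstat suggestion above. It assumes the commands are run inside the glusterfs container while a brick is failing to start; port 49152 and the brick path are taken from the log:)

    # Show which process, if any, already holds the brick port
    netstat -nap | grep 49152

    # Alternative if netstat is not present in the container image
    ss -ltnp | grep 49152

    # Look for a leftover glusterfsd from before the reboot/restart that is
    # still attached to this brick ([g] keeps grep from matching itself)
    ps aux | grep '[g]lusterfsd' | grep 'mnt-brick1-vol'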