srinivas jonn
2013-Jun-03 09:34 UTC
[Gluster-users] recovering gluster volume || startup failure
Hello Gluster users:

Sorry for the long post; I have run out of ideas here. Kindly let me know if I am looking at the right places for logs, and any suggested actions. Thanks.

A sudden power loss caused a hard reboot - now the volume does not start.

GlusterFS 3.3.1 on CentOS 6.1, transport: TCP
Sharing the volume over NFS for VM storage - VHD files
Type: distributed - only 1 node (brick)
XFS (LVM)

mount /dev/datastore1/mylv1 /export/brick1 - mounts the VHD files. Is there a way to recover these files?

cat export-brick1.log

[2013-06-02 09:29:00.832914] I [glusterfsd.c:1666:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.3.1
[2013-06-02 09:29:00.845515] I [graph.c:241:gf_add_cmdline_options] 0-gvol1-server: adding option 'listen-port' for volume 'gvol1-server' with value '24009'
[2013-06-02 09:29:00.845558] I [graph.c:241:gf_add_cmdline_options] 0-gvol1-posix: adding option 'glusterd-uuid' for volume 'gvol1-posix' with value '16ee7a4e-ee9b-4543-bd61-9b444100693d'
[2013-06-02 09:29:00.846654] W [options.c:782:xl_opt_validate] 0-gvol1-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction

Given volfile:
+------------------------------------------------------------------------------+
  1: volume gvol1-posix
  2:     type storage/posix
  3:     option directory /export/brick1
  4:     option volume-id aa25aa58-d191-432a-a84b-325051347af6
  5: end-volume
  6:
  7: volume gvol1-access-control
  8:     type features/access-control
  9:     subvolumes gvol1-posix
 10: end-volume
 11:
 12: volume gvol1-locks
 13:     type features/locks
 14:     subvolumes gvol1-access-control
 ----------
 -----------------
 46:     option transport-type tcp
 47:     option auth.login./export/brick1.allow 6c4653bb-b708-46e8-b3f9-177b4cdbbf28
 48:     option auth.login.6c4653bb-b708-46e8-b3f9-177b4cdbbf28.password 091ae3b1-40c2-4d48-8870-6ad7884457ac
 49:     option auth.addr./export/brick1.allow *
 50:     subvolumes /export/brick1
 51: end-volume
+------------------------------------------------------------------------------+
[2013-06-02 09:29:03.963001] W [socket.c:410:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
[2013-06-02 09:29:03.963046] W [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2013-06-02 09:29:04.850120] I [server-handshake.c:571:server_setvolume] 0-gvol1-server: accepted client from iiclab-oel1-9347-2013/06/02-09:29:00:835397-gvol1-client-0-0 (version: 3.3.1)
[2013-06-02 09:32:16.973786] W [glusterfsd.c:831:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x93) [0x30cac0a5b3] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x30cac0a443] (-->/usr/sbin/glusterfsd(glusterfs_handle_terminate+0x15) [0x40a955]))) 0-: received signum (15), shutting down
[2013-06-02 09:32:16.973895] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3ef56e68ed] (-->/lib64/libpthread.so.0() [0x3ef5a077e1] (-->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xdd) [0x405d4d]))) 0-: received signum (15), shutting down

NFS LOG

[2013-06-02 09:29:00.918906] I [rpc-clnt.c:1657:rpc_clnt_reconfig] 0-gvol1-client-0: changing port to 24009 (from 0)
[2013-06-02 09:29:03.963023] W [socket.c:410:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
[2013-06-02 09:29:03.963062] W [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2013-06-02 09:29:04.849941] I [client-handshake.c:1636:select_server_supported_programs] 0-gvol1-client-0: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
[2013-06-02 09:29:04.853016] I [client-handshake.c:1433:client_setvolume_cbk] 0-gvol1-client-0: Connected to 10.0.0.30:24009, attached to remote volume '/export/brick1'.
[2013-06-02 09:29:04.853048] I [client-handshake.c:1445:client_setvolume_cbk] 0-gvol1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2013-06-02 09:29:04.853262] I [client-handshake.c:453:client_set_lk_version_cbk] 0-gvol1-client-0: Server lk version = 1
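Since this is a single-brick distribute volume, the VHD images are stored as ordinary files under the brick directory, so they remain readable from the local filesystem even while the Gluster volume is down. A minimal recovery sketch, assuming the brick device and mount point shown in the post; the backup destination and the *.vhd file names are placeholders:

    # mounting replays the XFS journal; try this before any repair tool
    mount /dev/datastore1/mylv1 /export/brick1

    # if the mount succeeds, the VM images are regular files on the brick
    ls -lh /export/brick1
    cp -a /export/brick1/*.vhd /mnt/backup/    # hypothetical destination

    # only if the mount fails: read-only (no-modify) check of the unmounted filesystem
    xfs_repair -n /dev/datastore1/mylv1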
Krishnan Parthasarathi
2013-Jun-03 09:44 UTC
[Gluster-users] recovering gluster volume || startup failure
Srinivas,

Could you paste the output of "gluster volume info gvol1"? This should give us an idea as to what the state of the volume was before the power loss.

thanks,
krish

----- Original Message -----
> Hello Gluster users:
> Sorry for the long post; I have run out of ideas here. Kindly let me know if I am
> looking at the right places for logs, and any suggested actions. Thanks.
> A sudden power loss caused a hard reboot - now the volume does not start.
> GlusterFS 3.3.1 on CentOS 6.1, transport: TCP
> Sharing the volume over NFS for VM storage - VHD files
> Type: distributed - only 1 node (brick)
> XFS (LVM)
> mount /dev/datastore1/mylv1 /export/brick1 - mounts the VHD files. Is there
> a way to recover these files?
> [brick and NFS logs trimmed; quoted in full in the original post above]
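Besides the CLI output, glusterd also keeps the volume's saved state on disk; with a standard RPM install it can be inspected directly. A quick sketch, assuming the default /var/lib/glusterd layout:

    gluster volume info gvol1

    # on-disk state glusterd loaded at startup (volume status, brick list, options)
    cat /var/lib/glusterd/vols/gvol1/info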
Hello Gluster users,

Thought of posing a more refined question, thanks to the support of Krish.

Problem statement: Gluster volume start - failure.

RPM installation of 3.3.0 on CentOS 6.1 - XFS is the filesystem layer - NFS export, distributed single node, TCP. This server experienced an accidental power loss while in operation. Any help in resolving or debugging the issue is appreciated.

The glusterd logs indicate a failure to resolve the brick:

[2013-06-03 12:03:24.660330] I [glusterd-volume-ops.c:290:glusterd_handle_cli_start_volume] 0-glusterd: Received start vol req for volume gvol1
[2013-06-03 12:03:24.660384] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by 16ee7a4e-ee9b-4543-bd61-9b444100693d
[2013-06-03 12:03:24.660398] I [glusterd-handler.c:463:glusterd_op_txn_begin] 0-management: Acquired local lock
[2013-06-03 12:03:24.842904] E [glusterd-volume-ops.c:842:glusterd_op_stage_start_volume] 0-: Unable to resolve brick 10.0.0.30:/export/brick1
[2013-06-03 12:03:24.842938] E [glusterd-op-sm.c:1999:glusterd_op_ac_send_stage_op] 0-: Staging failed
[2013-06-03 12:03:24.842959] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2013-06-03 12:03:24.842982] I [glusterd-op-sm.c:2653:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

________________________________
From: Krishnan Parthasarathi <kparthas at redhat.com>
To: srinivas jonn <jmsrinivas at yahoo.com>
Sent: Monday, 3 June 2013 4:24 PM
Subject: Re: [Gluster-users] recovering gluster volume || startup failure

Is this a source install or an rpm install? If it is a source install, the logs would be present under <install-prefix>/var/log/glusterfs. Having said that, could you attach the etc-glusterfs-glusterd.log file? Does the gluster CLI print any error messages to the terminal when volume-start fails?

thanks,
krish

----- Original Message -----
> there is no /var/log/glusterfs/.cmd_log_history file.
>
> gluster volume start <volume> - volume start has been unsuccessful
>
> let me know for any specific log, I am trying to debug why the volume is not
> starting -
>
> feel free to copy the gluster-users DL if you think right
>
> ________________________________
> From: Krishnan Parthasarathi <kparthas at redhat.com>
> To: srinivas jonn <jmsrinivas at yahoo.com>
> Cc: gluster-users at gluster.org
> Sent: Monday, 3 June 2013 3:56 PM
> Subject: Re: [Gluster-users] recovering gluster volume || startup failure
>
> Did you run "gluster volume start gvol1"? Could you attach
> /var/log/glusterfs/.cmd_log_history (log file)?
> From the logs you have pasted, it looks like volume-stop is the last command
> you executed.
>
> thanks,
> krish
>
> ----- Original Message -----
> > the volume is not starting - this was the issue.. please let me know the
> > diagnostic or debug procedures,
> >
> > logs:
> >
> > usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x30cac0a443]
> > /usr/sbin/glusterfsd(glusterfs_handle_terminate+0x15) [0x40a955]))) 0-:
> > received signum (15), shutting down
> > [2013-06-02 09:32:16.973895] W [glusterfsd.c:831:cleanup_and_exit]
> > (-->/lib64/libc.so.6(clone+0x6d) [0x3ef56e68ed]
> > (-->/lib64/libpthread.so.0()
> > [0x3ef5a077e1] (-->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xdd)
> > [0x405d4d]))) 0-: received signum (15), shutting down
> >
> > ________________________________
> > From: Krishnan Parthasarathi <kparthas at redhat.com>
> > To: srinivas jonn <jmsrinivas at yahoo.com>
> > Cc: gluster-users at gluster.org
> > Sent: Monday, 3 June 2013 3:27 PM
> > Subject: Re: [Gluster-users] recovering gluster volume || startup failure
> >
> > Srinivas,
> >
> > The volume is in stopped state. You could start the volume by running
> > "gluster volume start gvol1". This should make your attempts at mounting
> > the volume successful.
> >
> > thanks,
> > krish
> >
> > ----- Original Message -----
> > > Krish,
> > > this is giving general volume information; can the state of the volume be known
> > > from any specific logs?
> > > #gluster volume info gvol1
> > > Volume Name: gvol1
> > > Type: Distribute
> > > Volume ID: aa25aa58-d191-432a-a84b-325051347af6
> > > Status: Stopped
> > > Number of Bricks: 1
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: 10.0.0.30:/export/brick1
> > > Options Reconfigured:
> > > nfs.addr-namelookup: off
> > > nfs.port: 2049
> > > [remainder of the quoted thread trimmed; it repeats the original post above]
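For completeness: the staging error above ("Unable to resolve brick 10.0.0.30:/export/brick1") means glusterd could not match the brick's host back to a known node UUID. After an unclean reboot this is often a mismatch between the UUID in /var/lib/glusterd/glusterd.info and the one recorded for the volume (16ee7a4e-ee9b-4543-bd61-9b444100693d in the brick log), or the host no longer holding the 10.0.0.30 address. A hedged checklist, assuming the default /var/lib/glusterd layout and sysvinit on CentOS 6; the brick file name below encodes the brick path and may differ on your node:

    # UUID this glusterd instance is running with; compare with the glusterd-uuid in the brick log
    cat /var/lib/glusterd/glusterd.info

    # the address the brick is registered under must still belong to this host
    ip addr | grep 10.0.0.30

    # brick definition glusterd is trying to resolve
    cat /var/lib/glusterd/vols/gvol1/bricks/10.0.0.30:-export-brick1

    # after correcting the UUID or address, restart glusterd and retry the start
    service glusterd restart
    gluster volume start gvol1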