srinivas jonn
2013-Jun-03 09:34 UTC
[Gluster-users] recovering gluster volume || startup failure
Hello Gluster users:

Sorry for the long post; I have run out of ideas here. Kindly let me know if I am looking at the right places for logs, and any suggested actions. Thanks.

A sudden power loss caused a hard reboot - now the volume does not start.

GlusterFS 3.3.1 on CentOS 6.1, transport: TCP
Sharing the volume over NFS for VM storage - VHD files
Type: distributed - only 1 node (brick)
XFS (LVM)

mount /dev/datastore1/mylv1 /export/brick1 - mounts the VHD files. Is there a way to recover these files?

cat export-brick1.log

[2013-06-02 09:29:00.832914] I [glusterfsd.c:1666:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.3.1
[2013-06-02 09:29:00.845515] I [graph.c:241:gf_add_cmdline_options] 0-gvol1-server: adding option 'listen-port' for volume 'gvol1-server' with value '24009'
[2013-06-02 09:29:00.845558] I [graph.c:241:gf_add_cmdline_options] 0-gvol1-posix: adding option 'glusterd-uuid' for volume 'gvol1-posix' with value '16ee7a4e-ee9b-4543-bd61-9b444100693d'
[2013-06-02 09:29:00.846654] W [options.c:782:xl_opt_validate] 0-gvol1-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction

Given volfile:
+------------------------------------------------------------------------------+
  1: volume gvol1-posix
  2:     type storage/posix
  3:     option directory /export/brick1
  4:     option volume-id aa25aa58-d191-432a-a84b-325051347af6
  5: end-volume
  6:
  7: volume gvol1-access-control
  8:     type features/access-control
  9:     subvolumes gvol1-posix
 10: end-volume
 11:
 12: volume gvol1-locks
 13:     type features/locks
 14:     subvolumes gvol1-access-control
 ----------
 -----------------
 46:     option transport-type tcp
 47:     option auth.login./export/brick1.allow 6c4653bb-b708-46e8-b3f9-177b4cdbbf28
 48:     option auth.login.6c4653bb-b708-46e8-b3f9-177b4cdbbf28.password 091ae3b1-40c2-4d48-8870-6ad7884457ac
 49:     option auth.addr./export/brick1.allow *
 50:     subvolumes /export/brick1
 51: end-volume
+------------------------------------------------------------------------------+
[2013-06-02 09:29:03.963001] W [socket.c:410:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
[2013-06-02 09:29:03.963046] W [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2013-06-02 09:29:04.850120] I [server-handshake.c:571:server_setvolume] 0-gvol1-server: accepted client from iiclab-oel1-9347-2013/06/02-09:29:00:835397-gvol1-client-0-0 (version: 3.3.1)
[2013-06-02 09:32:16.973786] W [glusterfsd.c:831:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x93) [0x30cac0a5b3] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x30cac0a443] (-->/usr/sbin/glusterfsd(glusterfs_handle_terminate+0x15) [0x40a955]))) 0-: received signum (15), shutting down
[2013-06-02 09:32:16.973895] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3ef56e68ed] (-->/lib64/libpthread.so.0() [0x3ef5a077e1] (-->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xdd) [0x405d4d]))) 0-: received signum (15), shutting down

NFS LOG

[2013-06-02 09:29:00.918906] I [rpc-clnt.c:1657:rpc_clnt_reconfig] 0-gvol1-client-0: changing port to 24009 (from 0)
[2013-06-02 09:29:03.963023] W [socket.c:410:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
[2013-06-02 09:29:03.963062] W [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2013-06-02 09:29:04.849941] I [client-handshake.c:1636:select_server_supported_programs] 0-gvol1-client-0: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
[2013-06-02 09:29:04.853016] I [client-handshake.c:1433:client_setvolume_cbk] 0-gvol1-client-0: Connected to 10.0.0.30:24009, attached to remote volume '/export/brick1'.
[2013-06-02 09:29:04.853048] I [client-handshake.c:1445:client_setvolume_cbk] 0-gvol1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2013-06-02 09:29:04.853262] I [client-handshake.c:453:client_set_lk_version_cbk] 0-gvol1-client-0: Server lk version = 1
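Since this is a single-brick distribute volume, the VHD images are stored as ordinary files under the brick directory, so they remain readable from the local filesystem even while the Gluster volume is down. A minimal recovery sketch, assuming the brick device and mount point shown in the post; the backup destination and the *.vhd file names are placeholders:

    # mounting replays the XFS journal; try this before any repair tool
    mount /dev/datastore1/mylv1 /export/brick1

    # if the mount succeeds, the VM images are regular files on the brick
    ls -lh /export/brick1
    cp -a /export/brick1/*.vhd /mnt/backup/    # hypothetical destination

    # only if the mount fails: read-only (no-modify) check of the unmounted filesystem
    xfs_repair -n /dev/datastore1/mylv1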
Krishnan Parthasarathi
2013-Jun-03 09:44 UTC
[Gluster-users] recovering gluster volume || startup failure
Srinivas,

Could you paste the output of "gluster volume info gvol1"? This should give us an idea as to what the state of the volume was before the power loss.

thanks,
krish

----- Original Message -----
> Hello Gluster users:
> Sorry for the long post; I have run out of ideas here. Kindly let me know if I am
> looking at the right places for logs, and any suggested actions. Thanks.
> A sudden power loss caused a hard reboot - now the volume does not start.
> GlusterFS 3.3.1 on CentOS 6.1, transport: TCP
> Sharing the volume over NFS for VM storage - VHD files
> Type: distributed - only 1 node (brick)
> XFS (LVM)
> mount /dev/datastore1/mylv1 /export/brick1 - mounts the VHD files. Is there
> a way to recover these files?
> [brick and NFS logs trimmed; quoted in full in the original post above]
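Besides the CLI output, glusterd also keeps the volume's saved state on disk; with a standard RPM install it can be inspected directly. A quick sketch, assuming the default /var/lib/glusterd layout:

    gluster volume info gvol1

    # on-disk state glusterd loaded at startup (volume status, brick list, options)
    cat /var/lib/glusterd/vols/gvol1/info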
Hello Gluster users,

Thought of posing a more refined question, thanks to the support of Krish.

Problem statement: Gluster volume start - failure.

RPM installation of 3.3.0 on CentOS 6.1 - XFS is the filesystem layer - NFS export, distributed single node, TCP. This server experienced an accidental power loss while in operation. Any help in resolving or debugging the issue is appreciated.

The glusterd logs indicate a failure to resolve the brick:

[2013-06-03 12:03:24.660330] I [glusterd-volume-ops.c:290:glusterd_handle_cli_start_volume] 0-glusterd: Received start vol req for volume gvol1
[2013-06-03 12:03:24.660384] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by 16ee7a4e-ee9b-4543-bd61-9b444100693d
[2013-06-03 12:03:24.660398] I [glusterd-handler.c:463:glusterd_op_txn_begin] 0-management: Acquired local lock
[2013-06-03 12:03:24.842904] E [glusterd-volume-ops.c:842:glusterd_op_stage_start_volume] 0-: Unable to resolve brick 10.0.0.30:/export/brick1
[2013-06-03 12:03:24.842938] E [glusterd-op-sm.c:1999:glusterd_op_ac_send_stage_op] 0-: Staging failed
[2013-06-03 12:03:24.842959] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2013-06-03 12:03:24.842982] I [glusterd-op-sm.c:2653:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

________________________________
From: Krishnan Parthasarathi <kparthas at redhat.com>
To: srinivas jonn <jmsrinivas at yahoo.com>
Sent: Monday, 3 June 2013 4:24 PM
Subject: Re: [Gluster-users] recovering gluster volume || startup failure

Is this a source install or an rpm install? If it is a source install, the logs would be present under <install-prefix>/var/log/glusterfs. Having said that, could you attach the etc-glusterfs-glusterd.log file? Does the gluster CLI print any error messages to the terminal when volume-start fails?

thanks,
krish

----- Original Message -----
> there is no /var/log/glusterfs/.cmd_log_history file.
>
> gluster volume start <volume> - volume start has been unsuccessful
>
> let me know for any specific log, I am trying to debug why the volume is not
> starting -
>
> feel free to copy the gluster-users DL if you think right
>
> ________________________________
> From: Krishnan Parthasarathi <kparthas at redhat.com>
> To: srinivas jonn <jmsrinivas at yahoo.com>
> Cc: gluster-users at gluster.org
> Sent: Monday, 3 June 2013 3:56 PM
> Subject: Re: [Gluster-users] recovering gluster volume || startup failure
>
> Did you run "gluster volume start gvol1"? Could you attach
> /var/log/glusterfs/.cmd_log_history (log file)?
> From the logs you have pasted, it looks like volume-stop is the last command
> you executed.
>
> thanks,
> krish
>
> ----- Original Message -----
> > the volume is not starting - this was the issue.. please let me know the
> > diagnostic or debug procedures,
> >
> > logs:
> >
> > usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x293) [0x30cac0a443]
> > /usr/sbin/glusterfsd(glusterfs_handle_terminate+0x15) [0x40a955]))) 0-:
> > received signum (15), shutting down
> > [2013-06-02 09:32:16.973895] W [glusterfsd.c:831:cleanup_and_exit]
> > (-->/lib64/libc.so.6(clone+0x6d) [0x3ef56e68ed]
> > (-->/lib64/libpthread.so.0()
> > [0x3ef5a077e1] (-->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xdd)
> > [0x405d4d]))) 0-: received signum (15), shutting down
> >
> > ________________________________
> > From: Krishnan Parthasarathi <kparthas at redhat.com>
> > To: srinivas jonn <jmsrinivas at yahoo.com>
> > Cc: gluster-users at gluster.org
> > Sent: Monday, 3 June 2013 3:27 PM
> > Subject: Re: [Gluster-users] recovering gluster volume || startup failure
> >
> > Srinivas,
> >
> > The volume is in stopped state. You could start the volume by running
> > "gluster volume start gvol1". This should make your attempts at mounting
> > the volume successful.
> >
> > thanks,
> > krish
> >
> > ----- Original Message -----
> > > Krish,
> > > this is giving general volume information; can the state of the volume be known
> > > from any specific logs?
> > > #gluster volume info gvol1
> > > Volume Name: gvol1
> > > Type: Distribute
> > > Volume ID: aa25aa58-d191-432a-a84b-325051347af6
> > > Status: Stopped
> > > Number of Bricks: 1
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: 10.0.0.30:/export/brick1
> > > Options Reconfigured:
> > > nfs.addr-namelookup: off
> > > nfs.port: 2049
> > > [remainder of the quoted thread trimmed; it repeats the original post above]
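For completeness: the staging error above ("Unable to resolve brick 10.0.0.30:/export/brick1") means glusterd could not match the brick's host back to a known node UUID. After an unclean reboot this is often a mismatch between the UUID in /var/lib/glusterd/glusterd.info and the one recorded for the volume (16ee7a4e-ee9b-4543-bd61-9b444100693d in the brick log), or the host no longer holding the 10.0.0.30 address. A hedged checklist, assuming the default /var/lib/glusterd layout and sysvinit on CentOS 6; the brick file name below encodes the brick path and may differ on your node:

    # UUID this glusterd instance is running with; compare with the glusterd-uuid in the brick log
    cat /var/lib/glusterd/glusterd.info

    # the address the brick is registered under must still belong to this host
    ip addr | grep 10.0.0.30

    # brick definition glusterd is trying to resolve
    cat /var/lib/glusterd/vols/gvol1/bricks/10.0.0.30:-export-brick1

    # after correcting the UUID or address, restart glusterd and retry the start
    service glusterd restart
    gluster volume start gvol1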