thr3ads.net - Gluster users - [Gluster-users] glusterd crashing [Oct 2015]

Gaurav Garg

2015-Oct-02 15:18 UTC

[Gluster-users] glusterd crashing

>> Pulling those logs now but how do I generate the core file you are
>> asking for?

When glusterd crashes, a core file is generated automatically, provided your
*ulimit* core-size setting allows it. You can find the core file in your root
directory, in the current working directory, or wherever you have set your
core dump file location. The core file tells you where exactly the crash
happened.
You can find the matching core file by looking up the crash time in the
glusterd logs: search for the "crash" keyword. You can also paste the few
lines just above the latest "crash" keyword in the glusterd logs.
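
The log search described above can be sketched as follows. This is only an illustration: the function name show_crash_context and the log glob are assumptions (the glob is based on the log path mentioned in this thread), and the -B option is a GNU/BSD grep extension rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: show each line containing "crash" in a glusterd log, plus the
# lines just above it. The function name and defaults are illustrative.
show_crash_context() {
    log_file=$1
    lines_before=${2:-20}   # how many preceding lines to print per match
    grep -n -B "$lines_before" 'crash' "$log_file"
}

# Assumed log location (based on the path mentioned in this thread;
# adjust it for your install).
for candidate in /var/log/glusterfs/*glusterd.vol.log; do
    if [ -r "$candidate" ]; then
        show_crash_context "$candidate" || echo "no crash marker in $candidate"
    fi
done
```

The timestamp printed next to the match can then be compared with the modification time of the core files on the node.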

Just for your curiosity, if you want to see where it crashed, you can debug
it with: # gdb -c <location of core file> glusterd
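
A minimal sketch of these checks, assuming a Linux node with gdb installed. The helper name print_backtrace and the default binary path /usr/sbin/glusterd are assumptions for illustration and are not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Core-size limit of the current shell; 0 means no core file is written.
echo "core size limit: $(ulimit -c)"

# On Linux, core_pattern controls where core files go and how they are named.
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
fi

# Hypothetical helper: print a backtrace of every thread from a core file.
# $1 = path to the core file
# $2 = path to the binary that crashed (the default is an assumption)
print_backtrace() {
    core_file=$1
    binary=${2:-/usr/sbin/glusterd}
    gdb -batch -ex 'thread apply all bt' "$binary" "$core_file"
}
```

For example, `print_backtrace /path/to/core` prints the backtrace non-interactively so it can be pasted into a reply. Keep in mind that the limit that matters is the one the glusterd process was started with, which may differ from the limit of your interactive shell.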

Thank you...

Regards,
Gaurav  

----- Original Message -----
From: "Gene Liverman" <gliverma at westga.edu>
To: "Gaurav Garg" <ggarg at redhat.com>
Cc: "gluster-users" <gluster-users at gluster.org>
Sent: Friday, October 2, 2015 8:28:49 PM
Subject: Re: [Gluster-users] glusterd crashing

Pulling those logs now but how do I generate the core file you are asking
for?





--
*Gene Liverman*
Systems Integration Architect
Information Technology Services
University of West Georgia
gliverma at westga.edu
678.839.5492

ITS: Making Technology Work for You!




On Fri, Oct 2, 2015 at 2:25 AM, Gaurav Garg <ggarg at redhat.com> wrote:
> Hi Gene,
>
> You have pasted the glustershd log; we asked you to paste the glusterd log.
> glusterd and glustershd are different processes. With this information
> we can't find out why your glusterd crashed. Could you paste the *glusterd*
> logs (/var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log*) in
> pastebin (not in this mail thread) and give the pastebin link in this
> mail thread? Can you also attach the core file, or paste the backtrace of
> that core dump file?
> It would be great if you could give us an sos report of the node where the
> crash happened.
>
> Thanx,
>
> ~Gaurav
>
> ----- Original Message -----
> From: "Gene Liverman" <gliverma at westga.edu>
> To: "gluster-users" <gluster-users at gluster.org>
> Sent: Friday, October 2, 2015 4:47:00 AM
> Subject: Re: [Gluster-users] glusterd crashing
>
> Sorry for the delay. Here is what's installed:
> # rpm -qa | grep gluster
> glusterfs-geo-replication-3.7.4-2.el6.x86_64
> glusterfs-client-xlators-3.7.4-2.el6.x86_64
> glusterfs-3.7.4-2.el6.x86_64
> glusterfs-libs-3.7.4-2.el6.x86_64
> glusterfs-api-3.7.4-2.el6.x86_64
> glusterfs-fuse-3.7.4-2.el6.x86_64
> glusterfs-server-3.7.4-2.el6.x86_64
> glusterfs-cli-3.7.4-2.el6.x86_64
>
> The cmd_history.log file is attached.
> In gluster.log I have filtered out a bunch of lines like the one below to
> make it more readable. I had a node down for multiple days due to
> maintenance and another one went down due to a hardware failure during that
> time too.
> [2015-10-01 00:16:09.643631] W [MSGID: 114031]
> [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-gv0-client-0: remote
> operation failed. Path: <gfid:31f17f8c-6c96-4440-88c0-f813b3c8d364>
> (31f17f8c-6c96-4440-88c0-f813b3c8d364) [No such file or directory]
>
> I also filtered out a boatload of self-heal lines like these two:
> [2015-10-01 15:14:14.851015] I [MSGID: 108026]
> [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-gv0-replicate-0:
> performing metadata selfheal on f78a47db-a359-430d-a655-1d217eb848c3
> [2015-10-01 15:14:14.856392] I [MSGID: 108026]
> [afr-self-heal-common.c:651:afr_log_selfheal] 0-gv0-replicate-0: Completed
> metadata selfheal on f78a47db-a359-430d-a655-1d217eb848c3. source=0 sinks=1
>
>
> [root@eapps-gluster01 glusterfs]# cat glustershd.log | grep -v 'remote operation failed' | grep -v 'self-heal'
> [2015-09-27 08:46:56.893125] E [rpc-clnt.c:201:call_bail] 0-glusterfs:
> bailing out frame type(GlusterFS Handshake) op(GETSPEC(2)) xid = 0x6 sent
> 2015-09-27 08:16:51.742731. timeout = 1800 for 127.0.0.1:24007
> [2015-09-28 12:54:17.524924] W [socket.c:588:__socket_rwv] 0-glusterfs:
> readv on 127.0.0.1:24007 failed (Connection reset by peer)
> [2015-09-28 12:54:27.844374] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile, continuing
> [2015-09-28 12:57:03.485027] W [socket.c:588:__socket_rwv] 0-gv0-client-2:
> readv on 160.10.31.227:24007 failed (Connection reset by peer)
> [2015-09-28 12:57:05.872973] E [socket.c:2278:socket_connect_finish]
> 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> refused)
> [2015-09-28 12:57:38.490578] W [socket.c:588:__socket_rwv] 0-glusterfs:
> readv on 127.0.0.1:24007 failed (No data available)
> [2015-09-28 12:57:49.054475] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile, continuing
> [2015-09-28 13:01:12.062960] W [glusterfsd.c:1219:cleanup_and_exit]
> (-->/lib64/libpthread.so.0() [0x3c65e07a51]
> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x405e4d]
> -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received
> signum (15), shutting down
> [2015-09-28 13:01:12.981945] I [MSGID: 100030] [glusterfsd.c:2301:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.4
> (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p
> /var/lib/glusterd/glustershd/run/glustershd.pid -l
> /var/log/glusterfs/glustershd.log -S
> /var/run/gluster/9a9819e90404187e84e67b01614bbe10.socket --xlator-option
> *replicate*.node-uuid=416d712a-06fc-4b3c-a92f-8c82145626ff)
> [2015-09-28 13:01:13.009171] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2015-09-28 13:01:13.092483] I [graph.c:269:gf_add_cmdline_options]
> 0-gv0-replicate-0: adding option 'node-uuid' for volume 'gv0-replicate-0'
> with value '416d712a-06fc-4b3c-a92f-8c82145626ff'
> [2015-09-28 13:01:13.100856] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 2
> [2015-09-28 13:01:13.103995] I [MSGID: 114020] [client.c:2118:notify]
> 0-gv0-client-0: parent translators are ready, attempting connect on
> transport
> [2015-09-28 13:01:13.114745] I [MSGID: 114020] [client.c:2118:notify]
> 0-gv0-client-1: parent translators are ready, attempting connect on
> transport
> [2015-09-28 13:01:13.115725] I [rpc-clnt.c:1851:rpc_clnt_reconfig]
> 0-gv0-client-0: changing port to 49152 (from 0)
> [2015-09-28 13:01:13.125619] I [MSGID: 114020] [client.c:2118:notify]
> 0-gv0-client-2: parent translators are ready, attempting connect on
> transport
> [2015-09-28 13:01:13.132316] E [socket.c:2278:socket_connect_finish]
> 0-gv0-client-1: connection to 160.10.31.64:24007 failed (Connection
> refused)
> [2015-09-28 13:01:13.132650] I [MSGID: 114057]
> [client-handshake.c:1437:select_server_supported_programs] 0-gv0-client-0:
> Using Program GlusterFS 3.3, Num (1298437), Version (330)
> [2015-09-28 13:01:13.133322] I [MSGID: 114046]
> [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-0: Connected to
> gv0-client-0, attached to remote volume '/export/sdb1/gv0'.
> [2015-09-28 13:01:13.133365] I [MSGID: 114047]
> [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-0: Server and
> Client lk-version numbers are not same, reopening the fds
> [2015-09-28 13:01:13.133782] I [MSGID: 108005]
> [afr-common.c:3998:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0'
> came back up; going online.
> [2015-09-28 13:01:13.133863] I [MSGID: 114035]
> [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-0: Server
> lk version = 1
> Final graph:
>
>
> +------------------------------------------------------------------------------+
> 1: volume gv0-client-0
> 2: type protocol/client
> 3: option clnt-lk-version 1
> 4: option volfile-checksum 0
> 5: option volfile-key gluster/glustershd
> 6: option client-version 3.7.4
> 7: option process-uuid
> eapps-gluster01-65147-2015/09/28-13:01:12:970131-gv0-client-0-0-0
> 8: option fops-version 1298437
> 9: option ping-timeout 42
> 10: option remote-host eapps-gluster01.uwg.westga.edu
> 11: option remote-subvolume /export/sdb1/gv0
> 12: option transport-type socket
> 13: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> 14: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> 15: end-volume
> 16:
> 17: volume gv0-client-1
> 18: type protocol/client
> 19: option ping-timeout 42
> 20: option remote-host eapps-gluster02.uwg.westga.edu
> 21: option remote-subvolume /export/sdb1/gv0
> 22: option transport-type socket
> 23: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> 24: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> 25: end-volume
> 26:
> 27: volume gv0-client-2
> 28: type protocol/client
> 29: option ping-timeout 42
> 30: option remote-host eapps-gluster03.uwg.westga.edu
> 31: option remote-subvolume /export/sdb1/gv0
> 32: option transport-type socket
> 33: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> 34: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> 35: end-volume
> 36:
> 37: volume gv0-replicate-0
> 38: type cluster/replicate
> 39: option node-uuid 416d712a-06fc-4b3c-a92f-8c82145626ff
> 46: subvolumes gv0-client-0 gv0-client-1 gv0-client-2
> 47: end-volume
> 48:
> 49: volume glustershd
> 50: type debug/io-stats
> 51: subvolumes gv0-replicate-0
> 52: end-volume
> 53:
>
>
> +------------------------------------------------------------------------------+
> [2015-09-28 13:01:13.154898] E [MSGID: 114058]
> [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-2: failed
> to get the port number for remote subvolume. Please run 'gluster volume
> status' on server to see if brick process is running.
> [2015-09-28 13:01:13.155031] I [MSGID: 114018]
> [client.c:2042:client_rpc_notify] 0-gv0-client-2: disconnected from
> gv0-client-2. Client process will keep trying to connect to glusterd until
> brick's port is available
> [2015-09-28 13:01:13.155080] W [MSGID: 108001]
> [afr-common.c:4081:afr_notify] 0-gv0-replicate-0: Client-quorum is not met
> [2015-09-29 08:11:24.728797] I [MSGID: 100011]
> [glusterfsd.c:1291:reincarnate] 0-glusterfsd: Fetching the volume file from
> server...
> [2015-09-29 08:11:24.763338] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile, continuing
> [2015-09-29 12:50:41.915254] E [rpc-clnt.c:201:call_bail] 0-gv0-client-2:
> bailing out frame type(GF-DUMP) op(DUMP(1)) xid = 0xd91f sent = 2015-09-29
> 12:20:36.092734. timeout = 1800 for 160.10.31.227:24007
> [2015-09-29 12:50:41.923550] W [MSGID: 114032]
> [client-handshake.c:1623:client_dump_version_cbk] 0-gv0-client-2: received
> RPC status error [Transport endpoint is not connected]
> [2015-09-30 23:54:36.547979] W [socket.c:588:__socket_rwv] 0-glusterfs:
> readv on 127.0.0.1:24007 failed (No data available)
> [2015-09-30 23:54:46.812870] E [socket.c:2278:socket_connect_finish]
> 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused)
> [2015-10-01 00:14:20.997081] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile, continuing
> [2015-10-01 00:15:36.770579] W [socket.c:588:__socket_rwv] 0-gv0-client-2:
> readv on 160.10.31.227:24007 failed (Connection reset by peer)
> [2015-10-01 00:15:37.906708] E [socket.c:2278:socket_connect_finish]
> 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> refused)
> [2015-10-01 00:15:53.008130] W [glusterfsd.c:1219:cleanup_and_exit]
> (-->/lib64/libpthread.so.0() [0x3b91807a51]
> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x405e4d]
> -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received
> signum (15), shutting down
> [2015-10-01 00:15:53.008697] I [timer.c:48:gf_timer_call_after]
> (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3e2) [0x3b9480f992]
> -->/usr/lib64/libgfrpc.so.0(__save_frame+0x76) [0x3b9480f046]
> -->/usr/lib64/libglusterfs.so.0(gf_timer_call_after+0x1b1) [0x3b93447881] )
> 0-timer: ctx cleanup started
> [2015-10-01 00:15:53.994698] I [MSGID: 100030] [glusterfsd.c:2301:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.4
> (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p
> /var/lib/glusterd/glustershd/run/glustershd.pid -l
> /var/log/glusterfs/glustershd.log -S
> /var/run/gluster/9a9819e90404187e84e67b01614bbe10.socket --xlator-option
> *replicate*.node-uuid=416d712a-06fc-4b3c-a92f-8c82145626ff)
> [2015-10-01 00:15:54.020401] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2015-10-01 00:15:54.086777] I [graph.c:269:gf_add_cmdline_options]
> 0-gv0-replicate-0: adding option 'node-uuid' for volume 'gv0-replicate-0'
> with value '416d712a-06fc-4b3c-a92f-8c82145626ff'
> [2015-10-01 00:15:54.093004] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 2
> [2015-10-01 00:15:54.098144] I [MSGID: 114020] [client.c:2118:notify]
> 0-gv0-client-0: parent translators are ready, attempting connect on
> transport
> [2015-10-01 00:15:54.107432] I [MSGID: 114020] [client.c:2118:notify]
> 0-gv0-client-1: parent translators are ready, attempting connect on
> transport
> [2015-10-01 00:15:54.115962] I [MSGID: 114020] [client.c:2118:notify]
> 0-gv0-client-2: parent translators are ready, attempting connect on
> transport
> [2015-10-01 00:15:54.120474] E [socket.c:2278:socket_connect_finish]
> 0-gv0-client-1: connection to 160.10.31.64:24007 failed (Connection
> refused)
> [2015-10-01 00:15:54.120639] I [rpc-clnt.c:1851:rpc_clnt_reconfig]
> 0-gv0-client-0: changing port to 49152 (from 0)
> Final graph:
>
>
> +------------------------------------------------------------------------------+
> 1: volume gv0-client-0
> 2: type protocol/client
> 3: option ping-timeout 42
> 4: option remote-host eapps-gluster01.uwg.westga.edu
> 5: option remote-subvolume /export/sdb1/gv0
> 6: option transport-type socket
> 7: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> 8: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> 9: end-volume
> 10:
> 11: volume gv0-client-1
> 12: type protocol/client
> 13: option ping-timeout 42
> 14: option remote-host eapps-gluster02.uwg.westga.edu
> 15: option remote-subvolume /export/sdb1/gv0
> 16: option transport-type socket
> 17: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> 18: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> 19: end-volume
> 20:
> 21: volume gv0-client-2
> 22: type protocol/client
> 23: option ping-timeout 42
> 24: option remote-host eapps-gluster03.uwg.westga.edu
> 25: option remote-subvolume /export/sdb1/gv0
> 26: option transport-type socket
> 27: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> 28: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> 29: end-volume
> 30:
> 31: volume gv0-replicate-0
> 32: type cluster/replicate
> 33: option node-uuid 416d712a-06fc-4b3c-a92f-8c82145626ff
> 40: subvolumes gv0-client-0 gv0-client-1 gv0-client-2
> 41: end-volume
> 42:
> 43: volume glustershd
> 44: type debug/io-stats
> 45: subvolumes gv0-replicate-0
> 46: end-volume
> 47:
>
>
> +------------------------------------------------------------------------------+
> [2015-10-01 00:15:54.135650] I [MSGID: 114057]
> [client-handshake.c:1437:select_server_supported_programs] 0-gv0-client-0:
> Using Program GlusterFS 3.3, Num (1298437), Version (330)
> [2015-10-01 00:15:54.136223] I [MSGID: 114046]
> [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-0: Connected to
> gv0-client-0, attached to remote volume '/export/sdb1/gv0'.
> [2015-10-01 00:15:54.136262] I [MSGID: 114047]
> [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-0: Server and
> Client lk-version numbers are not same, reopening the fds
> [2015-10-01 00:15:54.136410] I [MSGID: 108005]
> [afr-common.c:3998:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0'
> came back up; going online.
> [2015-10-01 00:15:54.136500] I [MSGID: 114035]
> [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-0: Server
> lk version = 1
> [2015-10-01 00:15:54.401702] E [MSGID: 114058]
> [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-2: failed
> to get the port number for remote subvolume. Please run 'gluster volume
> status' on server to see if brick process is running.
> [2015-10-01 00:15:54.401834] I [MSGID: 114018]
> [client.c:2042:client_rpc_notify] 0-gv0-client-2: disconnected from
> gv0-client-2. Client process will keep trying to connect to glusterd until
> brick's port is available
> [2015-10-01 00:15:54.401878] W [MSGID: 108001]
> [afr-common.c:4081:afr_notify] 0-gv0-replicate-0: Client-quorum is not met
> [2015-10-01 03:57:52.755426] E [socket.c:2278:socket_connect_finish]
> 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> refused)
> [2015-10-01 13:50:49.000708] E [socket.c:2278:socket_connect_finish]
> 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> timed out)
> [2015-10-01 14:36:40.481673] E [MSGID: 114058]
> [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-1: failed
> to get the port number for remote subvolume. Please run 'gluster volume
> status' on server to see if brick process is running.
> [2015-10-01 14:36:40.481833] I [MSGID: 114018]
> [client.c:2042:client_rpc_notify] 0-gv0-client-1: disconnected from
> gv0-client-1. Client process will keep trying to connect to glusterd until
> brick's port is available
> [2015-10-01 14:36:41.982037] I [rpc-clnt.c:1851:rpc_clnt_reconfig]
> 0-gv0-client-1: changing port to 49152 (from 0)
> [2015-10-01 14:36:41.993478] I [MSGID: 114057]
> [client-handshake.c:1437:select_server_supported_programs] 0-gv0-client-1:
> Using Program GlusterFS 3.3, Num (1298437), Version (330)
> [2015-10-01 14:36:41.994568] I [MSGID: 114046]
> [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-1: Connected to
> gv0-client-1, attached to remote volume '/export/sdb1/gv0'.
> [2015-10-01 14:36:41.994647] I [MSGID: 114047]
> [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-1: Server and
> Client lk-version numbers are not same, reopening the fds
> [2015-10-01 14:36:41.994899] I [MSGID: 108002]
> [afr-common.c:4077:afr_notify] 0-gv0-replicate-0: Client-quorum is met
> [2015-10-01 14:36:42.002275] I [MSGID: 114035]
> [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-1: Server
> lk version = 1
>
>
>
>
> Thanks,
> Gene Liverman
> Systems Integration Architect
> Information Technology Services
> University of West Georgia
> gliverma at westga.edu
>
> ITS: Making Technology Work for You!
>
>
>
> On Wed, Sep 30, 2015 at 10:54 PM, Gaurav Garg < ggarg at redhat.com > wrote:
>
>
> Hi Gene,
>
> Could you paste or attach the core file, the glusterd log file, and the cmd
> history so we can find the actual root cause of the crash? What steps did
> you perform before this crash?
>
> >> How can I troubleshoot this?
>
> If you want to troubleshoot this, you can look into the glusterd log
> file and the core file.
>
> Thank you..
>
> Regards,
> Gaurav
>
> ----- Original Message -----
> From: "Gene Liverman" < gliverma at westga.edu >
> To: gluster-users at gluster.org
> Sent: Thursday, October 1, 2015 7:59:47 AM
> Subject: [Gluster-users] glusterd crashing
>
> In the last few days I've started having issues with my glusterd service
> crashing. When it goes down it seems to do so on all nodes in my replicated
> volume. How can I troubleshoot this? I'm on a mix of CentOS 6 and RHEL 6.
> Thanks!
>
>
>
> Gene Liverman
> Systems Integration Architect
> Information Technology Services
> University of West Georgia
> gliverma at westga.edu
>
>
> Sent from Outlook on my iPhone
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>

Gene Liverman

2015-Oct-06 14:45 UTC

[Gluster-users] glusterd crashing

Sorry for the delay... the joys of multiple proverbial fires at once. In
/var/log/messages I found this for our most recent crash:

Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: pending frames:
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: patchset: git://git.gluster.com/glusterfs.git
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: signal received: 6
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: time of crash:
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: 2015-10-03 04:26:21
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: configuration details:
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: argp 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: backtrace 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: dlfcn 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: libpthread 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: llistxattr 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: setfsid 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: spinlock 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: epoll.h 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: xattr.h 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: st_atim.tv_nsec 1
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: package-string: glusterfs 3.7.4
Oct  3 00:26:21 eapps-gluster01 etc-glusterfs-glusterd.vol[36992]: ---------
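
For readers following along, the "signal received: 6" line in the log above can be translated into a signal name with the shell's kill builtin. On Linux, signal 6 is SIGABRT, which is commonly raised when a process calls abort(). Whether that is what happened here cannot be told from this excerpt alone. A small sketch:

```shell
#!/bin/sh
# Translate the signal number from the "signal received: 6" log line
# into its symbolic name.
signal_number=6
signal_name=$(kill -l "$signal_number")
echo "signal $signal_number is SIG$signal_name"
```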


I have posted etc-glusterfs-glusterd.vol.log to http://pastebin.com/Pzq1j5J3.
I also put the core file and an sosreport on my web server for you but
don't want to leave them there for long, so I'd appreciate it if you'd let
me know once you get them. They are at the following URLs:
http://www.westga.edu/~gliverma/tmp-files/core.36992
http://www.westga.edu/~gliverma/tmp-files/sosreport-gliverman.gluster-crashing-20151006101239.tar.xz
http://www.westga.edu/~gliverma/tmp-files/sosreport-gliverman.gluster-crashing-20151006101239.tar.xz.md5




Thanks again for the help!
*Gene Liverman*
Systems Integration Architect
Information Technology Services
University of West Georgia
gliverma at westga.edu

ITS: Making Technology Work for You!




On Fri, Oct 2, 2015 at 11:18 AM, Gaurav Garg <ggarg at redhat.com> wrote:
> >> Pulling those logs now but how do I generate the core file you are
> asking
> for?
>
> When there is crash then core file automatically generated based on your
> *ulimit* set option. you can find location of core file in your root or
> current working directory or where ever you have set your core dump file
> location. core file gives you information regarding crash, where exactly
> crash happened.
> you can find appropriate core file by looking at crash time in glusterd
> log's by searching "crash" keyword. you can also paste few
line's just
> above latest "crash" keyword in glusterd logs.
>
> Just for your curiosity if you willing to look where it crash then you can
> debug it by #gdb -c <location of core file> glusterd
>
> Thank you...
>
> Regards,
> Gaurav
>
> ----- Original Message -----
> From: "Gene Liverman" <gliverma at westga.edu>
> To: "Gaurav Garg" <ggarg at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Friday, October 2, 2015 8:28:49 PM
> Subject: Re: [Gluster-users] glusterd crashing
>
> Pulling those logs now but how do I generate the core file you are asking
> for?
>
>
>
>
>
> --
> *Gene Liverman*
> Systems Integration Architect
> Information Technology Services
> University of West Georgia
> gliverma at westga.edu
> 678.839.5492
>
> ITS: Making Technology Work for You!
>
>
>
>
> On Fri, Oct 2, 2015 at 2:25 AM, Gaurav Garg <ggarg at redhat.com>
wrote:
>
> > Hi Gene,
> >
> > you have paste glustershd log. we asked you to paste glusterd log.
> > glusterd and glustershd both are different process. with this
information
> > we can't find out why your glusterd crashed. could you paste
*glusterd*
> > logs (/var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log*) in
> > pastebin (not in this mail thread) and give the link of pastebin in
this
> > mail thread. Can you also attach core file or you can paste backtrace
of
> > that core dump file.
> > It will be great if you give us sos report of the node where the crash
> > happen.
> >
> > Thanx,
> >
> > ~Gaurav
> >
> > ----- Original Message -----
> > From: "Gene Liverman" <gliverma at westga.edu>
> > To: "gluster-users" <gluster-users at gluster.org>
> > Sent: Friday, October 2, 2015 4:47:00 AM
> > Subject: Re: [Gluster-users] glusterd crashing
> >
> > Sorry for the delay. Here is what's installed:
> > # rpm -qa | grep gluster
> > glusterfs-geo-replication-3.7.4-2.el6.x86_64
> > glusterfs-client-xlators-3.7.4-2.el6.x86_64
> > glusterfs-3.7.4-2.el6.x86_64
> > glusterfs-libs-3.7.4-2.el6.x86_64
> > glusterfs-api-3.7.4-2.el6.x86_64
> > glusterfs-fuse-3.7.4-2.el6.x86_64
> > glusterfs-server-3.7.4-2.el6.x86_64
> > glusterfs-cli-3.7.4-2.el6.x86_64
> >
> > The cmd_history.log file is attached.
> > In gluster.log I have filtered out a bunch of lines like the one below
> due
> > to make them more readable. I had a node down for multiple days due to
> > maintenance and another one went down due to a hardware failure during
> that
> > time too.
> > [2015-10-01 00:16:09.643631] W [MSGID: 114031]
> > [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-gv0-client-0: remote
> > operation failed. Path:
<gfid:31f17f8c-6c96-4440-88c0-f813b3c8d364>
> > (31f17f8c-6c96-4440-88c0-f813b3c8d364) [No such file or directory]
> >
> > I also filtered out a boat load of self heal lines like these two:
> > [2015-10-01 15:14:14.851015] I [MSGID: 108026]
> > [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do]
> 0-gv0-replicate-0:
> > performing metadata selfheal on f78a47db-a359-430d-a655-1d217eb848c3
> > [2015-10-01 15:14:14.856392] I [MSGID: 108026]
> > [afr-self-heal-common.c:651:afr_log_selfheal] 0-gv0-replicate-0:
> Completed
> > metadata selfheal on f78a47db-a359-430d-a655-1d217eb848c3. source=0
> sinks=1
> >
> >
> > [root at eapps-gluster01 glusterfs]# cat glustershd.log |grep -v
'remote
> > operation failed' |grep -v 'self-heal'
> > [2015-09-27 08:46:56.893125] E [rpc-clnt.c:201:call_bail] 0-glusterfs:
> > bailing out frame type(GlusterFS Handshake) op(GETSPEC(2)) xid = 0x6
> sent > > 2015-09-27 08:16:51.742731. timeout = 1800 for
127.0.0.1:24007
> > [2015-09-28 12:54:17.524924] W [socket.c:588:__socket_rwv]
0-glusterfs:
> > readv on 127.0.0.1:24007 failed (Connection reset by peer)
> > [2015-09-28 12:54:27.844374] I
[glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> > 0-glusterfs: No change in volfile, continuing
> > [2015-09-28 12:57:03.485027] W [socket.c:588:__socket_rwv]
> 0-gv0-client-2:
> > readv on 160.10.31.227:24007 failed (Connection reset by peer)
> > [2015-09-28 12:57:05.872973] E [socket.c:2278:socket_connect_finish]
> > 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> > refused)
> > [2015-09-28 12:57:38.490578] W [socket.c:588:__socket_rwv]
0-glusterfs:
> > readv on 127.0.0.1:24007 failed (No data available)
> > [2015-09-28 12:57:49.054475] I
[glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> > 0-glusterfs: No change in volfile, continuing
> > [2015-09-28 13:01:12.062960] W [glusterfsd.c:1219:cleanup_and_exit]
> > (-->/lib64/libpthread.so.0() [0x3c65e07a51]
> > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x405e4d]
> > -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-:
received
> > signum (15), shutting down
> > [2015-09-28 13:01:12.981945] I [MSGID: 100030]
[glusterfsd.c:2301:main]
> > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
3.7.4
> > (args: /usr/sbin/glusterfs -s localhost --volfile-id
gluster/glustershd
> -p
> > /var/lib/glusterd/glustershd/run/glustershd.pid -l
> > /var/log/glusterfs/glustershd.log -S
> > /var/run/gluster/9a9819e90404187e84e67b01614bbe10.socket
--xlator-option
> > *replicate*.node-uuid=416d712a-06fc-4b3c-a92f-8c82145626ff)
> > [2015-09-28 13:01:13.009171] I [MSGID: 101190]
> > [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread
> > with index 1
> > [2015-09-28 13:01:13.092483] I [graph.c:269:gf_add_cmdline_options]
> > 0-gv0-replicate-0: adding option 'node-uuid' for volume
'gv0-replicate-0'
> > with value '416d712a-06fc-4b3c-a92f-8c82145626ff'
> > [2015-09-28 13:01:13.100856] I [MSGID: 101190]
> > [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread
> > with index 2
> > [2015-09-28 13:01:13.103995] I [MSGID: 114020] [client.c:2118:notify]
> > 0-gv0-client-0: parent translators are ready, attempting connect on
> > transport
> > [2015-09-28 13:01:13.114745] I [MSGID: 114020] [client.c:2118:notify]
> > 0-gv0-client-1: parent translators are ready, attempting connect on
> > transport
> > [2015-09-28 13:01:13.115725] I [rpc-clnt.c:1851:rpc_clnt_reconfig]
> > 0-gv0-client-0: changing port to 49152 (from 0)
> > [2015-09-28 13:01:13.125619] I [MSGID: 114020] [client.c:2118:notify]
> > 0-gv0-client-2: parent translators are ready, attempting connect on
> > transport
> > [2015-09-28 13:01:13.132316] E [socket.c:2278:socket_connect_finish]
> > 0-gv0-client-1: connection to 160.10.31.64:24007 failed (Connection
> > refused)
> > [2015-09-28 13:01:13.132650] I [MSGID: 114057]
> > [client-handshake.c:1437:select_server_supported_programs]
> 0-gv0-client-0:
> > Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2015-09-28 13:01:13.133322] I [MSGID: 114046]
> > [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-0:
Connected
> to
> > gv0-client-0, attached to remote volume '/export/sdb1/gv0'.
> > [2015-09-28 13:01:13.133365] I [MSGID: 114047]
> > [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-0: Server
and
> > Client lk-version numbers are not same, reopening the fds
> > [2015-09-28 13:01:13.133782] I [MSGID: 108005]
> > [afr-common.c:3998:afr_notify] 0-gv0-replicate-0: Subvolume
> 'gv0-client-0'
> > came back up; going online.
> > [2015-09-28 13:01:13.133863] I [MSGID: 114035]
> > [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-0:
Server
> > lk version = 1
> > Final graph:
> >
> >
>
+------------------------------------------------------------------------------+
> > 1: volume gv0-client-0
> > 2: type protocol/client
> > 3: option clnt-lk-version 1
> > 4: option volfile-checksum 0
> > 5: option volfile-key gluster/glustershd
> > 6: option client-version 3.7.4
> > 7: option process-uuid
> > eapps-gluster01-65147-2015/09/28-13:01:12:970131-gv0-client-0-0-0
> > 8: option fops-version 1298437
> > 9: option ping-timeout 42
> > 10: option remote-host eapps-gluster01.uwg.westga.edu
> > 11: option remote-subvolume /export/sdb1/gv0
> > 12: option transport-type socket
> > 13: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> > 14: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> > 15: end-volume
> > 16:
> > 17: volume gv0-client-1
> > 18: type protocol/client
> > 19: option ping-timeout 42
> > 20: option remote-host eapps-gluster02.uwg.westga.edu
> > 21: option remote-subvolume /export/sdb1/gv0
> > 22: option transport-type socket
> > 23: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> > 24: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> > 25: end-volume
> > 26:
> > 27: volume gv0-client-2
> > 28: type protocol/client
> > 29: option ping-timeout 42
> > 30: option remote-host eapps-gluster03.uwg.westga.edu
> > 31: option remote-subvolume /export/sdb1/gv0
> > 32: option transport-type socket
> > 33: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> > 34: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> > 35: end-volume
> > 36:
> > 37: volume gv0-replicate-0
> > 38: type cluster/replicate
> > 39: option node-uuid 416d712a-06fc-4b3c-a92f-8c82145626ff
> > 46: subvolumes gv0-client-0 gv0-client-1 gv0-client-2
> > 47: end-volume
> > 48:
> > 49: volume glustershd
> > 50: type debug/io-stats
> > 51: subvolumes gv0-replicate-0
> > 52: end-volume
> > 53:
> > +------------------------------------------------------------------------------+
> > [2015-09-28 13:01:13.154898] E [MSGID: 114058]
> > [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-2: failed
> > to get the port number for remote subvolume. Please run 'gluster volume
> > status' on server to see if brick process is running.
> > [2015-09-28 13:01:13.155031] I [MSGID: 114018]
> > [client.c:2042:client_rpc_notify] 0-gv0-client-2: disconnected from
> > gv0-client-2. Client process will keep trying to connect to glusterd until
> > brick's port is available
> > [2015-09-28 13:01:13.155080] W [MSGID: 108001]
> > [afr-common.c:4081:afr_notify] 0-gv0-replicate-0: Client-quorum is not met
> > [2015-09-29 08:11:24.728797] I [MSGID: 100011]
> > [glusterfsd.c:1291:reincarnate] 0-glusterfsd: Fetching the volume file from
> > server...
> > [2015-09-29 08:11:24.763338] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> > 0-glusterfs: No change in volfile, continuing
> > [2015-09-29 12:50:41.915254] E [rpc-clnt.c:201:call_bail] 0-gv0-client-2:
> > bailing out frame type(GF-DUMP) op(DUMP(1)) xid = 0xd91f sent = 2015-09-29
> > 12:20:36.092734. timeout = 1800 for 160.10.31.227:24007
> > [2015-09-29 12:50:41.923550] W [MSGID: 114032]
> > [client-handshake.c:1623:client_dump_version_cbk] 0-gv0-client-2: received
> > RPC status error [Transport endpoint is not connected]
> > [2015-09-30 23:54:36.547979] W [socket.c:588:__socket_rwv] 0-glusterfs:
> > readv on 127.0.0.1:24007 failed (No data available)
> > [2015-09-30 23:54:46.812870] E [socket.c:2278:socket_connect_finish]
> > 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused)
> > [2015-10-01 00:14:20.997081] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk]
> > 0-glusterfs: No change in volfile, continuing
> > [2015-10-01 00:15:36.770579] W [socket.c:588:__socket_rwv] 0-gv0-client-2:
> > readv on 160.10.31.227:24007 failed (Connection reset by peer)
> > [2015-10-01 00:15:37.906708] E [socket.c:2278:socket_connect_finish]
> > 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> > refused)
> > [2015-10-01 00:15:53.008130] W [glusterfsd.c:1219:cleanup_and_exit]
> > (-->/lib64/libpthread.so.0() [0x3b91807a51]
> > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x405e4d]
> > -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received
> > signum (15), shutting down
> > [2015-10-01 00:15:53.008697] I [timer.c:48:gf_timer_call_after]
> > (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3e2) [0x3b9480f992]
> > -->/usr/lib64/libgfrpc.so.0(__save_frame+0x76) [0x3b9480f046]
> > -->/usr/lib64/libglusterfs.so.0(gf_timer_call_after+0x1b1) [0x3b93447881] )
> > 0-timer: ctx cleanup started
> > [2015-10-01 00:15:53.994698] I [MSGID: 100030] [glusterfsd.c:2301:main]
> > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.4
> > (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p
> > /var/lib/glusterd/glustershd/run/glustershd.pid -l
> > /var/log/glusterfs/glustershd.log -S
> > /var/run/gluster/9a9819e90404187e84e67b01614bbe10.socket --xlator-option
> > *replicate*.node-uuid=416d712a-06fc-4b3c-a92f-8c82145626ff)
> > [2015-10-01 00:15:54.020401] I [MSGID: 101190]
> > [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> > with index 1
> > [2015-10-01 00:15:54.086777] I [graph.c:269:gf_add_cmdline_options]
> > 0-gv0-replicate-0: adding option 'node-uuid' for volume 'gv0-replicate-0'
> > with value '416d712a-06fc-4b3c-a92f-8c82145626ff'
> > [2015-10-01 00:15:54.093004] I [MSGID: 101190]
> > [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> > with index 2
> > [2015-10-01 00:15:54.098144] I [MSGID: 114020] [client.c:2118:notify]
> > 0-gv0-client-0: parent translators are ready, attempting connect on
> > transport
> > [2015-10-01 00:15:54.107432] I [MSGID: 114020] [client.c:2118:notify]
> > 0-gv0-client-1: parent translators are ready, attempting connect on
> > transport
> > [2015-10-01 00:15:54.115962] I [MSGID: 114020] [client.c:2118:notify]
> > 0-gv0-client-2: parent translators are ready, attempting connect on
> > transport
> > [2015-10-01 00:15:54.120474] E [socket.c:2278:socket_connect_finish]
> > 0-gv0-client-1: connection to 160.10.31.64:24007 failed (Connection
> > refused)
> > [2015-10-01 00:15:54.120639] I [rpc-clnt.c:1851:rpc_clnt_reconfig]
> > 0-gv0-client-0: changing port to 49152 (from 0)
> > Final graph:
> > +------------------------------------------------------------------------------+
> > 1: volume gv0-client-0
> > 2: type protocol/client
> > 3: option ping-timeout 42
> > 4: option remote-host eapps-gluster01.uwg.westga.edu
> > 5: option remote-subvolume /export/sdb1/gv0
> > 6: option transport-type socket
> > 7: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> > 8: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> > 9: end-volume
> > 10:
> > 11: volume gv0-client-1
> > 12: type protocol/client
> > 13: option ping-timeout 42
> > 14: option remote-host eapps-gluster02.uwg.westga.edu
> > 15: option remote-subvolume /export/sdb1/gv0
> > 16: option transport-type socket
> > 17: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> > 18: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> > 19: end-volume
> > 20:
> > 21: volume gv0-client-2
> > 22: type protocol/client
> > 23: option ping-timeout 42
> > 24: option remote-host eapps-gluster03.uwg.westga.edu
> > 25: option remote-subvolume /export/sdb1/gv0
> > 26: option transport-type socket
> > 27: option username 0005f8fa-107a-4cc8-ac38-bb821c014c14
> > 28: option password 379bae9a-6529-4564-a6f5-f5a9f7424d01
> > 29: end-volume
> > 30:
> > 31: volume gv0-replicate-0
> > 32: type cluster/replicate
> > 33: option node-uuid 416d712a-06fc-4b3c-a92f-8c82145626ff
> > 40: subvolumes gv0-client-0 gv0-client-1 gv0-client-2
> > 41: end-volume
> > 42:
> > 43: volume glustershd
> > 44: type debug/io-stats
> > 45: subvolumes gv0-replicate-0
> > 46: end-volume
> > 47:
> > +------------------------------------------------------------------------------+
> > [2015-10-01 00:15:54.135650] I [MSGID: 114057]
> > [client-handshake.c:1437:select_server_supported_programs] 0-gv0-client-0:
> > Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2015-10-01 00:15:54.136223] I [MSGID: 114046]
> > [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-0: Connected to
> > gv0-client-0, attached to remote volume '/export/sdb1/gv0'.
> > [2015-10-01 00:15:54.136262] I [MSGID: 114047]
> > [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-0: Server and
> > Client lk-version numbers are not same, reopening the fds
> > [2015-10-01 00:15:54.136410] I [MSGID: 108005]
> > [afr-common.c:3998:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0'
> > came back up; going online.
> > [2015-10-01 00:15:54.136500] I [MSGID: 114035]
> > [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-0: Server
> > lk version = 1
> > [2015-10-01 00:15:54.401702] E [MSGID: 114058]
> > [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-2: failed
> > to get the port number for remote subvolume. Please run 'gluster volume
> > status' on server to see if brick process is running.
> > [2015-10-01 00:15:54.401834] I [MSGID: 114018]
> > [client.c:2042:client_rpc_notify] 0-gv0-client-2: disconnected from
> > gv0-client-2. Client process will keep trying to connect to glusterd until
> > brick's port is available
> > [2015-10-01 00:15:54.401878] W [MSGID: 108001]
> > [afr-common.c:4081:afr_notify] 0-gv0-replicate-0: Client-quorum is not met
> > [2015-10-01 03:57:52.755426] E [socket.c:2278:socket_connect_finish]
> > 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> > refused)
> > [2015-10-01 13:50:49.000708] E [socket.c:2278:socket_connect_finish]
> > 0-gv0-client-2: connection to 160.10.31.227:24007 failed (Connection
> > timed out)
> > [2015-10-01 14:36:40.481673] E [MSGID: 114058]
> > [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-1: failed
> > to get the port number for remote subvolume. Please run 'gluster volume
> > status' on server to see if brick process is running.
> > [2015-10-01 14:36:40.481833] I [MSGID: 114018]
> > [client.c:2042:client_rpc_notify] 0-gv0-client-1: disconnected from
> > gv0-client-1. Client process will keep trying to connect to glusterd until
> > brick's port is available
> > [2015-10-01 14:36:41.982037] I [rpc-clnt.c:1851:rpc_clnt_reconfig]
> > 0-gv0-client-1: changing port to 49152 (from 0)
> > [2015-10-01 14:36:41.993478] I [MSGID: 114057]
> > [client-handshake.c:1437:select_server_supported_programs] 0-gv0-client-1:
> > Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2015-10-01 14:36:41.994568] I [MSGID: 114046]
> > [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-1: Connected to
> > gv0-client-1, attached to remote volume '/export/sdb1/gv0'.
> > [2015-10-01 14:36:41.994647] I [MSGID: 114047]
> > [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-1: Server and
> > Client lk-version numbers are not same, reopening the fds
> > [2015-10-01 14:36:41.994899] I [MSGID: 108002]
> > [afr-common.c:4077:afr_notify] 0-gv0-replicate-0: Client-quorum is met
> > [2015-10-01 14:36:42.002275] I [MSGID: 114035]
> > [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-1: Server
> > lk version = 1
> >
> >
> >
> >
> > Thanks,
> > Gene Liverman
> > Systems Integration Architect
> > Information Technology Services
> > University of West Georgia
> > gliverma at westga.edu
> >
> > ITS: Making Technology Work for You!
> >
> >
> >
> > On Wed, Sep 30, 2015 at 10:54 PM, Gaurav Garg < ggarg at redhat.com > wrote:
> >
> >
> > Hi Gene,
> >
> > Could you paste or attach the core file, the glusterd log file, and the cmd
> > history so we can find the actual RCA of the crash? What steps did you
> > perform before this crash?
> >
> > >> How can I troubleshoot this?
> >
> > If you want to troubleshoot this yourself, you can look into the glusterd
> > log file and the core file.
> >
> > Thank you..
> >
> > Regards,
> > Gaurav
> >
> > ----- Original Message -----
> > From: "Gene Liverman" < gliverma at westga.edu >
> > To: gluster-users at gluster.org
> > Sent: Thursday, October 1, 2015 7:59:47 AM
> > Subject: [Gluster-users] glusterd crashing
> >
> > In the last few days I've started having issues with my glusterd service
> > crashing. When it goes down it seems to do so on all nodes in my replicated
> > volume. How can I troubleshoot this? I'm on a mix of CentOS 6 and RHEL 6.
> > Thanks!
> >
> >
> >
> > Gene Liverman
> > Systems Integration Architect
> > Information Technology Services
> > University of West Georgia
> > gliverma at westga.edu
> >
> >
> > Sent from Outlook on my iPhone
> >
> >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
