Hi all,

I have a problem pinpointing an error that makes processes crash for the users of my system. The thing that has changed since the crashes started is that I added a Gluster cluster, and of course the users immediately started hammering it.

I started looking at logs, beginning on the client side, and I just need help understanding how to read them the right way. I can see that every ten minutes the client changes port and attaches to the remote volume. About five minutes later the client unmounts the volume. I guess that this is the "old" mount and that the "new" mount is already responding to user interaction?

As this repeats every ten minutes I see it as normal behavior and just want to get a better understanding of how the client interacts with the cluster.

Have you experienced that this switch malfunctions and the mount becomes unreachable for a while?

Many thanks in advance!

Best regards
Marcus Pedersén

An example of the output:

[2017-11-09 10:10:39.776403] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-interbull-interbull-client-1: changing port to 49160 (from 0)
[2017-11-09 10:10:39.776830] I [MSGID: 114057] [client-handshake.c:1451:select_server_supported_programs] 0-interbull-interbull-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-11-09 10:10:39.777642] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-interbull-interbull-client-0: Connected to interbull-interbull-client-0, attached to remote volume '/interbullfs/interbull'.
[2017-11-09 10:10:39.777663] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-interbull-interbull-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2017-11-09 10:10:39.777724] I [MSGID: 108005] [afr-common.c:4756:afr_notify] 0-interbull-interbull-replicate-0: Subvolume 'interbull-interbull-client-0' came back up; going online.
[2017-11-09 10:10:39.777954] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-interbull-interbull-client-0: Server lk version = 1
[2017-11-09 10:10:39.779909] I [MSGID: 114057] [client-handshake.c:1451:select_server_supported_programs] 0-interbull-interbull-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-11-09 10:10:39.780481] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-interbull-interbull-client-1: Connected to interbull-interbull-client-1, attached to remote volume '/interbullfs/interbull'.
[2017-11-09 10:10:39.780509] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-interbull-interbull-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2017-11-09 10:10:39.781544] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-interbull-interbull-client-1: Server lk version = 1
[2017-11-09 10:10:39.781608] I [fuse-bridge.c:4146:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.26
[2017-11-09 10:10:39.781632] I [fuse-bridge.c:4831:fuse_graph_sync] 0-fuse: switched to graph 0
[2017-11-09 10:16:10.609922] I [fuse-bridge.c:5089:fuse_thread_proc] 0-fuse: unmounting /interbull
[2017-11-09 10:16:10.610258] W [glusterfsd.c:1329:cleanup_and_exit] (-->/usr/lib/libpthread.so.0(+0x72e7) [0x7f98c02282e7] -->/usr/bin/glusterfs(glusterfs_sigwaiter+0xdd) [0x40890d] -->/usr/bin/glusterfs(cleanup_and_exit+0x4b) [0x40878b] ) 0-: received signum (15), shutting down
[2017-11-09 10:16:10.610290] I [fuse-bridge.c:5802:fini] 0-fuse: Unmounting '/interbull'.
[2017-11-09 10:20:39.752079] I [MSGID: 100030] [glusterfsd.c:2460:main] 0-/usr/bin/glusterfs: Started running /usr/bin/glusterfs version 3.10.1 (args: /usr/bin/glusterfs --negative-timeout=60 --volfile-server=192.168.67.31 --volfile-id=/interbull-interbull /interbull)
[2017-11-09 10:20:39.763902] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-11-09 10:20:39.768738] I [afr.c:94:fix_quorum_options] 0-interbull-interbull-replicate-0: reindeer: incoming qtype = none
[2017-11-09 10:20:39.768756] I [afr.c:116:fix_quorum_options] 0-interbull-interbull-replicate-0: reindeer: quorum_count = 0
[2017-11-09 10:20:39.768856] W [MSGID: 108040] [afr.c:315:afr_pending_xattrs_init] 0-interbull-interbull-replicate-0: Unable to fetch afr-pending-xattr option from volfile. Falling back to using client translator names.
[2017-11-09 10:20:39.769832] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2017-11-09 10:20:39.770193] I [MSGID: 114020] [client.c:2352:notify] 0-interbull-interbull-client-0: parent translators are ready, attempting connect on transport
[2017-11-09 10:20:39.773109] I [MSGID: 114020] [client.c:2352:notify] 0-interbull-interbull-client-1: parent translators are ready, attempting connect on transport
[2017-11-09 10:20:39.773712] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-interbull-interbull-client-0: changing port to 49177 (from 0)

--
**************************************************
* Marcus Pedersén *
* System administrator *
**************************************************
* Interbull Centre *
* ================ *
* Department of Animal Breeding & Genetics - SLU *
* Box 7023, SE-750 07 *
* Uppsala, Sweden *
**************************************************
* Visiting address: *
* Room 55614, Ulls väg 26, Ultuna *
* Uppsala *
* Sweden *
* *
* Tel: +46-(0)18-67 1962 *
* *
**************************************************
* ISO 9001 Bureau Veritas No SE004561-1 *
**************************************************
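P.S. For reference, this is roughly how I pull the relevant events out of the client log (assuming the default log location, which is named after the mount point; adjust the path if yours differs):

    # show handshakes, graph switches and unmount events in order
    grep -E 'fuse_init|fuse_graph_sync|unmounting|received signum' /var/log/glusterfs/interbull.log

As far as I understand, signum (15) is SIGTERM, so the shutdown at 10:16 was requested from outside the glusterfs process (e.g. by an umount). The args logged at 10:20 should correspond to a mount command roughly like:

    mount -t glusterfs -o negative-timeout=60 192.168.67.31:/interbull-interbull /interbull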
Marcus,

Please paste the name-version-release of the primary glusterfs package on your system. If possible, also describe the typical workload that the user application generates at the mount.
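Something like this should do; package names differ per distribution, so treat the first command as a sketch:

    rpm -q glusterfs glusterfs-fuse    # on RPM-based systems
    glusterfs --version                # works regardless of packaging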
On Tue, Jan 23, 2018 at 7:43 PM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:
> [snip]

--
Milind
Hi,

Yes, of course... I should have included it from the start. And yes, I know it is an old version, but I will rebuild a new cluster later on; that is another story.

Client side:
Arch Linux
glusterfs 1:3.10.1-1

Server side:
Replicated cluster on two physical machines, both running:
CentOS 7, kernel 3.10.0-514.16.1.el7.x86_64
glusterfs 3.8.11 from centos-gluster38

Typical use case (the one we have problems with now): Our users run genomic evaluations, where loads of calculations are done and intermediate results are saved to files (MB-GB size, up to a hundred files). These are read back for the next calculation step, recalculated, written to file again, and so on, a couple of times. These processes usually run for about 8-12 hours, and some run for up to about 72-96 hours. For this run we had 12 clients (all connected to Gluster, with all file reads and writes done against Gluster). On each client we had assigned 3 cores to run the processes, and most of the time all 3 cores were in use on all 12 clients.

Regards
Marcus
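P.S. If the volume layout is relevant I can pull it from one of the servers as well; the volume name below is my best guess from the volfile-id in the log, so double-check it:

    gluster volume info interbull-interbull             # layout, options, brick list
    gluster volume status interbull-interbull clients   # connected clients per brick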
________________________________
From: Milind Changire <mchangir at redhat.com>
Sent: 23 January 2018 15:46
To: Marcus Pedersén
Cc: Gluster Users
Subject: Re: [Gluster-users] Understanding client logs

[snip]