I upgraded my systems to 3.6.3, and some of my clients are now having issues connecting. I can mount using NFS without any issues. However, when I try to FUSE mount, it times out on many of my nodes. The mount succeeded on approximately 400 nodes, but the remainder timed out. Any suggestions for how to fix this?

On the client side, I am getting the following in the logs:

[2015-05-05 00:17:18.013319] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.3 (args: /usr/sbin/glusterfs --volfile-server=gfsib01a.corvidtec.com --volfile-server-transport=tcp --volfile-id=/homegfs.tcp /homegfs_test)
[2015-05-05 00:18:21.019012] E [socket.c:2276:socket_connect_finish] 0-glusterfs: connection to 10.1.70.1:24007 failed (Connection timed out)
[2015-05-05 00:18:21.019092] E [glusterfsd-mgmt.c:1811:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gfsib01a.corvidtec.com (Transport endpoint is not connected)
[2015-05-05 00:18:21.019100] I [glusterfsd-mgmt.c:1817:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2015-05-05 00:18:21.019224] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (1), shutting down
[2015-05-05 00:18:21.019239] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/homegfs_test'.
[2015-05-05 00:18:21.027770] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down

Logs from my server are attached...

[root at gfs01a log]# gluster volume status homegfs
Status of volume: homegfs
Gluster process                                      Port   Online  Pid
------------------------------------------------------------------------------
Brick gfsib01a.corvidtec.com:/data/brick01a/homegfs  49152  Y       3816
Brick gfsib01b.corvidtec.com:/data/brick01b/homegfs  49152  Y       3826
Brick gfsib01a.corvidtec.com:/data/brick02a/homegfs  49153  Y       3821
Brick gfsib01b.corvidtec.com:/data/brick02b/homegfs  49153  Y       3831
Brick gfsib02a.corvidtec.com:/data/brick01a/homegfs  49152  Y       3959
Brick gfsib02b.corvidtec.com:/data/brick01b/homegfs  49152  Y       3970
Brick gfsib02a.corvidtec.com:/data/brick02a/homegfs  49153  Y       3964
Brick gfsib02b.corvidtec.com:/data/brick02b/homegfs  49153  Y       3975
NFS Server on localhost                              2049   Y       3830
Self-heal Daemon on localhost                        N/A    Y       3835
NFS Server on gfsib01b.corvidtec.com                 2049   Y       3840
Self-heal Daemon on gfsib01b.corvidtec.com           N/A    Y       3845
NFS Server on gfsib02b.corvidtec.com                 2049   Y       3984
Self-heal Daemon on gfsib02b.corvidtec.com           N/A    Y       3989
NFS Server on gfsib02a.corvidtec.com                 2049   Y       3973
Self-heal Daemon on gfsib02a.corvidtec.com           N/A    Y       3978

Task Status of Volume homegfs
------------------------------------------------------------------------------
Task   : Rebalance
ID     : 58b6cc76-c29c-4695-93fe-c42b1112e171
Status : completed

[root at gfs01a log]# gluster volume info homegfs

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
server.manage-gids: on
changelog.rollover-time: 15
changelog.fsync-interval: 3
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: off
storage.owner-gid: 100
network.ping-timeout: 10
server.allow-insecure: on
performance.write-behind-window-size: 128MB
performance.cache-size: 128MB
performance.io-thread-count: 32

David

=======================
David F. Robinson, Ph.D.
President - Corvid Technologies
145 Overhill Drive
Mooresville, NC 28117
704.799.6944 x101 [Office]
704.252.1310 [Cell]
704.799.7974 [Fax]
david.robinson at corvidtec.com
http://www.corvidtec.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: glusterfs.tgz
Type: application/x-compressed
Size: 1395639 bytes
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150505/7b8d5295/attachment-0001.bin>
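Worth noting from the client log above: the client did resolve gfsib01a.corvidtec.com (it logs the IP 10.1.70.1), and the TCP connection to glusterd's management port then timed out roughly a minute later, before any volume-level handshake. A quick way to reproduce that check from one of the failing clients, as a minimal sketch assuming getent and nc are installed:

    # Confirm the volfile server resolves to the same address (10.1.70.1) here too
    getent hosts gfsib01a.corvidtec.com

    # Can this client reach glusterd's management port at all?
    nc -zv -w 5 gfsib01a.corvidtec.com 24007

    # And a brick port (49152, per the volume status above)?
    nc -zv -w 5 gfsib01a.corvidtec.com 49152

If the 24007 probe hangs on exactly the clients whose FUSE mounts time out, the problem is reachability or name resolution rather than the volume configuration, which would also explain why NFS mounts through a different path still work.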
It looks like my issue was due to a change in the way name resolution is now handled in 3.6.3. I'll send in an explanation tomorrow in case anyone else is having a similar issue.

David

------ Original Message ------
From: "David Robinson" <drobinson at corvidtec.com>
To: "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster Devel" <gluster-devel at gluster.org>
Sent: 5/4/2015 8:23:28 PM
Subject: 3.6.3 + fuse mount
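On the name-resolution theory: since the client-side lookup of the volfile server evidently succeeded, one thing worth checking (this is an assumption on my part, not something confirmed above) is whether the gluster servers can resolve the failing clients in return, and whether DNS answers are consistent across the ~400 working and remaining failing nodes. A sketch to run on one of the servers, where node0401 and 10.1.70.99 are hypothetical placeholders for a failing client:

    # Can the server resolve a failing client both ways?
    getent hosts node0401.corvidtec.com   # forward: name -> IP
    getent hosts 10.1.70.99               # reverse: IP -> name

    # If entries are missing or inconsistent, pinning them in /etc/hosts
    # on the affected machines is a common stopgap while DNS is fixed:
    # echo "10.1.70.99 node0401.corvidtec.com node0401" >> /etc/hosts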