Markus Fröhlich
2011-Jan-27  16:03 UTC
[Gluster-users] sometimes connection errors - glusterfs limit?
Hi!
Problem description:
One to three times a day glusterfs seems to hang, as if frozen: for a few
seconds network traffic drops to zero because there is no glusterfs
communication.
The server logs show that clients got disconnected and reconnected a little
later, but the network traffic itself looks fine and is not at its limit, and
there are no errors on the network interfaces.
We also checked the cables and the switch ports.
We think that glusterfs in combination with the I/O load is the bottleneck here.
On average, the storage setup handles about 2/3 read and 1/3 write
operations.
Is there a limit on the number of volumes, clients, mounts etc. in the glusterfs code?
Has anyone had similar experiences or trouble with such a setup?
Is it possible that too many clients are connected to too few servers?
Does anyone have a tip for us?
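To check whether the disconnects on both servers happen at the same moments, the log entries can be grouped by minute and compared with the I/O load at those times. The following is only a minimal sketch: it assumes each log line starts with a "[YYYY-MM-DD HH:MM:SS]" timestamp and that disconnect messages contain the word "disconnect". The function name and the sample lines are made up for illustration and are not real glusterfs output.

```python
import re
from collections import Counter

# Assumed line prefix "[YYYY-MM-DD HH:MM:SS]"; adjust the pattern if your logs differ.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}")


def disconnects_per_minute(lines, keyword="disconnect"):
    """Count log lines containing `keyword` (lowercase), grouped by minute."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if match and keyword in line.lower():
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    "[2011-01-27 09:14:02] N client 10.0.1.21 disconnected",
    "[2011-01-27 09:14:05] N client 10.0.1.22 disconnected",
    "[2011-01-27 09:14:31] N client 10.0.1.21 connected",
    "[2011-01-27 13:40:10] N client 10.0.1.30 disconnected",
]

for minute, count in sorted(disconnects_per_minute(sample).items()):
    print(minute, count)
```

Running the same function over the logs of both servers and comparing the minutes with many disconnects may show whether the hangs hit all clients at once or only some of them.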
-------------------------------------
Our setup:
Two glusterfs servers:
  * areca raid controller with raid5 setup
  * 3 LUNs of 11 TB each, 70%-99% in use, formatted with ext3
  * SLES11 x86_64
  * glusterfs V 3.0.7
45 - 50 glusterfs client servers:
  * SLES10, SLES11, SLES11 SP1
  * glusterfs V 3.0.7
  * all volumes are replicated client-side to both glusterfs servers
-----------------------------------
volume files:
server export vols look like this - some options differ:
--
volume posix
   type storage/posix
   option directory /gluster-storage/projekte/ksc/
   option background-unlink yes
end-volume
volume locks
   type features/locks
   subvolumes posix
end-volume
volume ksc
   type performance/io-threads
   option thread-count 16
   subvolumes locks
end-volume
volume server
   type protocol/server
   option transport-type tcp
   option transport.socket.listen-port 7025
   option auth.addr.ksc.allow 10.0.1.*
   subvolumes ksc
end-volume
--
volume posix
   type storage/posix
   option directory /gluster-storage/projekte/hosting/
   option o-direct enable
   option background-unlink yes
end-volume
volume locks
   type features/locks
   subvolumes posix
end-volume
volume hosting2
   type performance/io-threads
   option thread-count 16
   subvolumes locks
end-volume
volume server
   type protocol/server
   option transport-type tcp
   option transport.socket.listen-port 7005
   option auth.addr.hosting2.allow 10.0.1.*
   subvolumes hosting2
end-volume
--
client repl. mount VOL files:
volume vgfs-01-001-ksc
   type protocol/client
   option transport-type tcp
   option remote-host vgfs-01-001
   option remote-port 7025
   option ping-timeout 10
   option remote-subvolume ksc
end-volume
# distribute
volume distribute1-ksc
   type cluster/distribute
   option lookup-unhashed auto
    option min-free-disk 5%
   subvolumes vgfs-01-001-ksc
end-volume
volume vgfs-01-002-ksc
   type protocol/client
   option transport-type tcp
   option remote-host vgfs-01-002
   option remote-port 7025
   option ping-timeout 10
   option remote-subvolume ksc
end-volume
# distribute
volume distribute2-ksc
   type cluster/distribute
   option lookup-unhashed auto
    option min-free-disk 5%
   subvolumes vgfs-01-002-ksc
end-volume
volume ksc-data-replicate
     type cluster/replicate
     subvolumes distribute1-ksc distribute2-ksc
end-volume
volume iocache
   type performance/io-cache
   option cache-size 64MB  #1GB supported
   option cache-timeout 1
   subvolumes ksc-data-replicate
end-volume
volume quick-read
   type performance/quick-read
#  option cache-timeout 10 (1 second)
#  option max-file-size 1048576 (64Kb)
   subvolumes iocache
end-volume
volume trace
   type debug/trace
   subvolumes quick-read
#  option include open,close,create,readdir,opendir,closedir
#  option exclude lookup,read,write
end-volume
--
volume vgfs-01-001-hosting
   type protocol/client
   option transport-type tcp
   option remote-host vgfs-01-001
   option remote-port 7005
   option ping-timeout 20
   option remote-subvolume hosting
end-volume
volume vgfs-01-002-hosting
   type protocol/client
   option transport-type tcp
   option remote-host vgfs-01-002
   option remote-port 7005
   option ping-timeout 20
   option remote-subvolume hosting
end-volume
# distribute
volume distribute1-hosting
   type cluster/distribute
   option lookup-unhashed yes
    option min-free-disk 5%
   subvolumes vgfs-01-001-hosting
end-volume
# distribute
volume distribute2-hosting
   type cluster/distribute
   option lookup-unhashed yes
    option min-free-disk 5%
   subvolumes vgfs-01-002-hosting
end-volume
volume backup-data-replicate
     type cluster/replicate
     subvolumes distribute1-hosting distribute2-hosting
     subvolumes distribute2-hosting
end-volume
volume readahead
   type performance/read-ahead
   option page-count 16              # cache per file  = (page-count x page-size)
   subvolumes backup-data-replicate
end-volume
volume iocache
   type performance/io-cache
   option cache-size 1024MB  #1GB supported
   option cache-timeout 1
   subvolumes readahead
end-volume
volume iothreads
   type performance/io-threads
   option thread-count 6  # default is 16
   subvolumes iocache
end-volume
volume quickread
     type performance/quick-read
     option cache-timeout 30
     option max-file-size 1024000
     subvolumes iothreads
end-volume
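The backup-data-replicate volume above contains two `subvolumes` lines. How glusterfs 3.0.7 handles a repeated line is not stated in this thread, so it may be worth checking the volfiles for such repeats before looking further. A minimal sketch of such a check, assuming the simple "volume ... end-volume" layout shown above; the function name `repeated_keys` is hypothetical and this is not a glusterfs tool:

```python
from collections import defaultdict


def repeated_keys(volfile_text):
    """Return {volume_name: [keys that appear more than once in that volume]}."""
    problems = {}
    current = None
    seen = defaultdict(int)
    for raw in volfile_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) > 1:
            current = words[1]
            seen = defaultdict(int)
        elif words[0] == "end-volume":
            repeats = [key for key, n in seen.items() if n > 1]
            if current is not None and repeats:
                problems[current] = repeats
            current = None
        elif current is not None:
            # Options are keyed by option name; "type" and "subvolumes" by keyword.
            if words[0] == "option" and len(words) > 1:
                key = words[1]
            else:
                key = words[0]
            seen[key] += 1
    return problems


sample = """
volume backup-data-replicate
     type cluster/replicate
     subvolumes distribute1-hosting distribute2-hosting
     subvolumes distribute2-hosting
end-volume
"""

print(repeated_keys(sample))
```

The check only reports repeated keys inside one volume; it does not validate option names or compare client and server volfiles.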
--
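The comment in the readahead volume gives the per-file cache as page-count x page-size. The volfile sets page-count to 16 but does not set a page size, so the value used below is purely an assumption for illustration, as is the number of open files. A small sketch of the arithmetic:

```python
def read_ahead_bytes_per_file(page_count, page_size_bytes):
    """Per-file read-ahead cache = page-count x page-size."""
    return page_count * page_size_bytes


# page-count 16 comes from the readahead volume above.
# The page size and the open-file count are hypothetical values, not taken from the setup.
ASSUMED_PAGE_SIZE = 128 * 1024
ASSUMED_OPEN_FILES = 500

per_file = read_ahead_bytes_per_file(16, ASSUMED_PAGE_SIZE)
total = per_file * ASSUMED_OPEN_FILES

print(per_file // 1024, "KiB per open file")
print(total // (1024 * 1024), "MiB for", ASSUMED_OPEN_FILES, "open files")
```

With the real page size and a realistic count of open files per client, the same calculation gives a rough upper bound on read-ahead memory per mount, on top of the configured io-cache size.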
regards
markus
-- 
Kind regards
Markus Fröhlich
Technician
________________________________________________________
Xidras GmbH
Stockern 47
3744 Stockern
Austria
Tel:     +43 (0) 2983 201 30503
Fax:     +43 (0) 2983 201 305039
Email:   markus.froehlich at xidras.com
Web:    http://www.xidras.com
FN 317036 f | Landesgericht Krems | ATU64485024
Burnash, James
2011-Jan-27  16:57 UTC
[Gluster-users] sometimes connection errors - GlusterFS limit?
Hello.
I have experienced this situation with the 3.0.4 release of Glusterfs - it was
related to a bug that had to do with recursive file deletions (in my case).
That bug has been fixed in 3.1.1 which is what I am currently running.
Can you give us your Glusterfs version, and a copy of your volume files for
server and client?
That would help us to help you.
Thanks,
James