Diego Remolina
2015-Sep-09 23:42 UTC
[Gluster-users] Very slow roaming profiles on top of glusterfs
Hi,

I am running two glusterfs servers as replicas. I have a 3rd server which provides quorum. Since gluster was introduced, we have had an issue where Windows roaming profiles are extremely slow. The initial setup was done on 3.6.x, and since 3.7.x has small-file performance improvements, I upgraded to 3.7.3, but that has not helped.

It seems that for some reason gluster is very slow when dealing with lots of small files. I am not sure how to really troubleshoot this via Samba, but I have come up with other tests that produce rather disconcerting results, as shown below.

If I run directly on the brick:

[root@ysmha01 /]# time ( find /bricks/hdds/brick/home/jgibbs/.winprofile.V2 -type f > /dev/null )
real 0m3.683s
user 0m0.042s
sys 0m0.154s

Now running on the gluster volume mounted via FUSE:

[root@ysmha01 /]# mount | grep export
10.0.1.6:/export on /export type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)

[root@ysmha01 /]# time ( find /export/home/jgibbs/.winprofile.V2 -type f > /dev/null )
real 0m57.812s
user 0m0.118s
sys 0m0.374s

In general, the time to run the command on this particular user can be up to 2 minutes. If I run the command on the brick first, then the time to run it on the mounted gluster volume is lower, as in the example above; I assume some caching is preserved.

This particular user has 13,216 files in his roaming profile, which adds up to about 452MB of data.

The server performance over Samba for copying big files (both read and write) is great; I can almost max out the gigabit connections on the desktops.

Reading from the Samba share on the server and writing to the local drive: 111MB/s (copying a 650MB ISO file)
Reading from the local drive and writing to the server Samba share: 94MB/s (copying a 3.2GB ISO file)

The servers are connected to the network with 10Gbit adapters and use separate adapters for each role: one 10Gbit adapter is used for services and the other for the backend storage communication.

The servers have hardware RAID controllers, and the Samba shares are on top of an Areca ARC-1882 controller with a volume made out of 12 2TB drives in RAID 6.

If you can provide any steps to better troubleshoot this problem and fix the issue, I would really appreciate it.

Diego

Further details about the machines below:

[root@ysmha01 /]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

[root@ysmha01 /]# gluster volume info export
Volume Name: export
Type: Replicate
Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Brick2: 10.0.1.6:/bricks/hdds/brick
Options Reconfigured:
performance.io-cache: on
performance.io-thread-count: 64
nfs.disable: on
cluster.server-quorum-type: server
performance.cache-size: 1024MB
server.allow-insecure: on
cluster.server-quorum-ratio: 51%

Each server has dual Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz with 32GB of memory.
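In case it helps with the troubleshooting, one thing I can do on my side is capture per-operation statistics with gluster's built-in volume profiler while reproducing the slow traversal. A rough sketch of the commands I would run on our "export" volume (profiling adds some overhead, so I would only leave it on for the test):

    # Enable per-FOP latency and call-count statistics for the volume
    gluster volume profile export start

    # Reproduce the slow case on the FUSE mount
    time ( find /export/home/jgibbs/.winprofile.V2 -type f > /dev/null )

    # Dump the statistics; the LOOKUP/STAT/READDIRP counts and latencies
    # should show whether per-file round trips are where the time is going
    gluster volume profile export info

    # Disable profiling again afterwards
    gluster volume profile export stop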
Diego Remolina
2015-Sep-14 12:21 UTC
[Gluster-users] Very slow roaming profiles on top of glusterfs
Bump... Does anybody have any clues as to how I can try to identify the cause of the slowness?

Diego

On Wed, Sep 9, 2015 at 7:42 PM, Diego Remolina <dijuremo@gmail.com> wrote:
> [original message quoted above]
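P.S. To break down where the time goes on the client side, another test I could run is a syscall summary of the traversal against the FUSE mount (a rough sketch; it assumes strace is installed on the CentOS 7 servers):

    # Summarize time spent per syscall (mostly directory reads and stat calls)
    # during the slow find on the FUSE mount
    strace -c -f -o /tmp/find-fuse-summary.txt find /export/home/jgibbs/.winprofile.V2 -type f > /dev/null
    cat /tmp/find-fuse-summary.txt

If most of the wall-clock time shows up in those calls, that would at least confirm the per-file network round trips are the bottleneck rather than anything local.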