glusterfs, the client, is a userspace application: it reads traffic off
the network, processes it, and - in the case of fuse - hands the result
to the kernel fuse driver.
So... data from the network -> (kernel) network driver -> (user)
glusterfs -> (kernel) fuse -> (user) application.
That's three context switches. To speed those up, use faster RAM and/or
CPU; or use a userspace network driver to eliminate one context switch;
or use libgfapi to eliminate two. The former requires special hardware
that has userspace drivers. The latter requires customizing your
software (unless it's in Java, in which case it's ridiculously easy).
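To illustrate the libgfapi route, here is a minimal sketch of a client
that writes a file directly to the volume, with no fuse mount involved.
The volume name and server are taken from the config quoted below;
everything else (the file name, the buffer) is illustrative, and real
code would need proper error handling:

```c
/* Sketch of a libgfapi client: it talks to the volume directly over
 * TCP, skipping the fuse kernel round trips on the client side.
 * Build with: gcc demo.c -lgfapi (requires glusterfs-api headers). */
#include <stdio.h>
#include <fcntl.h>
#include <string.h>
#include <glusterfs/api/glfs.h>

int main(void)
{
    glfs_t *fs = glfs_new("biocluster-home");
    if (!fs)
        return 1;

    /* Point the client at any server that can hand out the volfile. */
    glfs_set_volfile_server(fs, "tcp", "gl-0", 24007);
    if (glfs_init(fs) != 0) {
        glfs_fini(fs);
        return 1;
    }

    /* Writes go straight from this process to the bricks. */
    glfs_fd_t *fd = glfs_creat(fs, "/f0.ioz", O_WRONLY, 0644);
    if (fd) {
        const char buf[] = "hello from libgfapi\n";
        glfs_write(fd, buf, strlen(buf), 0);
        glfs_close(fd);
    }
    glfs_fini(fs);
    return 0;
}
```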
NFS lies to iozone, making it appear faster than it really is: some
requests never go out to the servers at all and are served from the
client's cache instead. If the data has changed on the server, the NFS
client won't (necessarily) know it.
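If you want the NFS numbers to reflect real round trips to the server,
you can take the client caches out of the picture. Both commands below
are sketches only: the mount options, export path, and mount point are
assumptions based on the setup quoted below:

```shell
# Disable NFS attribute caching on the client so metadata always goes
# back to the server (gnfs exports the volume as /<volname>):
mount -t nfs -o vers=3,noac gl-0:/biocluster-home /glusternfs

# Or have iozone bypass the client page cache with direct IO (-I):
./iozone -I -w -c -e -i 0 -+n -C -r 64k -s 1g -t 8 -F \
    /glusternfs/f{0,1,2,3,4,5,6,7,8}.ioz
```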
My opinion, as always: determine your SLA/OLA requirements and, rather
than building something and then working out why it doesn't satisfy
them, engineer something that does. Look at the system as a whole, not
necessarily each individual task. Optimize for the job, not the
individual step.
Finally, make sure your benchmark emulates your actual workload.
Prematurely optimizing to win a benchmark may actually hurt your
overall outcome.
On 04/28/2015 07:25 AM, Daniel Davidson wrote:
> Hello,
>
> I am preparing our first real production gluster service and I have
> run in to a performance issue with the fuse client.
>
> Here is our config:
>
> Volume Name: biocluster-home
> Type: Distribute
> Volume ID: a9352088-d9d9-4438-8a87-5bbaa0cded5e
> Status: Started
> Number of Bricks: 3
> Transport-type: tcp
> Bricks:
> Brick1: gl-0:/poola/brick
> Brick2: gl-0:/poolb/brick
> Brick3: gl-0:/poolc/brick
> Options Reconfigured:
> diagnostics.latency-measurement: on
> diagnostics.count-fop-hits: on
>
> The bricks are all ZFS volumes.
>
> When we transfer to the ZFS volumes on the storage node we get a great
> speed:
>
> ./iozone -w -c -e -i 0 -+n -C -r 64k -s 1g -t 8 -F
> /poola/f{0,1,2,3,4,5,6,7,8}.ioz |grep Children
> Children see throughput for 8 initial writers = 2200360.69
> kB/sec
>
> When we go to the fuse client on the gluster storage node, we get good
> speed:
>
> ./iozone -w -c -e -i 0 -+n -C -r 64k -s 1g -t 8 -F
> /glusterhome/f{0,1,2,3,4,5,6,7,8}.ioz |grep Children
> Children see throughput for 8 initial writers = 803187.06
> kB/sec
>
> When we go to the fuse client on another system, it tanks:
>
> ./iozone -w -c -e -i 0 -+n -C -r 64k -s 1g -t 8 -F
> /gluster/f{0,1,2,3,4,5,6,7,8}.ioz |grep Children
> Children see throughput for 8 initial writers = 114423.71
> kB/sec
>
> However the NFS client still does ok:
>
> ./iozone -w -c -e -i 0 -+n -C -r 64k -s 1g -t 8 -F
> /glusternfs/f{0,1,2,3,4,5,6,7,8}.ioz |grep Children
> Children see throughput for 8 initial writers = 952116.27
> kB/sec
>
> Iperf shows the network is ok:
>
> [ 3] 0.0-10.0 sec 11.5 GBytes 9.83 Gbits/sec
>
> Am I missing anything here? Is there anything I can do to improve the
> performance of the cluster fuse client?
>
> Dan
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users