You have run into a (if not "the") major weakness of GlusterFS: it is
not very good at dealing with lots of small files.
There are some tunables in Gluster that may help a bit, but you will
*not* get the same speeds as raw direct-attached storage without
clustering, or even come close to it. IIRC, this is because each file
has to be stat'ed on each of the bricks, and this adds latency.
SSDs will help some, but *not* dramatically, as the major slowdown is
in checking all the bricks.
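
To put rough numbers on the latency point, here is a back-of-envelope
sketch in Python. It is only an illustration: the round-trip count and
round-trip time are made-up assumptions, not measurements, and it
assumes creates are issued one after another per thread, which may not
match how smallfile was run.

```python
def per_file_latency_ms(files_per_second: float, threads: int = 1) -> float:
    """Average time each thread spends per file create, in milliseconds."""
    return 1000.0 * threads / files_per_second


def max_files_per_second(round_trips: int, rtt_ms: float, threads: int = 1) -> float:
    """Upper bound on create rate if every file costs `round_trips`
    sequential network round trips of `rtt_ms` each (assumed values)."""
    return threads * 1000.0 / (round_trips * rtt_ms)


if __name__ == "__main__":
    # 800 files/second is the figure reported in the original question.
    print(per_file_latency_ms(800))
    # Hypothetical: 5 sequential round trips at 0.25 ms each, one thread.
    print(max_files_per_second(round_trips=5, rtt_ms=0.25))
```

The point is that with latency-bound operations the disks barely matter:
a handful of sequential round trips per file is enough to cap a single
thread at a few hundred creates per second, SSD or not.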
What version of Gluster are you using?
If you really need great small file performance, I recommend you look
elsewhere (listed in order of performance):
1. DRBD: so far the best, by far, in terms of small file performance
in my tests. However, it is more complex to set up than Gluster and
MooseFS. DRBD did not support more than 2 servers until version 9, and
its management system has recently changed, so there may be a steep
learning curve.
2. MooseFS/LizardFS: I have been playing with MooseFS and find it much
better than GlusterFS for dealing with lots of small files. It is
nearly as easy to set up as Gluster, without the higher complexity of
DRBD. However, the stable 3.x release series does not have free HA
(i.e. automated failover with multiple masters). If you want
HA/failover, you have to purchase the "Pro" edition (no idea on
pricing). They had said that version 4.x would be released in fall
2018 and that its free edition would include the HA component, but it
has not yet been released, so unless you are willing to pay for 3 Pro,
you need to go elsewhere.
------------------------------
If you decide to stick with Gluster, you can try some of the small
file performance optimization changes. Performance will improve a bit,
but in my experience it will not be as good as DRBD or MooseFS:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/small_file_performance_enhancements
Some small file performance options available since Gluster 3.9:
https://stackoverflow.com/questions/42343391/how-can-i-improve-glusterfs-performance-with-small-files
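
As a hedged illustration of what applying such tunables looks like, the
Python sketch below builds `gluster volume set` commands for the
metadata-cache options I understand to be available since Gluster 3.9.
The volume name `myvol` is hypothetical and the values are only example
settings, so check each option against the documentation for your
version before applying anything. By default it only prints the
commands.

```python
import subprocess

# Example metadata-cache settings (assumed values; verify for your version).
MD_CACHE_OPTIONS = {
    "features.cache-invalidation": "on",
    "features.cache-invalidation-timeout": "600",
    "performance.stat-prefetch": "on",
    "performance.cache-invalidation": "on",
    "performance.md-cache-timeout": "600",
    "network.inode-lru-limit": "50000",
}


def build_commands(volume, options=None):
    """Return one `gluster volume set` argument list per option."""
    if options is None:
        options = MD_CACHE_OPTIONS
    return [["gluster", "volume", "set", volume, key, value]
            for key, value in options.items()]


def apply_options(volume, dry_run=True):
    """Print the commands, or run them on a Gluster node if dry_run is False."""
    for cmd in build_commands(volume):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply_options("myvol")  # hypothetical volume name; dry run only
```

Change one group of options at a time and rerun smallfile after each
change, so you can tell which setting (if any) actually helps.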
HTH,
Diego
On Tue, Dec 18, 2018 at 10:36 PM csirotic <csirotic at gmail.com> wrote:
>
> Hi,
> I am new to using gluster and I am running some tests right now. I am
> fairly inexperienced as well, so it's a good learning experience for me.
>
> My problem right now is the small file create iops, using smallfile. I
> cannot get more than 800 files/second 4k.
>
> My setup is fairly simple.
> I have 4 servers.
> The first 3 servers each have one brick, which is three-way replicated.
> Server 4 simply mounts the volume using the fuse native client.
>
> The first three servers all have the same hardware. They are common
> supermicro servers with a raid 6 array of 8 x 6tb hgst 7200 drives.
> If I test smallfile directly on the brick location, I get very high
> results.
>
> For the networking part of it, the 4 servers are using 10GBytes. Iperf3
> gives me a steady 10GBytes when I test between all the servers.
>
> When I transfer files from the client-server with the fuse mount, large
> .qcow files, I get around 150 MB/s. Which is not low, but is not great
> either.
>
> What would you look at first?
> The options that I am pondering are buying ssd drives to put cache on
> each server.
> Also, it seems to me that having only a 3 way replication, instead of a
> 2+2 setup, is really hurting.
> Any other tests that could help my process?
>
> Any input is much appreciated.
> Thank you.