Hello

I'm running some tests to compare performance between a Gluster FUSE mount and formatted sparse files (located on the same Gluster FUSE mount).

The Gluster volume is EC (same for both tests).

I'm seeing a HUGE difference and trying to figure out why.

Here is an example:

GlusterFUSE mount:

# cd /mnt/glusterfs
# rm -f testfile1 ; dd if=/dev/zero of=testfile1 bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.74757 s, *110 MB/s*

Sparse file (located on GlusterFUSE mount):

# truncate -s 100GB /mnt/glusterfs/xfs-100G.img
# mkfs.xfs /mnt/glusterfs/xfs-100G.img
# mount -o loop /mnt/glusterfs/xfs-100G.img /mnt/xfs-100G
# cd /mnt/xfs-100G
# rm -f testfile1 ; dd if=/dev/zero of=testfile1 bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.20576 s, *891 MB/s*

The same goes for working with small files (e.g. code files, make, etc.) with the same data located on the FUSE mount vs. a formatted sparse file on the same FUSE mount.

What would explain such a difference?

How does Gluster work with sparse files in general? I may move some of the data on gluster volumes to formatted sparse files..

Thank you.
Hi Dmitri,

On 22/09/17 17:07, Dmitri Chebotarov wrote:
> Hello
>
> I'm running some tests to compare performance between Gluster FUSE mount
> and formatted sparse files (located on the same Gluster FUSE mount).
>
> The Gluster volume is EC (same for both tests).
>
> I'm seeing HUGE difference and trying to figure out why.

Could you explain what hardware configuration you are using?

Do you have a plain disk for each brick formatted in XFS, or do you have some RAID configuration?

> Here is an example:
>
> GlusterFUSE mount:
>
> # cd /mnt/glusterfs
> # rm -f testfile1 ; dd if=/dev/zero of=testfile1 bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 9.74757 s, *110 MB/s*
>
> Sparse file (located on GlusterFUSE mount):
>
> # truncate -s 100GB /mnt/glusterfs/xfs-100G.img
> # mkfs.xfs /mnt/glusterfs/xfs-100G.img
> # mount -o loop /mnt/glusterfs/xfs-100G.img /mnt/xfs-100G
> # cd /mnt/xfs-100G
> # rm -f testfile1 ; dd if=/dev/zero of=testfile1 bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 1.20576 s, *891 MB/s*
>
> The same goes for working with small files (i.e. code file, make, etc)
> with the same data located on FUSE mount vs formatted sparse file on the
> same FUSE mount.
>
> What would explain such difference?

First of all, doing tests with relatively small files tends to be misleading because of the caching capacity of the operating system (to minimize that, you can add the 'conv=fsync' option to dd). You should do tests with file sizes bigger than the amount of physical memory on the servers. This way you minimize cache effects and see the real sustained performance.

A second important point to note is that gluster is a distributed file system that can be accessed simultaneously by more than one client. This means that consistency must be assured in all cases, which makes data go to the bricks sooner than local filesystems normally do.
In your case, all data saved to the fuse volume will most probably be present on the bricks once the dd command completes. On the other hand, the test through the formatted sparse file is most probably keeping most of the data in the cache of the client machine.

Note that using the formatted sparse file allows better use of the local cache, improving (relatively) small file access, but on the other hand, this filesystem can only be used from a single client (single mount). If this client fails for some reason, you will lose access to your data.

> How does Gluster work with sparse files in general? I may move some of
> the data on gluster volumes to formatted sparse files..

Gluster works fine with sparse files. However, you should consider the previous points before choosing the formatted sparse files option. I guess that the sustained throughput will be very similar for bigger files.

Regards,

Xavi

> Thank you.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
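The suggestion above to add 'conv=fsync' and use larger files could be tried with something like the following. This is only a minimal sketch, assuming GNU dd on Linux; the function name bench_write, its two arguments and the ddtest.* file name are hypothetical and not part of the thread.

```shell
#!/bin/sh
# bench_write DIR SIZE_MIB  (hypothetical helper, for illustration only)
# Writes SIZE_MIB mebibytes of zeros into DIR and prints dd's summary line.
bench_write() {
    dir=$1
    size_mib=$2
    target="$dir/ddtest.$$"
    # bs=1M keeps dd's buffer small; count sets the total amount written.
    # conv=fsync makes dd call fsync() on the output file before it exits,
    # so the elapsed time should include flushing cached data.
    dd if=/dev/zero of="$target" bs=1M count="$size_mib" conv=fsync 2>&1 | tail -n 1
    # Remove the test file so repeated runs start from the same state.
    rm -f "$target"
}

# Example calls (commented out; pick a size larger than physical memory):
# bench_write /mnt/glusterfs 65536
# bench_write /mnt/xfs-100G 65536
```

Running both calls with the same size should give numbers that are less affected by the client page cache than the bs=1G count=1 runs quoted earlier, although the exact effect on the loop-mounted file is an assumption that would need to be measured.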
Hi Xavi

At this time I'm using 'plain' bricks with XFS. I'll be moving to LVM cached bricks. There is no RAID for data bricks, but I'll be using hardware RAID10 for SSD cache disks (I can use 'writeback' cache in this case).

'Small file performance' is the main reason I'm looking at different options, i.e. using formatted sparse files. I spent a considerable amount of time tuning 10GB/kernel/gluster to reduce latency - the small file performance improved ~50% but it's still not good enough, especially when I need to use Gluster for /home folders.

I understand the limitations and the single point of failure in the case of sparse files. I'm considering different options to provide HA (pacemaker/corosync, keepalived or using VMs - RHEV - to deliver storage).

Thank you for your reply.

On Tue, Sep 26, 2017 at 3:55 AM, Xavi Hernandez <jahernan at redhat.com> wrote:
> Hi Dmitri,
>
> On 22/09/17 17:07, Dmitri Chebotarov wrote:
>
>> Hello
>>
>> I'm running some tests to compare performance between Gluster FUSE mount
>> and formatted sparse files (located on the same Gluster FUSE mount).
>>
>> The Gluster volume is EC (same for both tests).
>>
>> I'm seeing HUGE difference and trying to figure out why.
>
> Could you explain what hardware configuration you are using?
>
> Do you have a plain disk for each brick formatted in XFS, or do you have
> some RAID configuration?