thr3ads.net - Gluster users - [Gluster-users] Gluster-users Digest, Vol 49, Issue 25 -- Disk utilization [May 2012]


Ben England

2012-May-21 13:22 UTC

[Gluster-users] Gluster-users Digest, Vol 49, Issue 25 -- Disk utilization

Peter,

see comments marked with ben> below, hope this helps.

Message: 1
Date: Tue, 15 May 2012 22:12:10 +0200
From: Peter Frey <pfrey09 at googlemail.com>
Subject: [Gluster-users] Disk utilisation
To: gluster-users at gluster.org
Message-ID:
	<CAFWmEw==E990t-DYa_DRB37w3dDrkNLJJ=qFGJt3-bptmtGamQ at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

we are using Gluster to make http file downloads available. We currently
have 2 gluster servers serving a replicated volume. Each gluster server has
22 disks in a hardware raid, the underlying file system is XFS. The average
file size is around 3-4 MB. Around 16 TB of data is stored on the
volume.

ben> Linux distro version and Gluster version would be helpful.  What is the
RAID stripe element size?  If you have a 64-KB stripe element size, then EVERY
disk will be made busy by reading a single 4-MB file, so striping will not help
you much at that file size.  ~130 mbit/s = ~16 MB/s, and most disks can read at
> 50 MB/s, so your total system throughput is far less than the throughput of a
single disk drive.  So why use striping?  Wouldn't it be better to be able to
serve many files in parallel from your disks?  You may want to increase
readahead if the application tends to read the entire file sequentially.  Try
increasing it way up; the Linux default of 128 KB is not good for Gluster.
Lastly, try the deadline I/O scheduler on your data disks; CFQ can't help with
a Gluster server.
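
Before changing readahead or the I/O scheduler, it helps to see what a data disk is currently set to. The following is a minimal Python sketch, not a tested tuning tool. It assumes the usual Linux sysfs layout (/sys/block/<device>/queue/read_ahead_kb and /sys/block/<device>/queue/scheduler). The device name "sdb" and the function names are made up for illustration, and newer kernels may list "mq-deadline" rather than "deadline".

```python
from pathlib import Path


def read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the device's current readahead setting in KB."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    return int(path.read_text().strip())


def schedulers(device, sysfs_root="/sys/block"):
    """Return (active, available) I/O schedulers for the device.

    The kernel lists all schedulers on one line and puts the active one
    in square brackets, for example: noop deadline [cfq]
    """
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    active = None
    available = []
    for name in path.read_text().split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def report(device, sysfs_root="/sys/block"):
    """Build a one-line summary of the two settings discussed above."""
    active, available = schedulers(device, sysfs_root)
    return "%s: read_ahead_kb=%d scheduler=%s (available: %s)" % (
        device,
        read_ahead_kb(device, sysfs_root),
        active,
        ", ".join(available),
    )
```

For example, print(report("sdb")) would show the settings for a hypothetical data disk sdb. The settings themselves are changed by writing a new value to the same two files as root. What readahead value works best is something to measure on your own workload.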

Once we start sending live http traffic towards the infrastructure we see
horrible performance. For instance, if the outgoing bandwidth on each of the
gluster servers is at ~130mbit/s, our hardware raid has a busy rate of ~30%.
Once we increase the traffic towards 250mbit/s, the busy rate doubles to
60%. With this the iowait values also increase.
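
To put the bandwidth figures above and ben's striping remark in the same units, here is a small worked calculation in Python. It is only a sketch under stated assumptions: the 64-KB stripe element size is ben's hypothetical (the actual value is not given in the thread), the file is assumed to be laid out contiguously from the start of a stripe element, and all 22 disks are assumed to hold data (parity or spare disks would lower that number).

```python
import math


def mbit_to_mbyte_per_s(mbit_per_s):
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / 8


def stripe_elements(file_bytes, element_bytes):
    """Number of stripe elements a contiguous, aligned file occupies."""
    return math.ceil(file_bytes / element_bytes)


def disks_touched(file_bytes, element_bytes, data_disks):
    """Number of disks that must be read to return the whole file."""
    return min(stripe_elements(file_bytes, element_bytes), data_disks)


FILE_BYTES = 4 * 1024 * 1024   # 4-MB file, as in the thread
ELEMENT_BYTES = 64 * 1024      # assumption: 64-KB stripe element
DATA_DISKS = 22                # assumption: all 22 disks hold data

print(mbit_to_mbyte_per_s(130))                              # 16.25
print(mbit_to_mbyte_per_s(250))                              # 31.25
print(stripe_elements(FILE_BYTES, ELEMENT_BYTES))            # 64
print(disks_touched(FILE_BYTES, ELEMENT_BYTES, DATA_DISKS))  # 22
```

Under these assumptions a single 4-MB read spans 64 stripe elements, so it touches every disk in the array, while the server delivers roughly 16 to 31 MB/s in total.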

We started to play with the read buffers on the http servers. There is no
difference between loading the whole file into memory at once and loading
the file in 64k chunks. This makes me believe that the gluster server loads
the file with its own buffers and the client's buffer has no influence. We
have also enabled profiling on the gluster volume: There are roughly 18
read() calls for each open() call, which should be an indication that the
buffers are too small.

ben> Gluster avoids read caching on the client side.  You can give Gluster
servers more memory so that XFS can cache more files if this leads to more cache
hits.  If you really need aggressive client-side caching, you can NFS mount the
gluster server.  If your app is HTTP-based and is RESTful, then there are web
caching servers that can intercept requests before they reach your application.
18 read calls/open is not a terrible ratio.  In my experience, if network tuning
is correct and read files are cached (or prefetched) on the server, Gluster
reads at network speed (which is why disk read-ahead is important).  How much
traffic can your network transmit?  Have you tested the network by itself (i.e.
without using Gluster to test it)?
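
As a rough check on the 18 reads/open figure, the average size of each read can be estimated in a few lines of Python. This is a back-of-the-envelope sketch that assumes every open() is followed by reading the whole file exactly once, which the profile output in this thread does not confirm.

```python
def avg_read_kib(file_mib, reads_per_open):
    """Average KiB returned per read() if the whole file is read once."""
    return file_mib * 1024 / reads_per_open


# 3-4 MB average file size and 18 read() calls per open(), as reported
for size_mib in (3, 4):
    print(size_mib, round(avg_read_kib(size_mib, 18)))
# prints:
# 3 171
# 4 228
```

Under that assumption each read returns on the order of 170 to 230 KB.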

We have also made the mistake of storing all files in a single directory, but
XFS advertises that it can handle millions of files in a single directory,
so it shouldn't be a problem, or should it?

ben> Never put millions of files in a single directory if you can help it.
Many file systems do not do well with this many files per directory.  But even
if the filesystem is perfect at it, applications that attempt to display
directory contents (other than "find") tend to lock up because they read the
entire directory, read all inodes in the directory, sort the entries, and only
then display them.  Classic example: the "ls" command.
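
A program that has to walk such a directory can avoid the read-everything, stat-everything, sort-everything pattern by streaming the entries instead. The sketch below uses Python's os.scandir, which yields entries one at a time in whatever order the filesystem returns them. The function names are made up for illustration, and this has not been measured against a directory of the size discussed here.

```python
import os
from itertools import islice


def count_entries(path):
    """Count directory entries without sorting them or stat-ing each one."""
    count = 0
    with os.scandir(path) as entries:
        for _ in entries:
            count += 1
    return count


def first_names(path, limit=10):
    """Return up to `limit` entry names, in directory order, then stop."""
    with os.scandir(path) as entries:
        return [entry.name for entry in islice(entries, limit)]
```

Neither function builds a list of the whole directory in memory, and first_names() stops reading as soon as it has enough entries.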

ben> Recent XFS versions (such as the version in RHEL 6.2) handle metadata far
better than before (e.g. RHEL 6.1), so you may want to make sure you're using
the right one.
