I have been tweaking and researching for a while now and can't seem to get "good" performance out of Gluster.

I'm using Gluster to replace an NFS server (c1.xlarge) that serves files to an array of web servers, all in EC2. In my tests Gluster is significantly slower than NFS on average. I'm using a distributed replicated volume on two (m1.large) bricks:

    Volume Name: ebs
    Type: Replicate
    Status: Started
    Number of Bricks: 2
    Transport-type: tcp
    Bricks:
    Brick1: 10....:/mnt/ebs
    Brick2: 10....:/mnt/ebs
    Options Reconfigured:
    performance.io-thread-count: 64
    performance.cache-refresh-timeout: 60
    performance.cache-size: 6GB

The web servers are serving pages built with eZ Publish, a CMS that reads a lot of small files to build a page.

I'm benchmarking with siege. Here are some sample results I'm getting:

With NFS, siege request concurrency 5:

    Lifting the server siege... done.
    Transactions:              271 hits
    Availability:              100.00 %
    Elapsed time:              61.60 secs
    Data transferred:          3.11 MB
    Response time:             1.12 secs
    Transaction rate:          4.40 trans/sec
    Throughput:                0.05 MB/sec
    Concurrency:               4.95
    Successful transactions:   271
    Failed transactions:       0
    Longest transaction:       2.45
    Shortest transaction:      0.84

More NFS, concurrency 20:

    Lifting the server siege... done.
    Transactions:              857 hits
    Availability:              100.00 %
    Elapsed time:              85.56 secs
    Data transferred:          9.84 MB
    Response time:             1.97 secs
    Transaction rate:          10.02 trans/sec
    Throughput:                0.11 MB/sec
    Concurrency:               19.76
    Successful transactions:   857
    Failed transactions:       0
    Longest transaction:       6.53
    Shortest transaction:      1.15

And with Gluster, concurrency 5:

    Lifting the server siege... done.
    Transactions:              75 hits
    Availability:              100.00 %
    Elapsed time:              61.26 secs
    Data transferred:          0.63 MB
    Response time:             3.96 secs
    Transaction rate:          1.22 trans/sec
    Throughput:                0.01 MB/sec
    Concurrency:               4.85
    Successful transactions:   75
    Failed transactions:       0
    Longest transaction:       4.64
    Shortest transaction:      3.54

More Gluster, concurrency 20:

    Lifting the server siege... done.
    Transactions:              139 hits
    Availability:              100.00 %
    Elapsed time:              84.96 secs
    Data transferred:          1.16 MB
    Response time:             11.53 secs
    Transaction rate:          1.64 trans/sec
    Throughput:                0.01 MB/sec
    Concurrency:               18.86
    Successful transactions:   139
    Failed transactions:       0
    Longest transaction:       16.14
    Shortest transaction:      10.01

Any ideas how I can improve Gluster's performance?

Ryan Williams
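
To put a number on the gap in the siege runs above, the ratios can be worked out with a few lines of Python. This is only arithmetic on the figures quoted in the message; the `RESULTS` table and the `slowdown` helper are names made up for this illustration.

```python
# Figures copied from the siege runs above:
# (transaction rate in trans/sec, mean response time in secs).
RESULTS = {
    ("NFS", 5): (4.40, 1.12),
    ("NFS", 20): (10.02, 1.97),
    ("Gluster", 5): (1.22, 3.96),
    ("Gluster", 20): (1.64, 11.53),
}


def slowdown(concurrency):
    """Return (NFS rate / Gluster rate, Gluster response / NFS response)."""
    nfs_rate, nfs_resp = RESULTS[("NFS", concurrency)]
    glu_rate, glu_resp = RESULTS[("Gluster", concurrency)]
    return nfs_rate / glu_rate, glu_resp / nfs_resp


if __name__ == "__main__":
    for c in (5, 20):
        rate_ratio, resp_ratio = slowdown(c)
        print(f"concurrency {c}: NFS handles {rate_ratio:.1f}x the trans/sec, "
              f"Gluster responses take {resp_ratio:.1f}x as long")
```

The point of comparing both concurrency levels is that the gap widens under load: NFS more than doubles its transaction rate going from 5 to 20 concurrent requests, while Gluster barely moves.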
On 01/19/2011 04:24 PM, Ryan Williams wrote:
> I have been tweaking and researching for a while now and can't seem to
> get "good" performance out of Gluster.
>
> I'm using Gluster to replace an NFS server (c1.xlarge) that serves
> files to an array of web servers, all in EC2. In my tests Gluster is
> significantly slower than NFS on average. I'm using a distributed
> replicated volume on two (m1.large) bricks:

Hmmm ... we looked at similar setups (with non-virtualized hosts) recently, and found that for large-block sequential IO, gluster is faster (fewer context switches and less network stack to traverse). With smaller blocks there was a penalty of about 50-60% (basically context switching in the fuse layer).

To work around this, we suggested local caching (if possible) or RAMdisk caching. Use gluster for initial distribution of the files, and then copy them to local storage. Or turn up the client-side gluster caching so that after the initial read, the files come from local cache.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
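
The "distribute via gluster, then copy to local storage" idea could look roughly like the sketch below. It is an illustration under assumptions, not something from the thread: the `sync_to_local` helper is hypothetical, it only copies new or changed files (it never deletes), and in practice a tool such as rsync would do the same job.

```python
import os
import shutil


def sync_to_local(src_root, dst_root):
    """Copy files from src_root to dst_root when missing or changed.

    A file counts as unchanged when size and whole-second mtime match.
    Returns the sorted list of copied paths, relative to dst_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            src_stat = os.stat(src)
            if os.path.exists(dst):
                dst_stat = os.stat(dst)
                if (dst_stat.st_size == src_stat.st_size
                        and int(dst_stat.st_mtime) == int(src_stat.st_mtime)):
                    continue  # local copy is already current
            shutil.copy2(src, dst)  # copy2 also preserves the mtime
            copied.append(os.path.relpath(dst, dst_root))
    return sorted(copied)
```

Each web server would run something like `sync_to_local("/mnt/gluster", "/var/www/site")` (both paths are placeholders) after a deploy, and then serve pages from the local copy so that the many small-file reads never cross the fuse layer.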
On 19 Jan 2011, at 22:24, Ryan Williams wrote:
> The web servers are serving pages built with eZ Publish, a CMS that
> reads a lot of small files to build a page.

...and the majority of those files probably don't change much, making them perfect cache material. Since eZ Publish is PHP, I'd strongly suggest using an accelerator like APC if you're not already, as that will immediately reduce the majority of file requests to stat calls that check mod times.

You can turn the stat check off, and files will then be served straight from the cache without hitting the disk at all. In a clustered scenario, however, you need to be able to force re-reading on all nodes when you do change files. You can probably do that with a script that turns on stat checks, loads every PHP file you have (or at least all changed ones), then turns stat checks off again. This will help regardless of what underlying disk system you use.

Another approach is to keep static files (i.e. app code) on local storage, sync that manually across nodes, and only put dynamic stuff (e.g. user uploads) on the gluster volume. That's what I do.

Marcus
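
For the "at least all changed ones" part of such a refresh script, the list of PHP files touched since the last deploy can be gathered as in the sketch below. The `changed_php_files` helper and the idea of keeping a "last deploy" timestamp are assumptions made for this illustration; how the files are then reloaded into the accelerator depends on the setup and is not shown.

```python
import os


def changed_php_files(root, since):
    """Return sorted paths of .php files under root modified after `since`.

    `since` is a Unix timestamp in seconds, e.g. the time of the last deploy.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > since:
                changed.append(path)
    return sorted(changed)
```

Passing `since=0` returns every PHP file, which corresponds to the "loads every PHP file you have" variant.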