Ernie Dunbar
2015-May-07 18:36 UTC
[Gluster-users] Improving Gluster performance through more hardware.
Hi all.

First, I have a specific question about what hardware should be used for Gluster, then after that I have a question about how Gluster does its multithreading/hyperthreading.

So, we have a new Gluster cluster (currently two servers with one "replicated" volume) serving up our files for e-mail, which have for years been stored in Maildir format. That works pretty well, except for the few clients who store all their old mail on our server, so that their "cur" folder contains a few tens of thousands of messages. As others have noticed, this isn't something that Gluster handles well. But we value high availability and redundancy more than we value speed, and we don't yet have a large enough cluster to justify going with software that requires a metadata server, so we're going with Gluster. That doesn't mean we don't need better performance, though.

I've noticed that the resource Gluster consumes most in our use case isn't network or disk bandwidth - both stay *well* under full utilization - but CPU cycles. I can easily test this by running `ls -l` in a folder with ~20,000 files in it: CPU usage by glusterfsd jumps to between 40% and 200%, while the glusterfs process usually stays around 20-30%.

Both of our Gluster servers are generation III Dell 2950s with dual Xeon E5345s (quad-core, 2.33 GHz), so we have 8 cores total to handle this load. So far we're only using a single mail server, but we'll be migrating to a load-balanced pair very soon. My guess is that we can reduce the latency that's very noticeable in our webmail by upgrading to the fastest CPUs the 2950s can hold, evidently a 3.67 GHz quad-core. It would be nice to know what other users have experienced with this kind of upgrade, or whether they've gotten better performance from other hardware upgrades.

Which leads to my second question: does glusterfsd spawn multiple threads to handle concurrent requests? I don't see any evidence of this in `top`, but other clients don't notice at all that I'm driving up CPU usage with my one `ls` process - smaller mail accounts can read their mail just as quickly as if the system were near idle while this operation is in progress. It's also hard for me to test this with only one mail server attached to the Gluster cluster: I can't tell whether the additional load from 20 or 100 other servers would make any difference to CPU usage. We want to know what performance to expect should we expand that far, and whether the answer is throwing more CPUs at the problem, or just faster ones.
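(In case anyone wants to reproduce this, here's roughly how I'm watching it - a minimal sketch using pidstat from the sysstat package; the maildir path below is just a placeholder for a large directory on your own Gluster mount.)

On a brick server, sample glusterfsd CPU once per second:

    # pidstat -p $(pgrep -d, -x glusterfsd) 1

Meanwhile, on the client, time a listing of the big directory:

    # time ls -l /mnt/gluster/example-user/Maildir/cur > /dev/null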
Ben Turner
2015-May-07 23:06 UTC
[Gluster-users] Improving Gluster performance through more hardware.
----- Original Message -----
> From: "Ernie Dunbar" <maillist at lightspeed.ca>
> To: "Gluster Users" <gluster-users at gluster.org>
> Sent: Thursday, May 7, 2015 2:36:08 PM
> Subject: [Gluster-users] Improving Gluster performance through more hardware.
>
> Hi all.
>
> First, I have a specific question about what hardware should be used for
> Gluster, then after that I have a question about how Gluster does its
> multithreading/hyperthreading.
>
> So, we have a new Gluster cluster (currently two servers with one
> "replicated" volume) serving up our files for e-mail, which have for
> years been stored in Maildir format. That works pretty well, except for
> the few clients who store all their old mail on our server, so that
> their "cur" folder contains a few tens of thousands of messages. As
> others have noticed, this isn't something that Gluster handles well. But
> we value high availability and redundancy more than we value speed, and
> we don't yet have a large enough cluster to justify going with software
> that requires a metadata server, so we're going with Gluster. That
> doesn't mean we don't need better performance, though.
>
> I've noticed that the resource Gluster consumes most in our use case
> isn't network or disk bandwidth - both stay *well* under full
> utilization - but CPU cycles. I can easily test this by running `ls -l`
> in a folder with ~20,000 files in it: CPU usage by glusterfsd jumps to
> between 40% and 200%, while the glusterfs process usually stays around
> 20-30%.
>
> Both of our Gluster servers are generation III Dell 2950s with dual Xeon
> E5345s (quad-core, 2.33 GHz), so we have 8 cores total to handle this
> load. So far we're only using a single mail server, but we'll be
> migrating to a load-balanced pair very soon. My guess is that we can
> reduce the latency that's very noticeable in our webmail by upgrading to
> the fastest CPUs the 2950s can hold, evidently a 3.67 GHz quad-core. It
> would be nice to know what other users have experienced with this kind
> of upgrade, or whether they've gotten better performance from other
> hardware upgrades.
>
> Which leads to my second question: does glusterfsd spawn multiple
> threads to handle concurrent requests? I don't see any evidence of this
> in `top`, but other clients don't notice at all that I'm driving up CPU
> usage with my one `ls` process - smaller mail accounts can read their
> mail just as quickly as if the system were near idle while this
> operation is in progress. It's also hard for me to test this with only
> one mail server attached to the Gluster cluster: I can't tell whether
> the additional load from 20 or 100 other servers would make any
> difference to CPU usage. We want to know what performance to expect
> should we expand that far, and whether the answer is throwing more CPUs
> at the problem, or just faster ones.

A lot of what you are seeing is getting addressed with:

http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf

Specifically:

http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf#multi-thread-epoll

In the past the single event-listener thread would peg out a CPU (the "hot thread"), and until multi-threaded (MT) epoll, throwing more CPUs at the problem didn't help much:

"Previously, the epoll thread did socket event-handling, and the same thread was used for serving the client or processing the response received from the server. Due to this, other requests were queued until the current epoll thread completed its operation. With multi-threaded epoll, events are distributed, which improves performance due to the parallel processing of requests/responses."

Here are the guidelines for tuning them:

https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/Small_File_Performance_Enhancements.html

Server and client event threads are available in 3.7, and more improvements are in the pipe. I would start with 4 of each and do some tuning to see what fits your workload best.

I just ran a test where I created ~300 GB worth of 64k files. On the 3.7 beta I got 4917.50 files/second across 4 clients mounting a 2x2 dist-rep volume. The same test on 3.6 with the same hardware got 2069.28 files/second. I was running smallfile:

http://www.gluster.org/community/documentation/index.php/Performance_Testing#smallfile_Distributed_I.2FO_Benchmark

To confirm you are hitting the hot thread, I suggest running the benchmark of your choice (I like smallfile for this) and, on the brick servers, running:

# top -H

If you see one of the gluster threads at 100% CPU, then you are probably hitting the hot event thread issue that MT epoll addresses.
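For reference, turning the event threads up is just two volume options - a sketch assuming Gluster 3.7+ and a volume named "myvol" (substitute your own volume name):

    # gluster volume set myvol server.event-threads 4
    # gluster volume set myvol client.event-threads 4

You can confirm they took effect with `gluster volume info myvol`; they show up under "Options Reconfigured".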
Here is what my top -H list looks like during a smallfile run:

Tasks: 640 total,   3 running, 637 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.8%us, 11.1%sy,  0.0%ni, 74.3%id,  0.0%wa,  0.0%hi,  1.7%si,  0.0%st
Mem:  49544600k total,  5809544k used, 43735056k free,     6344k buffers
Swap: 24772604k total,        0k used, 24772604k free,  4380832k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3278 root      20   0 2005m  90m 4252 R 65.4  0.2   4:45.85 glusterfsd
 4155 root      20   0 2005m  90m 4252 S 64.0  0.2   4:32.96 glusterfsd
 4156 root      20   0 2005m  90m 4252 R 64.0  0.2   4:19.60 glusterfsd
 3277 root      20   0 2005m  90m 4252 S 63.7  0.2   4:45.19 glusterfsd
 4224 root      20   0 2005m  90m 4252 S 26.7  0.2   1:54.49 glusterfsd
 6106 root      20   0 2005m  90m 4252 S 26.4  0.2   0:46.62 glusterfsd
 4194 root      20   0 2005m  90m 4252 S 25.4  0.2   1:58.92 glusterfsd
 4222 root      20   0 2005m  90m 4252 S 25.4  0.2   1:53.72 glusterfsd
 4051 root      20   0 2005m  90m 4252 S 24.4  0.2   2:08.99 glusterfsd
 3647 root      20   0 2005m  90m 4252 S 24.1  0.2   2:07.82 glusterfsd
 3280 root      20   0 2005m  90m 4252 S 23.4  0.2   2:13.00 glusterfsd
 4223 root      20   0 2005m  90m 4252 S 23.1  0.2   1:53.21 glusterfsd
 4227 root      20   0 2005m  90m 4252 S 23.1  0.2   1:54.60 glusterfsd
 4226 root      20   0 2005m  90m 4252 S 22.4  0.2   1:54.64 glusterfsd
 6107 root      20   0 2005m  90m 4252 S 22.1  0.2   0:46.16 glusterfsd
 6108 root      20   0 2005m  90m 4252 S 22.1  0.2   0:46.07 glusterfsd
 4052 root      20   0 2005m  90m 4252 S 21.5  0.2   2:08.35 glusterfsd
 4053 root      20   0 2005m  90m 4252 S 21.1  0.2   2:08.40 glusterfsd
 4195 root      20   0 2005m  90m 4252 S 20.8  0.2   1:58.29 glusterfsd
 4225 root      20   0 2005m  90m 4252 S 20.5  0.2   1:53.36 glusterfsd
 3286 root      20   0 2005m  90m 4252 S  7.9  0.2   0:43.18 glusterfsd
 2817 root      20   0     0    0    0 S  1.3  0.0   0:02.18 xfslogd/1
 2757 root      20   0     0    0    0 S  0.7  0.0   0:42.58 dm-thin
 2937 root      20   0     0    0    0 S  0.7  0.0   0:01.80 xfs-cil/dm-6
10039 root      20   0 15536 1692  932 R  0.7  0.0   0:00.17 top
  155 root      20   0     0    0    0 S  0.3  0.0   0:00.27 kblockd/1
 3283 root      20   0 2005m  90m 4252 S  0.3  0.2   0:00.08 glusterfsd
    1 root      20   0 19356 1472 1152 S  0.0  0.0   0:03.08 init

HTH!

-b
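P.S. If you want to capture that same thread view non-interactively during a benchmark run, batch-mode top works - a rough sketch (field 9 is %CPU in this layout; adjust for your top version):

    # top -bH -n 1 | grep glusterfsd | sort -rn -k9 | head -5

That prints the five busiest glusterfsd threads by CPU for a single snapshot.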