John Madden
2009-Dec-03 22:10 UTC
[Gluster-users] client-side cpu usage, performance issue
I experienced some embarrassingly bad performance today from a two-node AFR used by two clients to store and share PHP sessions. (I ended up switching to NFS by the end of the day.) It was on average a few thousand sessions with a good smattering of create/write/read with pretty high concurrency due to some thousands of hits per minute.

I played with settings galore from threading to caching to writeback caching to client io threads and got about nowhere. The symptoms are extremely latent i/o requests and high client-side CPU usage but little if any server-side usage and no actual disk i/o to speak of.

All four nodes are virtualized RHEL 5 instances connected over Gbit. The last-used configs are below. Any ideas?

Server:

volume php-sessions
  type storage/posix
  option directory /var/glusterfs/php-sessions
end-volume

volume php-sessions-locks
  type features/locks
  option mandatory-locks on
  subvolumes php-sessions
end-volume

volume php-sessions-brick
  type performance/io-threads
  option thread-count 16 # default is 16
  subvolumes php-sessions-locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.nodelay on
  option auth.addr.php-sessions-brick.allow 1.2.3.4,1.2.3.5
  option listen-port 6996
  subvolumes php-sessions-brick
end-volume

Client:

volume gluster0
  type protocol/client
  option transport-type tcp
  option remote-host gluster0
  option remote-port 6996
  option transport.socket.nodelay on
  option remote-subvolume php-sessions-brick
end-volume

volume gluster1
  type protocol/client
  option transport-type tcp
  option remote-host gluster1
  option remote-port 6996
  option transport.socket.nodelay on
  option remote-subvolume php-sessions-brick
end-volume

volume mirror-0
  type cluster/replicate
  subvolumes gluster0 gluster1
end-volume

volume writeback
  type performance/write-behind
  option window-size 1MB
  subvolumes mirror-0
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writeback
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 4 # default is 16
  subvolumes io-cache
end-volume

TIA,
John

--
John Madden
Sr UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden at ivytech.edu
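
For anyone trying to reproduce this outside the PHP application, the workload described in this message (a few thousand small session files, mixed create/write/read, many concurrent requests) can be roughly approximated with a synthetic load generator. The following is only a sketch under assumptions: the function names, the `sess_` file prefix, the 2 kB payload and the default counts are all made up for illustration, and it does not emulate any file locking that a real session handler might perform. Pointing it at a GlusterFS mount and then at a local or NFS directory gives comparable latency figures.

```python
import os
import random
import sys
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def session_op(directory, session_id, payload):
    """Write one session file, read it back, and return the elapsed seconds."""
    path = os.path.join(directory, "sess_%06d" % session_id)  # hypothetical naming
    start = time.perf_counter()
    with open(path, "wb") as f:  # create or overwrite the session file
        f.write(payload)
    with open(path, "rb") as f:  # read it back, as the next request would
        f.read()
    return time.perf_counter() - start


def run_workload(directory, sessions=2000, requests=10000, workers=32,
                 size=2048, seed=0):
    """Issue `requests` write+read operations spread over `sessions` files."""
    rng = random.Random(seed)
    payload = b"x" * size
    ids = [rng.randrange(sessions) for _ in range(requests)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = sorted(
            pool.map(lambda i: session_op(directory, i, payload), ids))
    return {
        "requests": len(latencies),
        "median_s": latencies[len(latencies) // 2],
        "p99_s": latencies[int(len(latencies) * 0.99)],
        "max_s": latencies[-1],
    }


if __name__ == "__main__":
    # Usage: python session_load.py /mnt/glusterfs/php-sessions
    # Without an argument, a throwaway temporary directory is used.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    print(run_workload(target))
```

Watching `top` on the client while this runs should show whether the glusterfs process, rather than the load generator, is the one consuming CPU.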
Anush Shetty
2009-Dec-07 02:23 UTC
[Gluster-users] client-side cpu usage, performance issue
Hi John,

For reading small files, you could try using the Quick-read translator.

http://gluster.com/community/documentation/index.php/Translators/performance/quick-read

Also, we would like to know the GlusterFS version no. used for this setup.

- Anush
Tejas N. Bhise
2009-Dec-07 16:51 UTC
[Gluster-users] client-side cpu usage, performance issue
Hi John,

Thank you for sharing information about your setup. Is the application you are using something that we can easily set up and use to generate more diagnostics in-house at Gluster?

Regards,
Tejas.

----- Original Message -----
From: "John Madden" <jmadden at ivytech.edu>
To: "Anush Shetty" <anush at gluster.com>
Cc: gluster-users at gluster.org
Sent: Monday, December 7, 2009 8:20:52 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: [Gluster-users] client-side cpu usage, performance issue

> For reading small files, you could try using Quick-read translator.
>
> http://gluster.com/community/documentation/index.php/Translators/performance/quick-read

This was one of the caching options I explored but it didn't seem to help. Also, keep in mind that it isn't just reading but lots of writes too.

> Also we would like to know the GlusterFS version no. used for this setup.

Apologies. This is on 2.0.8, pre-built RPMs off the site.

John
Krzysztof Strasburger
2009-Dec-09 06:07 UTC
[Gluster-users] client-side cpu usage, performance issue
> I hear you, I had followed this article specifically in an attempt to
> improve performance. I suppose I'm hoping for more specifics on how
> these values correspond to an application, a number of cpu's, a brand of
> network card, etc. io-threads counts, for example, only seem to drive
> load average higher, as they all sit there chewing up cpu anyway, so you
> lower them and get lower overall system load but higher latency. But
> why would the glusterfs process need cpu time anyway?
>
> John

IMHO adding new translators here is the wrong way to solve the problem, as it could be related to the high memory usage by the client, which I reported. No extra caches, no io-threads, just a plain unify-over-replicate setup. I also observed high CPU load and high latency, but concentrated on the memory usage. The same behavior occurred with plain striping, so it seems to be setup-independent.

I would suspect aggressive caching of some data describing every file touched by the glusterfs client. In fact, these data seem to be kept forever, as the memory is never freed, and no new allocations are made if the same files are accessed a second time. It is sufficient to run ls -R or du on a big directory tree and run top to see the memory usage of the glusterfs client increasing nicely up to hundreds of megabytes. John, do you see it too? Of course, cache lookups become expensive and we have high CPU load. I tried to find it in the code, but no success :(.

Dear developers, wouldn't it be better to forget everything about a file that has been closed? Just tell me where to search in the sources, if you are overloaded with other work.

Krzysztof

BTW, glusterfs version is 2.0.8.
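
The observation above (walk a big tree with ls -R or du, watch the client's memory in top) can be turned into a repeatable measurement. The sketch below is one possible way to do it, not something posted in the thread: it assumes a Linux client where `/proc/<pid>/status` exposes a `VmRSS:` line, and the function names are invented for illustration. It stats every entry under a directory, as `ls -R` or `du` would, and records the resident memory of a given process before and after each pass. If the hypothesis holds, the first pass should show a large increase and the second pass almost none.

```python
import os
import sys


def parse_vmrss_kb(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the resident set size of `pid` in kB (Linux /proc only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())


def walk_tree(root):
    """lstat every entry below root, like `ls -R` or `du`; return the count."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                count += 1
            except OSError:
                pass  # entry disappeared while walking
    return count


def measure(root, pid, passes=2):
    """Walk `root` several times, sampling the RSS of `pid` around each pass."""
    results = []
    for _ in range(passes):
        before = read_rss_kb(pid)
        entries = walk_tree(root)
        after = read_rss_kb(pid)
        results.append((entries, before, after))
    return results


if __name__ == "__main__":
    # Usage: python rss_walk.py /mnt/glusterfs <pid of the glusterfs client>
    root, pid = sys.argv[1], int(sys.argv[2])
    for n, (entries, before, after) in enumerate(measure(root, pid), 1):
        print("pass %d: %d entries, RSS %s kB -> %s kB" % (n, entries, before, after))
```

Two passes are used by default because the claim has two parts: memory grows when files are first touched, and does not grow again when the same files are revisited.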