Scott Martin
2010-Jan-14 22:49 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
I'm trying to rsync a directory containing around 20K small files totaling 100 MB into a new Gluster volume. I'm getting super-slow performance, with the rsync taking on the order of an hour to complete. Is this a normal performance profile for writing many small files? I mean, I understand that writes will be slower than reads, but should it really be this slow?

I have two machines running the server and one of those machines mounting as a client. The rsync is going from the local disk on the second machine to the Gluster volume. The two machines are both in Amazon EC2, so their network connection should not be a bottleneck. Any help would be greatly appreciated.

Configuration files follow, mostly from here:
http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files

Server configuration:

volume posix
  type storage/posix
  option directory /mnt/gluster-files
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume iothreads
  type performance/io-threads
  subvolumes locks
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 1000MB # default is equal to aggregate-size
  option flush-behind off # default is 'off'
  # too aggressive and slow background flush!
  # do not enable for php sessions behaviour
  subvolumes iothreads
end-volume

volume brick
  type performance/io-cache
  option cache-size 2000MB # default is 32MB
  # option priority *.h:3,*.html:2,*:1 # default is '*:0'
  option cache-timeout 1 # default is 1 second
  subvolumes writebehind
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  subvolumes brick
end-volume

Client configuration:

volume master
  type protocol/client
  option transport-type tcp
  option remote-host 10.251.39.98
  option remote-subvolume brick
end-volume

volume slave01
  type protocol/client
  option transport-type tcp
  option remote-host 10.250.159.143
  option remote-subvolume brick
end-volume

volume nufa
  type cluster/nufa
  option local-volume-name slave01
  subvolumes master slave01
end-volume

volume iocache
  type performance/io-cache
  option cache-size 1000MB # default is 32MB
  option cache-timeout 1 # default is 1 second
  subvolumes nufa
end-volume

volume writeback
  type performance/write-behind
  option cache-size 500MB # default is equal to aggregate-size
  option flush-behind off # default is 'off'
  # too aggressive and slow background flush!
  # do not enable for php sessions behaviour
  subvolumes iocache
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1 # default 1 second
  option max-file-size 256KB # default 64Kb
  subvolumes writeback
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 16 # default is 16
  subvolumes quickread
end-volume
Andre Felipe Machado
2010-Jan-15 12:38 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
Hello,

It seems that you did not use the option

transport.socket.nodelay on # undocumented option for speed
# http://gluster.org/pipermail/gluster-users/2009-September/003158.html

as listed at [0]. This option affects small-file performance. I have not yet tested these suggested configs on v3.x.

Regards.
Andre Felipe

[0] http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
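As a rough sketch of where this option might sit, the fragment below adds it to one of the protocol/client volumes and to the protocol/server volume from the configuration posted at the start of this thread. The placement in these two translator types is an assumption based on the linked guide and has not been verified against v3.x; the same line would presumably also be needed in the slave01 client volume.

```
# Client side (assumed placement: each protocol/client volume)
volume master
  type protocol/client
  option transport-type tcp
  option transport.socket.nodelay on # option named in this reply; placement is an assumption
  option remote-host 10.251.39.98
  option remote-subvolume brick
end-volume

# Server side (assumed placement: the protocol/server volume)
volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.nodelay on # assumption, as above
  option auth.addr.brick.allow *
  subvolumes brick
end-volume
```

Both ends would need to be restarted or remounted for a volfile change like this to take effect.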
Andre Felipe Machado
2010-Jan-15 12:47 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
Hello,

It also seems that you did not use the option

lookup-unhashed off # off will reduce cpu usage, and network

in the sections described at [0].

Also, you must tune the caches, thread counts, and various timeouts for your hardware, network, and application behaviour.

Regards.
Andre Felipe

[0] http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
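A minimal sketch of this suggestion applied to the client configuration from the start of this thread follows. It assumes that cluster/nufa accepts lookup-unhashed in the same way as cluster/distribute; that assumption should be checked against the installed version before relying on it.

```
volume nufa
  type cluster/nufa
  option local-volume-name slave01
  option lookup-unhashed off # from this reply; assumed to be valid for cluster/nufa
  subvolumes master slave01
end-volume
```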
John Madden
2010-Jan-15 16:10 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
> Configuration files follow, mostly from here:
> http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files

I've followed this guide as well and found it to be helpful but not entirely correct. I've found the quickread translator to be more of a burden than a benefit -- try your setup without it? Also, writebehind is a client-side thing, so remove that from the server (server is a normal userspace process, so it'll get the benefit of kernel filesystem cache).

John

--
John Madden
Sr UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden at ivytech.edu
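The two changes suggested above could look roughly like the following, using the volume names from the original post. Only the volumes whose subvolumes line changes are shown; the rewiring is an illustration of the suggestion, not a tested configuration.

```
# Server: drop the writebehind volume and point io-cache straight at io-threads
volume brick
  type performance/io-cache
  option cache-size 2000MB
  option cache-timeout 1
  subvolumes iothreads # was: writebehind
end-volume

# Client: drop the quickread volume and point io-threads at write-behind
volume iothreads
  type performance/io-threads
  option thread-count 16
  subvolumes writeback # was: quickread
end-volume
```

The removed volume definitions (writebehind on the server, quickread on the client) would be deleted from their respective volfiles so that nothing references them.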