Scott Martin
2010-Jan-14 22:49 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
I'm trying to rsync a directory containing around 20K small files totaling
100 MB into a new Gluster volume. I'm getting super-slow performance, with
the rsync taking on the order of an hour to complete. Is this a normal
performance profile for writing many small files? I mean, I understand that
writes will be slower than reads, but should it really be this slow?
I have two machines running the server and one of those machines mounting as
a client. The rsync is going from the local disk on the second machine to
the Gluster volume. The two machines are both in Amazon EC2, so their
network connection should not be a bottleneck.
Any help would be greatly appreciated.
Configuration files follow, mostly from here:
http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
Server configuration:
volume posix
  type storage/posix
  option directory /mnt/gluster-files
end-volume
volume locks
  type features/locks
  subvolumes posix
end-volume
volume iothreads
  type performance/io-threads
  subvolumes locks
end-volume
volume writebehind
  type performance/write-behind
  option cache-size 1000MB # default is equal to aggregate-size
  option flush-behind off # default is 'off'
  # too aggressive and slow background flush!
  # do not enable for php sessions behaviour
  subvolumes iothreads
end-volume
volume brick
  type performance/io-cache
  option cache-size 2000MB # default is 32MB
  # option priority *.h:3,*.html:2,*:1 # default is '*:0'
  option cache-timeout 1 # default is 1 second
  subvolumes writebehind
end-volume
volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  subvolumes brick
end-volume
Client configuration:
volume master
  type protocol/client
  option transport-type tcp
  option remote-host 10.251.39.98
  option remote-subvolume brick
end-volume
volume slave01
  type protocol/client
  option transport-type tcp
  option remote-host 10.250.159.143
  option remote-subvolume brick
end-volume
volume nufa
  type cluster/nufa
  option local-volume-name slave01
  subvolumes master slave01
end-volume
volume iocache
  type performance/io-cache
  option cache-size 1000MB # default is 32MB
  option cache-timeout 1 # default is 1 second
  subvolumes nufa
end-volume
volume writeback
  type performance/write-behind
  option cache-size 500MB # default is equal to aggregate-size
  option flush-behind off # default is 'off'
  # too aggressive and slow background flush!
  # do not enable for php sessions behaviour
  subvolumes iocache
end-volume
volume quickread
  type performance/quick-read
  option cache-timeout 1 # default 1 second
  option max-file-size 256KB # default 64Kb
  subvolumes writeback
end-volume
volume iothreads
  type performance/io-threads
  option thread-count 16 # default is 16
  subvolumes quickread
end-volume
Andre Felipe Machado
2010-Jan-15 12:38 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
Hello,
It seems that you did not use the
option transport.socket.nodelay on # undocumented option for speed
# http://gluster.org/pipermail/gluster-users/2009-September/003158.html
as listed at [0].
This option affects performance with small files.
I have not yet tested these suggested configs on v3.x.
Regards.
Andre Felipe
[0] http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
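For illustration, here is roughly where that option would sit in the first client volume from the original post. This is only a sketch: it assumes the option is set on the protocol/client volumes (and presumably on the protocol/server volume as well), and, as noted above, it is untested on v3.x.

```
volume master
  type protocol/client
  option transport-type tcp
  option transport.socket.nodelay on # assumed placement; untested on v3.x
  option remote-host 10.251.39.98
  option remote-subvolume brick
end-volume
```

The slave01 volume would take the same extra line.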
Andre Felipe Machado
2010-Jan-15 12:47 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
Hello,
It also seems that you did not use the
option lookup-unhashed off # off will reduce cpu usage, and network
at sections as described at [0].
Also, you must tune caches, thread counts and various timeouts for your
hardware, network and app behaviour.
Regards.
Andre Felipe
[0] http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
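Applied to the client configuration from the original post, the option might be placed as sketched below. This assumes cluster/nufa accepts lookup-unhashed the same way cluster/distribute does, which has not been verified here.

```
volume nufa
  type cluster/nufa
  option local-volume-name slave01
  option lookup-unhashed off # assumption: honoured by cluster/nufa as by cluster/distribute
  subvolumes master slave01
end-volume
```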
John Madden
2010-Jan-15 16:10 UTC
[Gluster-users] Very slow writes of many small files in NUFA configuration, is this expected?
> Configuration files follow, mostly from here:
> http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
I've followed this guide as well and found it to be helpful but not
entirely correct. I've found the quickread translator to be more of a
burden than a benefit -- try your setup without it? Also, writebehind is a
client-side thing, so remove that from the server (server is a normal
userspace process, so it'll get the benefit of kernel filesystem cache).
John
--
John Madden
Sr UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden at ivytech.edu
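A sketch of those two changes against the configuration in the original post follows. Only the volumes whose subvolumes line changes are shown. The writebehind volume is assumed to be deleted from the server file and the quickread volume from the client file, with everything else left as posted. This is untested.

```
# Server file: io-cache now sits directly on io-threads (write-behind removed)
volume brick
  type performance/io-cache
  option cache-size 2000MB
  option cache-timeout 1
  subvolumes iothreads
end-volume

# Client file: io-threads now sits directly on write-behind (quick-read removed)
volume iothreads
  type performance/io-threads
  option thread-count 16
  subvolumes writeback
end-volume
```

The exported volume keeps the name brick, so the remote-subvolume lines on the client would not need to change.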