Rafiq Maniar
2010-Nov-18 21:38 UTC
[Gluster-users] GlusterFS3.1 - Bad Write Performance Unzipping Many Small Files
Hi,

I'm using GlusterFS 3.1 on Ubuntu 10.04 in a dual-replication setup on Amazon EC2. It takes 40-50 seconds to unzip an 8MB zip file full of small files and directories to a gluster mount, compared to 0.8 seconds to local disk.

The volume was created with:

    gluster volume create volname replica 2 transport tcp server1:/shared server2:/shared

I am mounting on the client via NFS with:

    mount -t nfs -o async,noatime,nodiratime server1:/shared /mnt/shared

And I have also tried the Gluster native client with:

    mount -t glusterfs server1:/shared /mnt/shared

I found a post where the author describes a similarly slow unzip of the Linux kernel:
http://northernmost.org/blog/improving-glusterfs-performance/

I believe the 'nodelay' option was implemented in response to this, and I have tried using it in the three configuration files on the servers, but with no improvement. I've also tried some other performance-tuning tricks I found on the web.

I tried the same test on another server running Gluster 3.0 with an NFS share but no replication, and it completes in 3 seconds. I see similarly bad performance with a simple copy of the same files and directories from /tmp into the gluster mount, so it isn't limited to zip.

Here is my /etc/glusterd/vols/shared/shared-fuse.vol. Bear in mind that this is 'tuned', but the out-of-the-box version performs the same. I also tried removing all the performance translators, as per someone's suggestion in IRC.

    volume shared-client-0
        type protocol/client
        option remote-host server1
        option remote-subvolume /mnt/shared
        option transport-type tcp
        option transport.socket.nodelay on
    end-volume

    volume shared-client-1
        type protocol/client
        option remote-host server2
        option remote-subvolume /mnt/shared
        option transport-type tcp
        option transport.socket.nodelay on
    end-volume

    volume shared-replicate-0
        type cluster/replicate
        subvolumes shared-client-0 shared-client-1
    end-volume

    volume shared-write-behind
        type performance/write-behind
        option cache-size 100MB
        option flush-behind off
        subvolumes shared-replicate-0
    end-volume

    volume shared-read-ahead
        type performance/read-ahead
        subvolumes shared-write-behind
    end-volume

    volume shared-io-cache
        type performance/io-cache
        option cache-size 100MB
        option cache-timeout 1
        subvolumes shared-read-ahead
    end-volume

    volume shared-quick-read
        type performance/quick-read
        option cache-timeout 1      # default 1 second
        option max-file-size 256KB  # default 64KB
        subvolumes shared-io-cache
    end-volume

    volume shared
        type debug/io-stats
        subvolumes shared-quick-read
    end-volume

And my /etc/glusterd/vols/shared/shared.server1.mnt-shared.vol:

    volume shared-posix
        type storage/posix
        option directory /mnt/shared
    end-volume

    volume shared-access-control
        type features/access-control
        subvolumes shared-posix
    end-volume

    volume shared-locks
        type features/locks
        subvolumes shared-access-control
    end-volume

    volume shared-io-threads
        type performance/io-threads
        option thread-count 16
        subvolumes shared-locks
    end-volume

    volume /mnt/shared
        type debug/io-stats
        subvolumes shared-io-threads
    end-volume

    volume shared-server
        type protocol/server
        option transport-type tcp
        option auth.addr./mnt/shared.allow *
        option transport.socket.nodelay on
        subvolumes /mnt/shared
    end-volume

Here's the output of nfsstat on the client:

    Client rpc stats:
    calls       retrans     authrefrsh
    1652499     231         124

    Client nfs v3:
    null         getattr      setattr      lookup       access       readlink
    0        0%  744498  45%  32762    1%  490843  29%  235276  14%  37       0%
    read         write        create       mkdir        symlink      mknod
    52085    3%  21940    1%  14452    0%  948      0%  1        0%  0        0%
    remove       rmdir        rename       link         readdir      readdirplus
    10961    0%  562      0%  19       0%  0        0%  135      0%  32623    1%
    fsstat       fsinfo       pathconf     commit
    140      0%  46       0%  23       0%  15126    0%

Anyone got any ideas on improving the performance of this?

Thanks,
Rafiq
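A minimal shell sketch of the timing comparison described above, for anyone wanting to reproduce it. The archive path /tmp/files.zip, the scratch directory /tmp/local, and the unziptest/copytest directory names are illustrative assumptions, not taken from the original post:

    #!/bin/sh
    # Compare unzip time on local disk vs. the gluster/NFS mount.
    ZIP=/tmp/files.zip   # the ~8MB archive of many small files (assumed path)

    # Baseline: extract to local disk (reported ~0.8 seconds)
    rm -rf /tmp/local && mkdir -p /tmp/local
    time unzip -q "$ZIP" -d /tmp/local

    # Same archive extracted onto the gluster mount (reported 40-50 seconds)
    rm -rf /mnt/shared/unziptest && mkdir -p /mnt/shared/unziptest
    time unzip -q "$ZIP" -d /mnt/shared/unziptest

    # Plain recursive copy of the extracted tree, to confirm the slowdown
    # is not specific to unzip
    rm -rf /mnt/shared/copytest
    time cp -r /tmp/local /mnt/shared/copytest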
Jacob Shucart
2010-Nov-24 16:53 UTC
[Gluster-users] GlusterFS3.1 - Bad Write Performance Unzipping Many Small Files
Rafiq,

We have identified a bug, to be fixed in 3.1.1, which should be out very soon and should help with this.

-Jacob

-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Rafiq Maniar
Sent: Thursday, November 18, 2010 1:39 PM
To: gluster-users at gluster.org
Subject: [Gluster-users] GlusterFS3.1 - Bad Write Performance Unzipping Many Small Files
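Once 3.1.1 is out, a minimal sketch of how to confirm the upgrade on each node. The volume name 'shared' is taken from the volfiles above; substitute the actual volume name if it differs:

    # Check the installed GlusterFS version on each server and client
    glusterfs --version

    # Confirm the volume is online and still configured as a two-brick replica
    gluster volume info shared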