Greetings, I am having what I perceive to be AFR performance problems. Before I get to that, I will briefly describe the setup...

/** Setup **/

I have glusterfs up and running using the gluster-optimized fuse module on the latest CentOS 5 kernel (2.6.18-164.6.1.el5 #1 SMP) on two machines (server and client volume configurations are below). Both the server and the client run on both machines. The two servers are connected by a single CAT6 cable running directly between Gigabit NICs dedicated to this task (no switch is used). My goal is simply to mirror files across both servers. The files themselves are mixed, but there are many of them and about 90% are under 50K.

Each server runs a quad-core Q6600 processor with 8GB of RAM. The disks are quite speedy - 15K RPM SAS drives on a 3ware controller (RAID 5 with a 512MB cache). The filesystem is ext3 mounted with noatime. Writing directly to the ext3 partition with

  dd if=/dev/zero of=/sites/disktest bs=1M count=2048

yields:

  2147483648 bytes (2.1 GB) copied, 4.68686 seconds, 458 MB/s

Kernel optimizations on both servers beyond a stock CentOS 5 setup include the following, specific to the 3ware controller and intended to avoid iowait latency under load:

  echo 64 > /sys/block/sda/queue/max_sectors_kb
  /sbin/blockdev --setra 8192 /dev/sda
  echo 128 > /sys/block/sda/queue/nr_requests
  echo 64 > /sys/block/sda/device/queue_depth
  echo 10 > /proc/sys/vm/swappiness
  echo 16 > /proc/sys/vm/page-cluster
  echo 2 > /proc/sys/vm/dirty_background_ratio
  echo 40 > /proc/sys/vm/dirty_ratio

Tweaks for better network performance (sysctl.conf):

  net/core/rmem_max = 8738000
  net/core/wmem_max = 8738000
  net/ipv4/tcp_rmem = 8192 873800 8738000
  net/ipv4/tcp_wmem = 4096 873800 8738000

/** Gluster Results **/

It should be noted that I did not see high CPU or iowait times during the tests below, and no other active processes were running on either server.

Doing a simple write test on the gluster mount with

  dd if=/dev/zero of=/sites/sites/glustertest bs=1M count=2048

I am seeing:

  2048+0 records in
  2048+0 records out
  2147483648 bytes (2.1 GB) copied, 27.8451 seconds, 77.1 MB/s

which is acceptable for my purposes. I expected around 80 MB/s, with the gigabit NICs being the obvious bottleneck.

For a more real-world test using the actual files to be clustered, I took a small subset of the files (22016 of them, 440M in total), extracted them from a tarball onto /sites/sites (the mount point for the tests) on the replicated cluster, and then timed unlinking and reading them. For comparison I ran the same operations directly against the ext3 partition:

  extract (22016 files, 440M):  gluster mount 17m28.972s   ext3 0m5.102s
  unlink:                       gluster mount 0m28.428s    ext3 0m0.456s
  read:                         gluster mount 0m19.871s

You can see from my config that these times are with "option flush-behind on" for write-behind.

During the write test I monitored NIC stats on the receiving server to see how much the link was utilized - its peak was 4.80Mb - so the NIC was not the bottleneck either. I just cannot find the hold-up; the network, disks, and CPU are not loaded during the write test.

So the biggest issue seems to be AFR write performance. Is this normal, or is there something specific to my setup causing these problems? Obviously I am new to glusterfs, so I do not know what to expect, but I think I must be doing something wrong. Any help/advice/direction is greatly appreciated. I have googled and googled and found no advice that has yielded real results.
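In case it helps to reproduce the numbers above, the small-file tests were essentially the following (the tarball name and the ext3 target directory are just placeholders; /sites/sites is the gluster mount and /sites is the underlying ext3 partition):

  # extract the 22016-file subset onto the replicated gluster mount
  time tar xf /root/sites-subset.tar -C /sites/sites
  # same extraction straight onto the ext3 partition for comparison
  time tar xf /root/sites-subset.tar -C /sites/ext3test
  # unlink the files again on the gluster mount
  time rm -rf /sites/sites/sites-subset
  # read them back by streaming everything through tar to /dev/null
  time tar cf /dev/null /sites/sites/sites-subset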
Sorry if I missed something obvious that was documented.

Michael

Volume files (same on each server) were first created using:

  /usr/bin/glusterfs-volgen --raid 1 --cache-size 512MB --export-directory /sites_gfs --name sites1 172.16.0.1 172.16.0.2

/** Server - adapted from generated to add one other directory **/

volume posix_sites
  type storage/posix
  option directory /sites_gfs
end-volume

volume posix_phplib
  type storage/posix
  option directory /usr/local/lib/php_gfs
end-volume

volume locks_sites
  type features/locks
  subvolumes posix_sites
end-volume

volume locks_phplib
  type features/locks
  subvolumes posix_phplib
end-volume

volume brick_sites
  type performance/io-threads
  option thread-count 8
  subvolumes locks_sites
end-volume

volume brick_phplib
  type performance/io-threads
  option thread-count 8
  subvolumes locks_phplib
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick_sites.allow *
  option auth.addr.brick_phplib.allow *
  option listen-port 6996
  subvolumes brick_sites brick_phplib
end-volume

/** Client - adapted from generated to try and fix write issues - to no avail **/

volume 172.16.0.1
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.0.1
  option remote-port 6996
  option remote-subvolume brick_sites
end-volume

volume 172.16.0.2
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.0.2
  option remote-port 6996
  option remote-subvolume brick_sites
end-volume

volume mirror-0
  type cluster/replicate
  subvolumes 172.16.0.1 172.16.0.2
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 1MB
  option flush-behind on
  subvolumes mirror-0
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 64MB
  subvolumes writebehind
end-volume
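For completeness, the client volume above is mounted at /sites/sites with the stock glusterfs client on each machine, roughly like this (the volfile path is just where I happen to keep the client file shown above):

  # mount the replicated volume using the client volfile
  glusterfs -f /etc/glusterfs/client.vol /sites/sites

  # or the equivalent fstab entry
  /etc/glusterfs/client.vol  /sites/sites  glusterfs  defaults  0  0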