RedShift
2013-Oct-06 07:40 UTC
[Gluster-users] Options to turn off/on for reliable virtual machine writes & write performance
Hi all,

I'm building a cluster to host virtual machines for ESXi hosts (over NFS). The point of the cluster is that it should survive an unclean node death (test scenario: hard-removing disks, cutting power, etc.), so I need to make sure all writes are completed on both nodes before Gluster returns the operation as complete. For now, I have this:

gluster> volume info ha-ds1

Volume Name: ha-ds1
Type: Replicate
Volume ID: da2fb668-2f3e-4839-a5da-4a51d5fcba05
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.255.255.1:/vol/gluster/ha-ds1
Brick2: 10.255.255.2:/vol/gluster/ha-ds1
Options Reconfigured:
cluster.self-heal-daemon: on
performance.flush-behind: Off
network.frame-timeout: 30
network.ping-timeout: 15
cluster.heal-timeout: 300

gluster> volume status all detail

Status of volume: ha-ds1
------------------------------------------------------------------------------
Brick            : Brick 10.255.255.1:/vol/gluster/ha-ds1
Port             : 49153
Online           : Y
Pid              : 2252
File System      : ext4
Device           : /dev/mapper/stor--node1-gluster
Mount Options    : rw,noatime,nodiratime,journal_checksum,data=journal,errors=panic,nodelalloc
Inode Size       : 256
Disk Space Free  : 219.3GB
Total Disk Space : 269.1GB
Inode Count      : 17924096
Free Inodes      : 17923263
------------------------------------------------------------------------------
Brick            : Brick 10.255.255.2:/vol/gluster/ha-ds1
Port             : 49152
Online           : Y
Pid              : 2319
File System      : ext4
Device           : /dev/mapper/stor--node2-gluster
Mount Options    : rw,noatime,nodiratime,journal_checksum,data=journal,errors=panic,nodelalloc
Inode Size       : 256
Disk Space Free  : 221.3GB
Total Disk Space : 269.1GB
Inode Count      : 17924096
Free Inodes      : 17923162

(I would also like to draw your attention to the mount options - are those OK, or can I do better?)

Is this enough to guarantee a proper cluster failover (data consistent at all times) to the second node without interruption to the virtual machines?
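Since the goal is that every write lands on both replicas before the call returns, one direction worth considering is disabling the client-side write caching and prefetch translators as well. This is only a sketch of options commonly suggested for VM-image workloads, not a verified recipe; check the option names and defaults against your Gluster version before applying them:

```shell
# Sketch, assuming volume name ha-ds1 as above. write-behind can
# acknowledge a write before it reaches the bricks, so turn it off
# if writes must be durable on both replicas before returning:
gluster volume set ha-ds1 performance.write-behind off
gluster volume set ha-ds1 performance.flush-behind off

# Caching/prefetch translators that can serve stale data for
# VM images accessed through NFS:
gluster volume set ha-ds1 performance.io-cache off
gluster volume set ha-ds1 performance.quick-read off
gluster volume set ha-ds1 performance.read-ahead off
gluster volume set ha-ds1 performance.stat-prefetch off
```

Note that disabling write-behind trades write throughput for the stronger durability guarantee, so it interacts directly with the performance question below.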
In my testing it appears to be, but I want to make sure - maybe someone else has something to add or something to look out for?

Second, I'd like to improve the write performance of this cluster. Reads are good (> 110 MB/s; the ESXi servers are connected via gigabit, so that is the maximum), but writes are only about half that (~60 MB/s). The hardware can definitely do more: a simple 16 GB dd write to the underlying filesystem nets ~227 MB/s. I gathered some statistics during sequential write tests; the load goes to ~15 with some CPU usage, but it looks like one CPU core spends the majority of its time in I/O wait. I know the hardware can perform better - are there any other places I should start looking?

Thanks,
Glenn
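PS: for comparing raw-disk and through-the-mount write speeds, a dd run with a forced flush keeps the page cache from inflating the number, and iostat shows which device is saturating during the test. A sketch - the target path is a placeholder, and the count is kept small here; raise it to several GB for a real benchmark:

```shell
# TARGET is a placeholder; on the cluster, point it at a file on the
# brick filesystem or on the Gluster/NFS mount to compare the two.
TARGET=${TARGET:-./ddtest.bin}

# conv=fdatasync forces a flush to stable storage before dd reports
# its throughput figure. 64 MiB here; use a much larger count for
# a meaningful sustained-write benchmark.
dd if=/dev/zero of="$TARGET" bs=1M count=64 conv=fdatasync

# Extended per-device stats, 3 samples at 2 s intervals (needs sysstat);
# a device pinned near 100% %util with high await is the bottleneck.
command -v iostat >/dev/null && iostat -x 2 3 || true
```

Comparing the fdatasync numbers for the brick filesystem versus the mounted volume helps separate disk-side overhead (e.g. the data=journal double write) from replication/network overhead.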