On 09/13/2012 05:24 PM, Bob Black wrote:
> Hello community of Gluster,
>
> Sorry for the long post.
> TL;DR: stock gluster 3.3.0 on 5 nodes results in massive data
> corruption on brick "failure" or peer disconnection.
>
> We are having problems with data corruption on VM volumes with VMs
> running on top of Gluster 3.3.0 when introducing brick failures and/or
> node disconnects.
> In our setup we have 5 storage nodes with 16-core AMD Opteron(tm)
> Processor 6128, 32GB RAM and 34 2TB SATA disks. To utilize the
> storage nodes we have 20 compute nodes with 24-core AMD Opteron(tm)
> Processor 6238 and 128GB RAM.
>
> To be able to test and verify this setup we installed Gluster 3.3.0 on
> the storage nodes and the GlusterFS 3.3.0 client on the compute nodes.
> We created one brick for each hard drive and created a
> Distributed-Replicate volume from the bricks with tcp,rdma transport.
> The volume was mounted with glusterfs over tcp transport over
> infiniband on all the compute nodes. We created 500 virtual machines
> on the compute nodes and made them do heavy IO benchmarking on the
> volume and Gluster performed as expected.
> Then we created a sanity test script that creates files, copies them
> over and over again, computes md5 sums of all written data and runs
> an md5 check of the entire operating system. We ran this test on all
> the VMs successfully, then we did it again and stopped one storage
> node for a few minutes and started it again, which Gluster recovered
> from successfully.
> Then we ran this test again but with kill -9 on all Gluster processes
> on one node for more than an hour. We kept the tests running to
> emulate load and then started the Gluster daemon on the storage node
> again. Now around 10% of all VMs lost connection to Gluster and fell
> back to a "read-only" file system, and more instances got some data
> corruption, missing or broken files. Very bad!
>
> We wiped the VMs and created new ones instead. Started the same test
> again but now we terminated 4 bricks on one node and carried out load
> testing to test shrinking and re-balancing. Before we got the chance
> to remove/move bricks we started getting a bunch of corrupted VMs and
> data corruption and after re-balancing we got a load of kernel panics
> on the VMs. Very bad indeed!
>
> Is anyone else having the same problem? Is there anything we are
> doing wrong? Is this a lost cause?
>
> Thanks for any input.
Hi Bob,
Thanks for the detailed bug/issue description. We will work on this with
priority.
Regards,
Amar
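
For anyone who wants to try reproducing the report: the original sanity test script was not posted, so the following is only a minimal sketch of the kind of check described in the quoted message (write data, copy it repeatedly, compare md5 sums). The function names, file names, sizes and copy count are all hypothetical, and the working directory is assumed to sit on the filesystem under test. Note that reads may be served from the page cache, so a real test would probably need to drop caches or re-read from another client.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(workdir, size_bytes=1 << 20, copies=5):
    """Write one random file, copy it in a chain, return paths whose md5 differs.

    All names and defaults here are illustrative assumptions, not the
    original poster's script.
    """
    data = os.urandom(size_bytes)
    expected = hashlib.md5(data).hexdigest()

    source = os.path.join(workdir, "sanity_0.bin")
    with open(source, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())  # push the data out to the underlying storage

    mismatches = []
    previous = source
    for i in range(1, copies + 1):
        target = os.path.join(workdir, f"sanity_{i}.bin")
        shutil.copyfile(previous, target)  # each copy is made from the previous one
        if md5_of(target) != expected:
            mismatches.append(target)
        previous = target
    return mismatches


if __name__ == "__main__":
    # A temporary directory stands in for a directory on the volume under test.
    with tempfile.TemporaryDirectory() as workdir:
        print("corrupted copies:", copy_and_verify(workdir))
```

Running this in a loop inside each VM while a brick or node is taken down and brought back would approximate the scenario above; a non-empty list points at copies whose contents changed.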