Hello, I have a cluster setup with 2 clients (gc1, gc2) and 2 servers (gs1, gs2). Both servers have 2 x 1TB HDDs in them, and gs1 and gs2 are replicated.

With my configuration below, if gs2 goes offline, should I still be able to access the cluster?

volume client1a
  type protocol/client
  option transport-type tcp/client
  option remote-host gs1
  option remote-port 7001
  option remote-subvolume brick
end-volume

volume client2a
  type protocol/client
  option transport-type tcp/client
  option remote-host gs2
  option remote-port 7001
  option remote-subvolume brick
end-volume

volume client1b
  type protocol/client
  option transport-type tcp/client
  option remote-host gs1
  option remote-port 7002
  option remote-subvolume brick
end-volume

volume client2b
  type protocol/client
  option transport-type tcp/client
  option remote-host gs2
  option remote-port 7002
  option remote-subvolume brick
end-volume

volume afr1
  type cluster/replicate
  subvolumes client1a client2a
end-volume

volume afr2
  type cluster/replicate
  subvolumes client1b client2b
end-volume

volume distribute
  type cluster/distribute
  subvolumes afr1 afr2
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 8
  subvolumes distribute
end-volume

volume readahead
  type performance/read-ahead
  option page-count 8
  subvolumes iothreads
end-volume
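To make the topology in this volfile concrete, here is a small illustrative model (my own sketch, not GlusterFS code) of which physical bricks end up holding a given file. The server names and ports come from the config above; the CRC32 hash is only a stand-in for GlusterFS's real elastic hashing:

```python
# Toy model of the distribute-over-replicate layout in the volfile above.
# Each replica set (afr1, afr2) pairs a brick on gs1 with its twin on gs2,
# so every file is stored on BOTH servers regardless of which set it hashes to.
import zlib

REPLICA_SETS = {
    "afr1": [("gs1", 7001), ("gs2", 7001)],
    "afr2": [("gs1", 7002), ("gs2", 7002)],
}

def bricks_for(filename: str):
    """Pick one replica set by hashing the file name (distribute),
    then return every brick in that set (replicate)."""
    names = sorted(REPLICA_SETS)
    idx = zlib.crc32(filename.encode()) % len(names)
    return REPLICA_SETS[names[idx]]

for f in ["a.txt", "b.txt", "photo.jpg"]:
    print(f, "->", bricks_for(f))
```

Because every replica set spans both servers, losing gs2 alone should never make a file unreachable; losing both bricks of one set would.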
On 08/12/2009 10:33 AM, Simon Liang wrote:
> I have a 2 client (gc1, gc2) and 2 server (gs1, gs2) cluster setup. Both the servers have 2 x 1TB HDD in them, gs1 and gs2 are replicated.
>
> With my configuration below, if gs2 goes offline... should I still be able to have access to the cluster?

Yes? :-) I'd suggest trying it and seeing. I'm a proponent of the "pull the plug and see" model of testing before deploying. Too many people trust marketing material, or trust their own understanding and choice of configuration. It can be a big surprise, for instance, when people doing database backups discover that the database actually requires a restore, and the restore process does not work. Oops.

AFR puts the data on each of the boxes, has code in it to detect and deal with volumes being unavailable, and has "self-heal" capabilities to try to fix the data once the broken nodes are brought back into service. So in theory, yes. Try it out and see for yourself whether it works for you in practice. :-)

Cheers,
mark

--
Mark Mielke <mark at mielke.cc>
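The behavior described above can be sketched as a toy model (my own illustration, not GlusterFS internals): writes fan out to every live replica, reads fall back to any live one, and a crude "self-heal" catches up a brick that comes back online:

```python
# Toy model of cluster/replicate (AFR) during a "pull the plug" test.
class Brick:
    def __init__(self, name):
        self.name, self.online, self.data = name, True, {}

class Replicate:
    def __init__(self, bricks):
        self.bricks = bricks

    def write(self, key, value):
        live = [b for b in self.bricks if b.online]
        if not live:
            raise IOError("all replicas down")
        for b in live:          # writes go to every live replica
            b.data[key] = value

    def read(self, key):
        for b in self.bricks:   # first live replica that has the key wins
            if b.online and key in b.data:
                return b.data[key]
        raise IOError("no live replica holds " + key)

    def self_heal(self):
        # Copy anything one live brick has onto the other live bricks.
        live = [b for b in self.bricks if b.online]
        for src in live:
            for dst in live:
                for k, v in src.data.items():
                    dst.data.setdefault(k, v)

gs1, gs2 = Brick("gs1"), Brick("gs2")
afr = Replicate([gs1, gs2])
afr.write("f", "v1")
gs2.online = False            # gs2 goes offline
print(afr.read("f"))          # still served by gs1 -> v1
afr.write("g", "v2")          # lands on gs1 only
gs2.online = True             # gs2 comes back...
afr.self_heal()               # ...and self-heal catches it up
print(gs2.data["g"])          # -> v2
```

Real AFR tracks pending changes per-brick with extended attributes rather than blindly copying, but the shape of the failover and heal is the same, which is exactly what the pull-the-plug test exercises.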