Gandalf Corvotempesta
2014-Dec-23 10:08 UTC
[Gluster-users] New architecture: some advice needed
2014-12-22 13:56 GMT+01:00 Jeff Darcy <jdarcy at redhat.com>:
> The key tradeoffs here are storage utilization vs. performance. In
> general, erasure codes (disperse) will give better storage utilization
> than replication for the same level of performance. However, this might
> not be the case for N=3. With replication, that will protect against
> two failures. However, from the admin guide section on disperse:
>
> "_redundancy_ must be greater than 0, and the total number of bricks must
> be greater than 2 * _redundancy_"
>
> I interpret this to mean that for two-failure protection you would need
> at least five bricks. With three bricks disperse can only offer
> one-failure protection. In this case it's roughly equivalent to RAID-5,
> with only a 50% storage penalty vs. 100% for replica 2 offering the same
> protection.
>
> The other issue is performance. With disperse, all writes *and reads*
> must be done to all bricks, and at a stripe size equal to 512 times the
> number of bricks (minus those used for redundancy). This means more
> data transfer, especially for reads, and also more write contention than
> with replication. This being new code, some optimizations that already
> exist for replication do not yet exist for disperse even though they're
> applicable.

Thank you for the response.
So, if I understood properly, disperse is space-optimized but has a
performance penalty compared to standard replication.

What I would like is a RAID-6 equivalent, so replica 2 should be used to
get performance and redundancy, right?

Moreover, with replication, if something goes wrong I can always read the
raw files from the disks. This would not be possible with disperse, where
each file is split into multiple chunks, right?

Any hardware suggestions for my environment?
Gandalf Corvotempesta
2014-Dec-23 10:09 UTC
[Gluster-users] New architecture: some advice needed
2014-12-23 11:08 GMT+01:00 Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com>:
> Any hardware suggestions for my environment?

Another question: do you suggest using RAID-5 or RAID-6 on each storage
server, or using the disks directly as bricks?
Xavier Hernandez
2014-Dec-23 11:14 UTC
[Gluster-users] New architecture: some advice needed
On 12/23/2014 11:08 AM, Gandalf Corvotempesta wrote:
> 2014-12-22 13:56 GMT+01:00 Jeff Darcy <jdarcy at redhat.com>:
>> The key tradeoffs here are storage utilization vs. performance. In
>> general, erasure codes (disperse) will give better storage utilization
>> than replication for the same level of performance. However, this might
>> not be the case for N=3. With replication, that will protect against
>> two failures. However, from the admin guide section on disperse:
>>
>> "_redundancy_ must be greater than 0, and the total number of bricks must
>> be greater than 2 * _redundancy_"
>>
>> I interpret this to mean that for two-failure protection you would need
>> at least five bricks. With three bricks disperse can only offer
>> one-failure protection. In this case it's roughly equivalent to RAID-5,
>> with only a 50% storage penalty vs. 100% for replica 2 offering the same
>> protection.
>>
>> The other issue is performance. With disperse, all writes *and reads*
>> must be done to all bricks, and at a stripe size equal to 512 times the
>> number of bricks (minus those used for redundancy). This means more
>> data transfer, especially for reads, and also more write contention than
>> with replication. This being new code, some optimizations that already
>> exist for replication do not yet exist for disperse even though they're
>> applicable.
>
> Thank you for the response.
> So, if I understood properly, disperse is space-optimized but has a
> performance penalty compared to standard replication.

It has some performance penalty in some scenarios; however, it depends on
multiple factors. For some workloads EC performs better than replicate
thanks to the distribution of the load and the reduced amount of data
managed.

The only way to be sure for your use case would be to test it.

> What I would like is a RAID-6 equivalent, so replica 2 should be used to
> get performance and redundancy, right?

To get a RAID-6 equivalent you will need a replica 3 or a disperse 5:2
(5 bricks with 2 of redundancy). These are the smallest combinations that
tolerate two failed bricks.

> Moreover, with replication, if something goes wrong I can always read the
> raw files from the disks. This would not be possible with disperse, where
> each file is split into multiple chunks, right?

Yes, with replica you can read the files directly from the bricks. With
disperse the files are encoded and split, so they cannot be directly
recovered.

Xavi
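
The arithmetic discussed in this thread can be sketched in a few lines of
Python. This is only an illustration, not part of any Gluster tooling: the
function names are hypothetical, and the only rules encoded are the ones
quoted above (redundancy > 0, total bricks > 2 * redundancy, and a stripe
size of 512 bytes per non-redundancy brick). The "storage penalty" is
assumed to mean redundant capacity divided by usable capacity, which
matches the 50% and 100% figures mentioned for disperse 3:1 and replica 2.

```python
# Hypothetical helpers illustrating the sizing rules quoted in this thread.
# They do not call or model any real Gluster API.


def min_disperse_bricks(redundancy: int) -> int:
    """Smallest brick count satisfying bricks > 2 * redundancy."""
    if redundancy <= 0:
        raise ValueError("redundancy must be greater than 0")
    return 2 * redundancy + 1


def disperse_overhead(bricks: int, redundancy: int) -> float:
    """Redundant capacity as a fraction of usable capacity (assumed definition)."""
    if redundancy <= 0 or bricks <= 2 * redundancy:
        raise ValueError("need redundancy > 0 and bricks > 2 * redundancy")
    return redundancy / (bricks - redundancy)


def replica_overhead(replicas: int) -> float:
    """Extra full copies stored per usable copy."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return float(replicas - 1)


def disperse_stripe_size(bricks: int, redundancy: int) -> int:
    """Stripe size in bytes: 512 times the number of non-redundancy bricks."""
    if redundancy <= 0 or bricks <= 2 * redundancy:
        raise ValueError("need redundancy > 0 and bricks > 2 * redundancy")
    return 512 * (bricks - redundancy)


if __name__ == "__main__":
    # Compare the layouts mentioned in the thread.
    for failures in (1, 2):
        bricks = min_disperse_bricks(failures)
        replicas = failures + 1
        print(
            f"{failures} failure(s): disperse {bricks}:{failures} "
            f"penalty {disperse_overhead(bricks, failures):.0%}, "
            f"stripe {disperse_stripe_size(bricks, failures)} bytes; "
            f"replica {replicas} penalty {replica_overhead(replicas):.0%}"
        )
```

Under these assumptions, two-failure protection needs at least five
disperse bricks or three replicas, which agrees with the combinations
named above.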