Joe Julian
2017-Feb-22 18:27 UTC
[Gluster-users] distribute replicated volume and tons of questions
On 02/21/17 09:33, Gandalf Corvotempesta wrote:
> Some questions:
>
> 1) Can I start with a simple replicated volume and then move to a
> distributed, replicated one by adding more bricks? I would like to
> start with 3 disks and then add 3 more next month. It seems stupid,
> but this allows me to buy disks from different production batches.

Yes. You'll need to rebalance after you add a dht set so the hash table
can utilize the new subvolume(s). (A command sketch follows at the end
of this message.)

> 2) Let's assume (to keep it simple) a 1GB file with sharding enabled
> and a 100MB shard size.
> In a replicated volume with just 1 replicated brick, all shards (and
> thus the file) are placed on that brick (replicated to 3 servers).
> What in the case of 2 bricks? Will Gluster place shards 1 to 5 on
> brick1 and 6 to 10 on brick2, or does "distribution" only happen for
> the whole file? (For example, all shards for file1 are placed on
> brick1, and all shards for file2 are placed on brick2.)

My understanding is that the shards will be distributed using the same
distributed hash table algorithm as any other file. (See
https://joejulian.name/blog/dht-misses-are-expensive/ and the second
sketch at the end of this message.)

> 3) Based on question 2, when accessing a distributed file, will
> gluster read from all disks, increasing the available bandwidth and
> throughput?

That depends on where your bandwidth bottlenecks are.

> 4) Still keeping it simple, very simple: let's assume a VM with a
> 10GB disk image placed on a distributed replicated volume. This VM
> hosts a simple webserver with a simple, but huge, website.
> Users accessing the website will access different sections of the
> underlying disk image.
> Are these accesses distributed across the 2 bricks, doubling the read
> performance (and the write performance, as I can write on 2 disks at
> once)?

If your web servers are hitting the disk for every page load, you're
doing it wrong. As for your performance question, you are on the right
train of thought.

> 5) By using ZFS, should I use a redundant ZIL? What happens in case
> of ZIL failure? Usually, some data is lost, but Gluster replicates
> synchronously, so losing a ZIL on a single server should not be an
> issue, right? Is gluster able to recover from this automatically?

I can't answer ZFS questions. I, personally, don't feel it's worth all
the hype it's getting and I don't use it.
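For reference, expanding a replica-3 volume by one distribute set and
then rebalancing looks roughly like this. The volume name gv0 and the
host/brick paths are made up for illustration; substitute your own:

    # add a second replica set, turning a plain replica 3 volume
    # into a 2x3 distributed-replicate volume
    gluster volume add-brick gv0 replica 3 \
        server4:/data/brick1 server5:/data/brick1 server6:/data/brick1

    # recompute the layout and migrate existing files onto the new
    # subvolume, then watch progress
    gluster volume rebalance gv0 start
    gluster volume rebalance gv0 status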
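And if you want to see where shards actually land, something like the
following shows it (the mount point /mnt/gv0, the file name, and the
brick path are just examples):

    # on a client mount: ask gluster which bricks hold a given file
    getfattr -n trusted.glusterfs.pathinfo /mnt/gv0/vm.img

    # on a brick: shards after the first live under the hidden .shard
    # directory, named <gfid>.<n>; each shard file is hashed and
    # placed by DHT independently of the others
    ls /data/brick1/.shard/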
Gandalf Corvotempesta
2017-Feb-22 19:25 UTC
[Gluster-users] distribute replicated volume and tons of questions
2017-02-22 19:27 GMT+01:00 Joe Julian <joe at julianfamily.org>:
> I can't answer ZFS questions. I, personally, don't feel it's worth
> all the hype it's getting and I don't use it.

The alternative would be XFS, but:

1) I'm using XFS on a backup server. I've *NEVER* seen so many crashes
as in the last 2 months, not even in my 15 years of using ext[2-4].
2) ZFS has native bit-rot protection and data scrubbing.
3) ZFS has de-duplication and compression.
4) In case of a hard crash, try running an XFS fsck on a server with
12x 8TB disks... it would probably take a week or more. On our backup
server (8TB RAID-6) it took 18 hours.
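For anyone following along, the ZFS features above map to one-liners;
the pool name "tank" is only an example:

    zpool scrub tank                  # walk all blocks, verify checksums,
                                      # repair bit rot from redundant copies
    zfs set compression=lz4 tank      # transparent per-dataset compression
    zfs set dedup=on tank             # block-level de-duplication
                                      # (needs a lot of RAM)

And on XFS the equivalent of fsck is xfs_repair, run against the
unmounted device, e.g. xfs_repair /dev/sdb1. That is the operation that
took 18 hours on our 8TB array.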