thr3ads.net - Gluster users - [Gluster-users] Mechanisms for automatic management of Gluster [Nov 2013]

If this information is useful, please help other people find it:
Share via:

James

2013-Nov-27 15:42 UTC

[Gluster-users] Mechanisms for automatic management of Gluster

Hi,

This is along the lines of "tools for sysadmins". I plan on using
these algorithms for puppet-gluster, but will try to maintain them
separately as a standalone tool.

The problem: Given a set of bricks and servers, if they have a logical
naming convention, can an algorithm decide the ideal order. This could
allow parameters such as replica count, and
chained=true/false/offset#.

The second problem: Given a set of bricks in a volume, if someone adds
X bricks and removes Y bricks, is this valid, and what is the valid
sequence of add/remove brick commands.

I've written some code with test cases to try and figure this all out.
I've left out a lot of corner cases, but the boilerplate is there to
make it happen. Hopefully it's self explanatory. (gluster.py) Read and
run it.

Once this all works, the puppet-gluster use case is magic. It will be
able to take care of these operations for you (if you want).

For non puppet users, this will give admins the confidence to know
what commands they should _probably_ run in what order. I say probably
because we assume that if there's an error, they'll stop and inspect
first.

I haven't yet tried to implement the chained cases, or anything
involving striping. There are also some corner cases with some of the
current code. Once you add chaining and striping, etc, I realized it
was time to step back and ask for help :)

I hope this all makes sense. Comments, code, test cases are appreciated!

Cheers,

James
@purpleidea (irc/twitter)
https://ttboj.wordpress.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gluster.py
Type: text/x-python
Size: 16085 bytes
Desc: not available
URL:
<http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131127/27311df2/attachment.py>

Anand Avati

2013-Dec-11 23:06 UTC

head link

[Gluster-users] Mechanisms for automatic management of Gluster

James,
This is the right way to think about the problem. I have more specific
comments in the script, but just wanted to let you know this is a great
start.

Thanks!


On Wed, Nov 27, 2013 at 7:42 AM, James <purpleidea at gmail.com> wrote:
> Hi,
>
> This is along the lines of "tools for sysadmins". I plan on using
> these algorithms for puppet-gluster, but will try to maintain them
> separately as a standalone tool.
>
> The problem: Given a set of bricks and servers, if they have a logical
> naming convention, can an algorithm decide the ideal order. This could
> allow parameters such as replica count, and
> chained=true/false/offset#.
>
> The second problem: Given a set of bricks in a volume, if someone adds
> X bricks and removes Y bricks, is this valid, and what is the valid
> sequence of add/remove brick commands.
>
> I've written some code with test cases to try and figure this all out.
> I've left out a lot of corner cases, but the boilerplate is there to
> make it happen. Hopefully it's self explanatory. (gluster.py) Read and
> run it.
>
> Once this all works, the puppet-gluster use case is magic. It will be
> able to take care of these operations for you (if you want).
>
> For non puppet users, this will give admins the confidence to know
> what commands they should _probably_ run in what order. I say probably
> because we assume that if there's an error, they'll stop and
inspect
> first.
>
> I haven't yet tried to implement the chained cases, or anything
> involving striping. There are also some corner cases with some of the
> current code. Once you add chaining and striping, etc, I realized it
> was time to step back and ask for help :)
>
> I hope this all makes sense. Comments, code, test cases are appreciated!
>
> Cheers,
>
> James
> @purpleidea (irc/twitter)
> https://ttboj.wordpress.com/
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131211/7b45e67c/attachment.html>

Jeff Darcy

2014-Feb-12 17:24 UTC

head link

[Gluster-users] [Gluster-devel] Mechanisms for automatic management of Gluster

> This is along the lines of "tools for sysadmins". I plan on using
> these algorithms for puppet-gluster, but will try to maintain them
> separately as a standalone tool.
> 
> The problem: Given a set of bricks and servers, if they have a logical
> naming convention, can an algorithm decide the ideal order. This could
> allow parameters such as replica count, and
> chained=true/false/offset#.
> 
> The second problem: Given a set of bricks in a volume, if someone adds
> X bricks and removes Y bricks, is this valid, and what is the valid
> sequence of add/remove brick commands.
> 
> I've written some code with test cases to try and figure this all out.
> I've left out a lot of corner cases, but the boilerplate is there to
> make it happen. Hopefully it's self explanatory. (gluster.py) Read and
> run it.
> 
> Once this all works, the puppet-gluster use case is magic. It will be
> able to take care of these operations for you (if you want).
> 
> For non puppet users, this will give admins the confidence to know
> what commands they should _probably_ run in what order. I say probably
> because we assume that if there's an error, they'll stop and
inspect
> first.
> 
> I haven't yet tried to implement the chained cases, or anything
> involving striping. There are also some corner cases with some of the
> current code. Once you add chaining and striping, etc, I realized it
> was time to step back and ask for help :)
> 
> I hope this all makes sense. Comments, code, test cases are appreciated!
It's a good start.  For the chained case, you'd probably want to start
with something like this:

        # Convert the input into a "list of lists" like this:
        #   [
        #      [ 'host1', [ 'path1', 'path2', ... ],
        #      [ 'host2', [ 'path1', 'path2', ... ],
        #      ...
        #   ]
        out_list = []
        while in_list:
                first_host = in_list.pop()
                first_path = first_host[1].pop()
                # If there are any bricks left on this host, move the host to
                # the end so the next iteration will start with the next host.
                # Otherwise, we've used all bricks from this host so
discard.
                if first_host[1]:
                        in_list.append(first_host)
                second_host = in_list[0]
                second_host = second_host[1].pop()
                # Have we exhausted this host as well?
                if not second_host[1]:
                        del in_list[0]
               
out_list.append({'host':first_host[0],'path',first_path})
               
out_list.append({'host':second_host[0],'path',second_path})
        return out_list

(I haven't actually run this.  It's merely illustrative of the
algorithm.)

Can you spot the bug?  If one host has more bricks than the others, it might
run out of bricks on other hosts to pair with, so it'll end up pairing with
itself.  For example, consider the following input:

        H1P1, H1P2, H1P3, H1P4, H2P1, H2P2, H3P1, H3P2

This algorithm would yield the following replica pairs.

        H1P1 + H2P1
        H2P2 + H3P1
        H3P2 + H1P2
        H1P3 + H1P4 (oops)

Instead, we need to find this:

        H1P1 + H2P1
        H2P2 + H1P2
        H1P3 + H3P1
        H3P2 + H1P4

I would actually not try to deal with this in the loop above.  Why not?
Because that loop's already going to get a bit hairy when it's enhanced
to
handle replica counts greater than two.  Instead, I would deal with the
imbalance cases *up front* - check the number of bricks for each host, then
equalize them e.g. by splitting a host with many bricks into two virtual hosts
separated by enough others that they'll never pair with one another.

Alternatively, one could do a recursive implementation, roughly like this:

        if less than rep_factor hosts left, fail
        pick rep_factor bricks from different hosts
        loop:
                pass remainder to recursive call
                if result is valid, combine and return
                pick a *different* rep_factor bricks from different hosts

That will generate *some* valid order if any exists, but it will tend toward
sub-optimal orders where e.g. all of X's bricks are paired with all of
Y's
instead of being spread around.  There might be some sort of
"optimization"
pass we could do that would swap replica-set members to address this, but
I'm
sure you can see it's already becoming a hard problem.  I'd have to code
up
full versions of both algorithms and run them on many different inputs to say
with any confidence which is better.

Gluster users - Nov 2013 - Mechanisms for automatic management of Gluster

[Gluster-users] Mechanisms for automatic management of Gluster

[Gluster-users] Mechanisms for automatic management of Gluster

[Gluster-users] [Gluster-devel] Mechanisms for automatic management of Gluster