James
2013-Nov-27 15:42 UTC
[Gluster-users] Mechanisms for automatic management of Gluster
Hi, This is along the lines of "tools for sysadmins". I plan on using these algorithms for puppet-gluster, but will try to maintain them separately as a standalone tool. The problem: Given a set of bricks and servers, if they have a logical naming convention, can an algorithm decide the ideal order. This could allow parameters such as replica count, and chained=true/false/offset#. The second problem: Given a set of bricks in a volume, if someone adds X bricks and removes Y bricks, is this valid, and what is the valid sequence of add/remove brick commands. I've written some code with test cases to try and figure this all out. I've left out a lot of corner cases, but the boilerplate is there to make it happen. Hopefully it's self explanatory. (gluster.py) Read and run it. Once this all works, the puppet-gluster use case is magic. It will be able to take care of these operations for you (if you want). For non puppet users, this will give admins the confidence to know what commands they should _probably_ run in what order. I say probably because we assume that if there's an error, they'll stop and inspect first. I haven't yet tried to implement the chained cases, or anything involving striping. There are also some corner cases with some of the current code. Once you add chaining and striping, etc, I realized it was time to step back and ask for help :) I hope this all makes sense. Comments, code, test cases are appreciated! Cheers, James @purpleidea (irc/twitter) https://ttboj.wordpress.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: gluster.py Type: text/x-python Size: 16085 bytes Desc: not available URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131127/27311df2/attachment.py>
Anand Avati
2013-Dec-11 23:06 UTC
[Gluster-users] Mechanisms for automatic management of Gluster
James, This is the right way to think about the problem. I have more specific comments in the script, but just wanted to let you know this is a great start. Thanks! On Wed, Nov 27, 2013 at 7:42 AM, James <purpleidea at gmail.com> wrote:> Hi, > > This is along the lines of "tools for sysadmins". I plan on using > these algorithms for puppet-gluster, but will try to maintain them > separately as a standalone tool. > > The problem: Given a set of bricks and servers, if they have a logical > naming convention, can an algorithm decide the ideal order. This could > allow parameters such as replica count, and > chained=true/false/offset#. > > The second problem: Given a set of bricks in a volume, if someone adds > X bricks and removes Y bricks, is this valid, and what is the valid > sequence of add/remove brick commands. > > I've written some code with test cases to try and figure this all out. > I've left out a lot of corner cases, but the boilerplate is there to > make it happen. Hopefully it's self explanatory. (gluster.py) Read and > run it. > > Once this all works, the puppet-gluster use case is magic. It will be > able to take care of these operations for you (if you want). > > For non puppet users, this will give admins the confidence to know > what commands they should _probably_ run in what order. I say probably > because we assume that if there's an error, they'll stop and inspect > first. > > I haven't yet tried to implement the chained cases, or anything > involving striping. There are also some corner cases with some of the > current code. Once you add chaining and striping, etc, I realized it > was time to step back and ask for help :) > > I hope this all makes sense. Comments, code, test cases are appreciated! > > Cheers, > > James > @purpleidea (irc/twitter) > https://ttboj.wordpress.com/ > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20131211/7b45e67c/attachment.html>
Jeff Darcy
2014-Feb-12 17:24 UTC
[Gluster-users] [Gluster-devel] Mechanisms for automatic management of Gluster
> This is along the lines of "tools for sysadmins". I plan on using > these algorithms for puppet-gluster, but will try to maintain them > separately as a standalone tool. > > The problem: Given a set of bricks and servers, if they have a logical > naming convention, can an algorithm decide the ideal order. This could > allow parameters such as replica count, and > chained=true/false/offset#. > > The second problem: Given a set of bricks in a volume, if someone adds > X bricks and removes Y bricks, is this valid, and what is the valid > sequence of add/remove brick commands. > > I've written some code with test cases to try and figure this all out. > I've left out a lot of corner cases, but the boilerplate is there to > make it happen. Hopefully it's self explanatory. (gluster.py) Read and > run it. > > Once this all works, the puppet-gluster use case is magic. It will be > able to take care of these operations for you (if you want). > > For non puppet users, this will give admins the confidence to know > what commands they should _probably_ run in what order. I say probably > because we assume that if there's an error, they'll stop and inspect > first. > > I haven't yet tried to implement the chained cases, or anything > involving striping. There are also some corner cases with some of the > current code. Once you add chaining and striping, etc, I realized it > was time to step back and ask for help :) > > I hope this all makes sense. Comments, code, test cases are appreciated!It's a good start. For the chained case, you'd probably want to start with something like this: # Convert the input into a "list of lists" like this: # [ # [ 'host1', [ 'path1', 'path2', ... ], # [ 'host2', [ 'path1', 'path2', ... ], # ... # ] out_list = [] while in_list: first_host = in_list.pop() first_path = first_host[1].pop() # If there are any bricks left on this host, move the host to # the end so the next iteration will start with the next host. # Otherwise, we've used all bricks from this host so discard. if first_host[1]: in_list.append(first_host) second_host = in_list[0] second_host = second_host[1].pop() # Have we exhausted this host as well? if not second_host[1]: del in_list[0] out_list.append({'host':first_host[0],'path',first_path}) out_list.append({'host':second_host[0],'path',second_path}) return out_list (I haven't actually run this. It's merely illustrative of the algorithm.) Can you spot the bug? If one host has more bricks than the others, it might run out of bricks on other hosts to pair with, so it'll end up pairing with itself. For example, consider the following input: H1P1, H1P2, H1P3, H1P4, H2P1, H2P2, H3P1, H3P2 This algorithm would yield the following replica pairs. H1P1 + H2P1 H2P2 + H3P1 H3P2 + H1P2 H1P3 + H1P4 (oops) Instead, we need to find this: H1P1 + H2P1 H2P2 + H1P2 H1P3 + H3P1 H3P2 + H1P4 I would actually not try to deal with this in the loop above. Why not? Because that loop's already going to get a bit hairy when it's enhanced to handle replica counts greater than two. Instead, I would deal with the imbalance cases *up front* - check the number of bricks for each host, then equalize them e.g. by splitting a host with many bricks into two virtual hosts separated by enough others that they'll never pair with one another. Alternatively, one could do a recursive implementation, roughly like this: if less than rep_factor hosts left, fail pick rep_factor bricks from different hosts loop: pass remainder to recursive call if result is valid, combine and return pick a *different* rep_factor bricks from different hosts That will generate *some* valid order if any exists, but it will tend toward sub-optimal orders where e.g. all of X's bricks are paired with all of Y's instead of being spread around. There might be some sort of "optimization" pass we could do that would swap replica-set members to address this, but I'm sure you can see it's already becoming a hard problem. I'd have to code up full versions of both algorithms and run them on many different inputs to say with any confidence which is better.