Hi,

I'm trying to design a solution for load balancing the puppet master.

I have two nodes, and the idea is to connect them with DRBD, put a
cluster-aware filesystem on top (OCFS2 or GFS2), and just link
/etc/puppet and /var/lib/puppet on both nodes to the cluster FS.

Access to the masters would be load balanced by round-robin DNS.

Would this work? Is there any problem with both masters using the same
directories? Any possibility of data corruption or potential race
conditions?
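The round-robin part would just be two A records for the same name in
our zone, something like this (addresses and hostnames are
placeholders):

    ; BIND zone file: agents resolve "puppet" to both masters
    puppet   IN  A   192.0.2.11   ; puppet-node1
    puppet   IN  A   192.0.2.12   ; puppet-node2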
Load balancing will work without issue; there is a good suggestion on
using round-robin DNS in Chapter 4 of Pro Puppet, page 116.

On Thursday, July 18, 2013 1:00:34 PM UTC-4, Jakov Sosic wrote:
> I'm trying to design a solution for load balancing the puppet master.
> [...]
> Access to the masters would be load balanced by round-robin DNS.
> Would this work?
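With round-robin DNS the agents need nothing special; they all just
point at the shared name, e.g. in puppet.conf (the name here is only an
example):

    # /etc/puppet/puppet.conf on every agent
    [agent]
        server = puppet.example.com   # resolves to either master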
On 07/18/2013 08:29 PM, GregC wrote:
> Load balancing will work without issue; there is a good suggestion on
> using round-robin DNS in Chapter 4 of Pro Puppet, page 116.

Yeah, but I am a little suspicious about two masters sharing
/var/lib/puppet... could that cause trouble?

-- 
Jakov Sosic
www.srce.unizg.hr
On Thu, Jul 18, 2013 at 1:53 PM, Jakov Sosic <jsosic@srce.hr> wrote:
> Yeah, but I am a little suspicious about two masters sharing
> /var/lib/puppet... could that cause trouble?

We have puppetmasters that share /var/lib/puppet/ssl via NFS. This
works great. Just make sure you do not have 2 active CA servers at the
same time. The only thing the puppetmasters share under /var/lib/puppet
is the SSL data. We also share /etc/puppet/environments for the code.
We use this setup to scale to tens of thousands of puppet clients.
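In case it helps, the relevant bits look roughly like this (hostnames
and export paths are stand-ins for our real ones):

    # /etc/fstab on each puppetmaster: shared SSL data and code
    nfs01:/export/puppet/ssl           /var/lib/puppet/ssl       nfs  defaults  0 0
    nfs01:/export/puppet/environments  /etc/puppet/environments  nfs  defaults  0 0

    # puppet.conf on every master EXCEPT the single active CA
    [master]
        ca = false

    # puppet.conf on the agents: send all CA traffic to the active CA
    [agent]
        ca_server = puppetca.example.com

HTH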
On Jul 18, 2013 10:00 AM, "Jakov Sosic" <jsosic@srce.hr> wrote:
> I have two nodes, and the idea is to connect them with DRBD, put a
> cluster-aware filesystem on top (OCFS2 or GFS2), and just link
> /etc/puppet and /var/lib/puppet on both nodes to the cluster FS.
> [...]
> Would this work? Is there any problem with both masters using the same
> directories? Any possibility of data corruption or potential race
> conditions?

This would be risky if not disastrous. I am wary of anything that might
depend on file locking on shared filesystems, unless it is
well supported by the vendor.

You should be able to share /etc/puppet, as this should be read-only
for the master process (you might have a function that generates or
writes files there, but that would be unusual). But you could more
simply maintain it with SCM checkouts, unless you do have something
generating files there and need precise consistency.

For /var, however, you might divide the data into subsets, some that
you could safely share and some definitely not:

- Reports processed with "store" are named with timestamps and node
  names, so that might be OK. And a single report, stored in a single
  file, is unlikely to be a big deal if lost.
- Reports processed with "rrdgraph" seem like a big risk.
- The CA store seems highly vulnerable to race conditions, unless you
  have such a low rate of node provisioning that you can guarantee
  serial access -- but you probably would not need an HA setup in that
  case.
- The filebucket I would expect to be risky -- there seems to be a high
  probability of attempted concurrent writes of the same file.
- The other data is specific to the local node agent and master, and
  you would not want to share it in any case.

You might consider an active/passive setup with a front-end load
balancer, where each of the above data subsets is effectively read-only
for the passive server. You could distribute the load by taking
advantage of the ability to configure the various master roles
(fileserver, catalog, inventory, filebucket, CA, etc.) with different
hostnames and ports. There would still be a risk of corruption in a
split-brain situation, but that's often (always?) a danger with
shared-storage filesystems.
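To be concrete about splitting the roles: the knobs I have in mind are
agent-side settings, something like this (hostnames invented, not
tested):

    # puppet.conf on the agents
    [agent]
        server        = catalog.example.com   # catalog + fileserver requests
        ca_server     = ca.example.com        # certificate requests
        ca_port       = 8140
        report_server = reports.example.com   # report submission
        report_port   = 8140

Wil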
On 07/19/2013 09:16 AM, Wil Cooley wrote:
> - Reports processed with "rrdgraph" seem like a big risk.

OK.

> - The CA store seems highly vulnerable to race conditions, unless you
>   have such a low rate of node provisioning that you can guarantee
>   serial access -- but you probably would not need an HA setup in that
>   case.

I thought the only problem in this case would be two nodes
simultaneously sending certificate requests, which could cause the
certs to get the same serial -- but couldn't that be solved just by
issuing a revocation for that serial?

> - The filebucket I would expect to be risky -- there seems to be a
>   high probability of attempted concurrent writes of the same file.

Why? If one client connects only to one master per run?

> You might consider an active/passive setup with a front-end load
> balancer [...]

We don't have such a high-volume environment, but we do have two
machines at our disposal. So why not set up LB instead of simple HA...

I'm still considering solutions, although one of the easiest to set up
is simple HA through RHEL Cluster, with failover/failback in case of a
primary node failure.
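Regarding the CA race, what I had in mind is roughly this on whichever
node holds the CA (certname invented; I have not actually tried it):

    # revoke and remove the colliding cert, then sign the node's
    # freshly re-submitted request
    puppet cert clean node01.example.com
    puppet cert sign node01.example.com

-- 
Jakov Sosic
www.srce.unizg.hr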
On Jul 19, 2013 11:34 AM, "Jakov Sosic" <jsosic@srce.hr> wrote:
> I thought the only problem in this case would be two nodes
> simultaneously sending certificate requests, which could cause the
> certs to get the same serial -- but couldn't that be solved just by
> issuing a revocation for that serial?

Assuming that file is updated safely -- that is, copy to temp, modify,
rename -- then that might be OK. The agent no doubt takes care to
update file resources that way, but it's extra work, and I wouldn't
assume that other parts which were not intended for concurrent access
do the same. You'd want to test carefully or read the code, at least.

> > - The filebucket I would expect to be risky [...]
>
> Why? If one client connects only to one master per run?

Often one file is distributed to many clients, and when that file is
changed there is a fairly narrow window of time in which most of those
clients will update; most of them have the same old file, with the same
checksum, so there is a high probability of concurrent writes.

> We don't have such a high-volume environment, but we do have two
> machines at our disposal. So why not set up LB instead of simple HA...

*shrug* Complexity like that tends to fail in the most surprising of
ways. More than once I've seen active/active redundant systems fail
worse and more frequently than non-redundant or active/passive systems.
(OTOH, secondary systems that are not used have a way of being
overlooked and of not being there when you need them.)

> I'm still considering solutions, although one of the easiest to set up
> is simple HA through RHEL Cluster, with failover/failback in case of a
> primary node failure.

That would probably be safest and easiest. I have often lamented that
HA cluster systems don't seem to support two nodes that are
"differently" active (2 VIPs for 2 DNS servers, for example). Or at
least, I've not found clear and obvious docs supporting that.
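Going back to the CA file, this is the safe-update pattern I mean,
using the CA's serial inventory as an example (path and line format
from memory; note that this only keeps readers from seeing a
half-written file -- it does not serialize two concurrent writers):

    # copy to temp, modify, rename; rename() is atomic on a local fs
    inv=/var/lib/puppet/ssl/ca/inventory.txt
    tmp=$(mktemp "${inv}.XXXXXX")
    cp "$inv" "$tmp"
    echo '0x0042 2013-07-19T00:00:00 2018-07-19T00:00:00 /CN=node01.example.com' >> "$tmp"
    mv "$tmp" "$inv"

Wil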
On 07/20/2013 06:31 AM, Wil Cooley wrote:
> That would probably be safest and easiest. I have often lamented that
> HA cluster systems don't seem to support two nodes that are
> "differently" active (2 VIPs for 2 DNS servers, for example). Or at
> least, I've not found clear and obvious docs supporting that.

That is certainly possible. I have set up one HA cluster with two named
(bind) daemons, each in its own ha-service with its own IP. Each can
run on node1 or node2 of the cluster. The second bind is set up to do
zone transfers from the first.

I've also set up a mysql master and a mysql slave, both as HA services
in the same cluster. The master runs on node1 and the slave on node2 as
their respective primaries. If node2 fails, the slave fails over to
node1, and vice versa. Both servers share FC storage.

But I don't know how the puppet internals work in detail (like the CA
and things like that), and that's why I don't just jump in the fire and
worry later... It seems I'll be setting up a simple ha-service
(failover/failback) and forgetting about LB for the time being :)
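For reference, the cluster.conf skeleton for the two-named setup looked
something like this (from memory and heavily trimmed; note that each
named needs its own init script and config so that both services can
land on the same surviving node):

    <rm>
      <failoverdomains>
        <failoverdomain name="prefer-node1" ordered="1" restricted="0">
          <failoverdomainnode name="node1" priority="1"/>
          <failoverdomainnode name="node2" priority="2"/>
        </failoverdomain>
        <failoverdomain name="prefer-node2" ordered="1" restricted="0">
          <failoverdomainnode name="node1" priority="2"/>
          <failoverdomainnode name="node2" priority="1"/>
        </failoverdomain>
      </failoverdomains>
      <service name="dns1" domain="prefer-node1" autostart="1" recovery="relocate">
        <ip address="192.0.2.53" monitor_link="1"/>
        <script name="named1" file="/etc/init.d/named1"/>
      </service>
      <service name="dns2" domain="prefer-node2" autostart="1" recovery="relocate">
        <ip address="192.0.2.54" monitor_link="1"/>
        <script name="named2" file="/etc/init.d/named2"/>
      </service>
    </rm>

-- 
Jakov Sosic
www.srce.unizg.hr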