Hi,

We've been having some internal discussions about the best way to handle certain cases, and I thought I'd turn to the list to solicit opinions on how other people have solved this issue (or don't, as the case may be). The issue is that we would like our modules to, where possible, check for the existence of certain on-disk data when installing a service for the first time and retrieve it from somewhere if it's not available.

As an example of the kind of thing we're talking about, we use a product called Sonatype Nexus that relies on a bunch of on-disk data in /srv/sonatype-nexus/. When installing the system for the first time (for example, when the file{} containing the .war triggers) we would like it to automatically put down a copy of /srv/sonatype-nexus/. We obviously don't want this drifting out of sync with the production data, which is where the issue is. How do other people handle this?

Our options seem to be:

* Nightly/hourly backups of production data to some location where Puppet can rsync/wget/shovel it out when needed.
* Some kind of process that real-time syncs directories to NFS storage.
* Erroring in some fashion if the data is missing when Puppet runs, and relying on sysadmins to put it in place.

We've talked through the options, but they all have fairly significant drawbacks. My personal favorite solution would be some kind of daemon that syncs data constantly and is capable of intelligently syncing the data back to the node if it goes missing. It could be potentially error prone, but it represents the least bad choice. That, combined with regular backups, would seem ideal, but I can't find anything out there that does this without significant work/investment. I debated just rsyncing every 15 minutes, but that's not great either.

Thanks,

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
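For the first option in the list above — pulling a seed copy down only when the data directory is absent — a minimal Puppet sketch might look like the following. The backup host, rsync module, sentinel file, and service name are all assumptions for illustration, not anything from an actual setup:

```puppet
# Seed /srv/sonatype-nexus/ from a backup host, but only when the data
# is missing.  backups.example.com, the 'nexus' rsync module, the
# sentinel file, and the service name are hypothetical.
exec { 'seed-nexus-data':
  command => '/usr/bin/rsync -a rsync://backups.example.com/nexus/ /srv/sonatype-nexus/',
  creates => '/srv/sonatype-nexus/nexus.properties',
  timeout => 3600,
  before  => Service['nexus'],
}
```

The `creates` guard means the rsync only fires on first install; on subsequent runs the sentinel file exists, the exec is a no-op, and the live data is never clobbered by a stale backup.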
On Nov 23, 2010, at 6:45 AM, Ashley Penney wrote:

> Hi,
>
> We've been having some internal discussions about the best way to handle
> certain cases and I thought I'd turn to the list to solicit opinions on how
> other people have solved this issue (or don't, as the case may be). The
> issue is that we would like our modules to, where possible, check for the
> existence of certain on-disk data when installing a service for the first
> time and retrieve it from somewhere if it's not available.
>
> As an example of the kind of thing we're talking about, we use a product
> called Sonatype Nexus that relies on a bunch of on-disk data in
> /srv/sonatype-nexus/. When installing the system for the first time (for
> example, when the file{} containing the .war triggers) we would like it to
> automatically put down a copy of /srv/sonatype-nexus/. We obviously don't
> want this drifting out of sync with the production data, which is where
> the issue is. How do other people handle this?
>
> Our options seem to be:
>
> * Nightly/hourly backups of production data to some location where Puppet
>   can rsync/wget/shovel it out when needed.
> * Some kind of process that real-time syncs directories to NFS storage.
> * Erroring in some fashion if the data is missing when Puppet runs, and
>   relying on sysadmins to put it in place.
>
> We've talked through the options but they all have fairly significant
> drawbacks. My personal favorite solution would be some kind of daemon that
> syncs data constantly and is capable of intelligently syncing the data
> back to the node if it goes missing. It could be potentially error prone
> but it represents the least bad choice. That combined with regular backups
> would seem ideal, but I can't find anything out there that does this
> without significant work/investment. I debated just rsyncing every 15
> minutes, but that's not great either.

1) So are the Puppet clients (Nexus servers) supposed to be modifying the data in /srv/sonatype-nexus like a database, or is it used read-only like a file server?

2) If you change the "master copy" of the data, can you wipe the data and recopy on each client, or do you need to merge in changes?

3) How big is the biggest file in the data? What's the total size?
Ashley Penney <apenney@gmail.com> writes:

> We've been having some internal discussions about the best way to handle
> certain cases and I thought I'd turn to the list to solicit opinions on how
> other people have solved this issue (or don't, as the case may be). The
> issue is that we would like our modules to, where possible, check for the
> existence of certain on-disk data when installing a service for the first
> time and retrieve it from somewhere if it's not available.

Patrick asked some really good questions about this, but generally:

> As an example of the kind of thing we're talking about, we use a product
> called Sonatype Nexus that relies on a bunch of on-disk data in
> /srv/sonatype-nexus/. When installing the system for the first time (for
> example, when the file{} containing the .war triggers) we would like it to
> automatically put down a copy of /srv/sonatype-nexus/. We obviously don't
> want this drifting out of sync with the production data, which is where
> the issue is. How do other people handle this?

Package those data files yourself, if necessary including logic in the package to ensure that you don't overwrite valuable local changes. Then use puppet to ensure that package is either 'installed' or 'latest'.

> Our options seem to be:
>
> * Nightly/hourly backups of production data to some location where Puppet
>   can rsync/wget/shovel it out when needed.
> * Some kind of process that real-time syncs directories to NFS storage.
> * Erroring in some fashion if the data is missing when Puppet runs, and
>   relying on sysadmins to put it in place.

...or making it available as a puppet file server, and using puppet to put it in place.

> We've talked through the options but they all have fairly significant
> drawbacks. My personal favorite solution would be some kind of daemon that
> syncs data constantly and is capable of intelligently syncing the data
> back to the node if it goes missing. It could be potentially error prone
> but it represents the least bad choice.

You could potentially just use:

    file { "/example":
        source  => 'puppet:///module/example',
        replace => false,
    }

That will only put the file in place if it doesn't already exist.

Regards, Daniel
--
✣ Daniel Pittman ✉ daniel@rimspace.net ☎ +61 401 155 707
♽ made with 100 percent post-consumer electrons
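Extending that `replace => false` pattern from a single file to a whole data directory is possible with recursion, though (as comes up later in the thread) recursive file serving over the puppet file server can be slow for large trees. A sketch, with the module path, owner, and group all hypothetical:

```puppet
# Recursively copy seed data from the puppet file server, but never
# overwrite a file that already exists on the node.  The module path,
# owner, and group are assumptions for illustration.
file { '/srv/sonatype-nexus':
  ensure  => directory,
  source  => 'puppet:///modules/nexus/seed-data',
  recurse => true,
  replace => false,
  owner   => 'nexus',
  group   => 'nexus',
}
```

On first install the whole tree is populated; afterwards, files the application has modified are left alone because `replace => false` only manages files that are absent.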
On Tue, Nov 23, 2010 at 6:41 PM, Daniel Pittman <daniel@rimspace.net> wrote:

> Ashley Penney <apenney@gmail.com> writes:
>
>> As an example of the kind of thing we're talking about, we use a product
>> called Sonatype Nexus that relies on a bunch of on-disk data in
>> /srv/sonatype-nexus/. When installing the system for the first time (for
>> example, when the file{} containing the .war triggers) we would like it to
>> automatically put down a copy of /srv/sonatype-nexus/. We obviously don't
>> want this drifting out of sync with the production data, which is where
>> the issue is. How do other people handle this?
>
> Package those data files yourself, if necessary including logic in the
> package to ensure that you don't overwrite valuable local changes. Then
> use puppet to ensure that package is either 'installed' or 'latest'.

I suppose this is possible, but awkward. An example of another application is this horrible Java CMS that we use, which writes numerous XML files with random names all over the place during operation. There are cache directories, it constantly rewrites various bits of configuration XML, and it spews logs all over. Packaging something like that in a way that is functional is almost impossible. When we want to reinstall/clone that server we just copy the entire directory and then run Puppet to change a few key XML files. Something like that is difficult to package, and the files you would package change frequently due to patches and internal development on top of the CMS.

>> Our options seem to be:
>>
>> * Nightly/hourly backups of production data to some location where Puppet
>>   can rsync/wget/shovel it out when needed.
>> * Some kind of process that real-time syncs directories to NFS storage.
>> * Erroring in some fashion if the data is missing when Puppet runs, and
>>   relying on sysadmins to put it in place.
>
> ...or making it available as a puppet file server, and using puppet to put
> it in place.

In our experience that is almost unusable, speed-wise.

>> We've talked through the options but they all have fairly significant
>> drawbacks. My personal favorite solution would be some kind of daemon
>> that syncs data constantly and is capable of intelligently syncing the
>> data back to the node if it goes missing. It could be potentially error
>> prone but it represents the least bad choice.
>
> You could potentially just use:
>
>     file { "/example":
>         source  => 'puppet:///module/example',
>         replace => false,
>     }
>
> That will only put the file in place if it doesn't already exist.

Hmm, I always forget about replace => false. I wonder if it has the same awful speed penalties. I think my issue with this is still the hassle of constantly syncing the changing files back into Puppet. That's why I was looking for some kind of semi- or fully-automated syncing mechanism for something like this. It's mostly Java apps that are especially bad for this. Most open source software sticks data into a database, or at least a single easily-dealt-with directory. Java explodes all over the place like some kind of evil virus.
On Tue, Nov 23, 2010 at 2:30 PM, Patrick <kc7zzv@gmail.com> wrote:

> 1) So are the Puppet clients (Nexus servers) supposed to be modifying the
> data in /srv/sonatype-nexus like a database, or is it used read-only like
> a file server?

They modify data in that directory. I explained further in another email, but we have several Java apps that do similar things, constantly changing/adding XML files and all kinds of logs and other stuff while running.

> 2) If you change the "master copy" of the data, can you wipe the data and
> recopy on each client, or do you need to merge in changes?

I think in most cases it would require a re-merge. Generally speaking I'm only dealing with a single client using this data at a time, so my concern is more 'if the data is not there, automatically reprovision it with a copy as up to date as possible' rather than changing things. If there are specific configuration files I need to change within the data, then I handle that within Puppet like any other application.

> 3) How big is the biggest file in the data? What's the total size?

Hmm, I'm not logged in to check at the moment, but generally we're talking a maximum of 4G of data for one of these applications, and some closer to 1G.
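For data sets in that 1G–4G range, the "rsync every 15 minutes" fallback mentioned earlier can at least live inside Puppet as a cron resource, so the backup schedule is managed like everything else and a reasonably fresh seed copy is always available for reprovisioning. A sketch, with the destination host and rsync module purely hypothetical:

```puppet
# Push the live application data to a backup host every 15 minutes.
# backups.example.com and the 'nexus' rsync module are assumptions
# for illustration only.
cron { 'nexus-data-backup':
  command => '/usr/bin/rsync -a --delete /srv/sonatype-nexus/ rsync://backups.example.com/nexus/',
  user    => 'root',
  minute  => '*/15',
}
```

This is still the compromise described in the original post — up to 15 minutes of drift between node and backup — but it keeps the whole restore path declarative rather than relying on a sysadmin to remember where the data lives.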
Ashley Penney <apenney@gmail.com> writes:

> On Tue, Nov 23, 2010 at 6:41 PM, Daniel Pittman <daniel@rimspace.net> wrote:
>> Ashley Penney <apenney@gmail.com> writes:
>>
>>> As an example of the kind of thing we're talking about, we use a product
>>> called Sonatype Nexus that relies on a bunch of on-disk data in
>>> /srv/sonatype-nexus/. When installing the system for the first time (for
>>> example, when the file{} containing the .war triggers) we would like it
>>> to automatically put down a copy of /srv/sonatype-nexus/. We obviously
>>> don't want this drifting out of sync with the production data, which is
>>> where the issue is. How do other people handle this?
>>
>> Package those data files yourself, if necessary including logic in the
>> package to ensure that you don't overwrite valuable local changes. Then
>> use puppet to ensure that package is either 'installed' or 'latest'.
>
> I suppose this is possible, but awkward. An example of another application
> is this horrible Java CMS that we use, which writes numerous XML files
> with random names all over the place during operation.

Well, I agree that by the time you got as far as Java you had already lost. ;)

More seriously, I can understand the problem, and it is a royal PITA.

> There are cache directories, it constantly rewrites various bits of
> configuration XML, and it spews logs all over. Packaging something like
> that in a way that is functional is almost impossible. When we want to
> reinstall/clone that server we just copy the entire directory and then run
> Puppet to change a few key XML files. Something like that is difficult to
> package, and the files you would package change frequently due to patches
> and internal development on top of the CMS.

I would approach that, personally, by holding my nose and using something like Capistrano or another "deploy from a version control system" tool to do literally that: copy from a golden source into the target system, by hand. Then use puppet to manage the handful of configuration files that need customization, and have the "deployment" tool trigger a puppet run with '--test' on the target machine after installation.

Which is a bit nasty, and it would be nice if puppet could do it, but it sucks less than the alternatives, I think. (In other words, I think you identified the body of alternative processes well in your earlier post, although other wrappers around them might be nicer.)

Daniel
--
✣ Daniel Pittman ✉ daniel@rimspace.net ☎ +61 401 155 707
♽ made with 100 percent post-consumer electrons
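The split Daniel describes — a deploy tool lays down the golden copy of the tree, while puppet owns only the few files that need per-node customization — might look something like this in the manifest. Every path, template name, and service name here is hypothetical, for illustration only:

```puppet
# The deployment tool copies the whole CMS tree into /srv/cms; puppet
# then manages only the key configuration files within it.  All names
# here are assumptions for illustration.
file { '/srv/cms/conf/database.xml':
  content => template('cms/database.xml.erb'),
  owner   => 'cms',
  group   => 'cms',
  notify  => Service['cms'],
}

service { 'cms':
  ensure => running,
  enable => true,
}
```

Because puppet never touches the rest of the tree, the application is free to rewrite its cache and log files without fighting the configuration management, while the files that actually matter stay under control.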