Hi everyone,

Just wanted to tell you a little story. We've been enthusiastic puppet users for about a year now, here at the Geographic Institute of the University of Zürich.

But we won't use the zpool type ever again. It's just not worth it. Here's what happened:

. one of our servers lost knowledge about one of its zfs pools
. puppet didn't find the pool and .. went on to zpool create it
. we did indeed have a backup, but would have lost all data if not

Creating zpools is a manual thing in every case, since one has to know the devices participating. The names of these tend to differ a little from one server to the next.

Add that to the possibility of major data loss (like we just experienced) and you get a negative yield for the 'zpool' type.

Hoping to inspire a few..
kaspar
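For readers who have never used the type, the resource at the heart of this story looks roughly like the sketch below - the pool name and device paths here are invented, and the parameter names follow the Solaris provider linked later in this thread:

    zpool { 'data':
      ensure => present,
      mirror => ['c0t2d0 c0t3d0'],   # invented device paths; these differ per server
    }

If the agent cannot see a pool named 'data' - for whatever reason - it treats the resource as absent and runs zpool create against exactly those devices.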
Hi

> But we won't use the zpool type ever again. It's just not worth it.
> Here's what happened:
>
> . one of our servers lost knowledge about one of its zfs pools
> . puppet didn't find the pool and .. went on to zpool create it
> . we did indeed have a backup, but would have lost all data if not
>
> Creating zpools is a manual thing in every case, since one has to
> know the devices participating. The names of these tend to differ a
> little from one server to the next.
>
> Add that to the possibility of major data loss (like we just
> experienced) and you get a negative yield for the 'zpool' type.

There was recently a similar discussion about this on the puppet-dev list, concerning the newly available fs and lvm types. These are indeed very dangerous operations, and it should somehow be possible to lock them. The problem is clearly that if puppet fails to determine the correct state, it tries to transfer the system into the desired state, which can (obviously) have - ehhh - "nasty side-effects" with such operations.

Maybe you can catch up on that discussion and give your thoughts on how puppet should behave and how it would be possible to "lock" such operations.

cheers pete
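Until such a lock exists, one blunt stand-in that already works is the noop metaparameter - a sketch, reusing the invented names from the example above, which makes puppet report drift on the pool without ever acting on it:

    zpool { 'data':
      ensure => present,
      mirror => ['c0t2d0 c0t3d0'],   # invented device paths
      noop   => true,                # log what would be done, never run zpool create
    }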
Peter Meier wrote:
> Hi
>
>> But we won't use the zpool type ever again. It's just not worth it.
>> Here's what happened:
>>
>> . one of our servers lost knowledge about one of its zfs pools
>> . puppet didn't find the pool and .. went on to zpool create it
>> . we did indeed have a backup, but would have lost all data if not
>
> There was recently a similar discussion about this on the puppet-dev
> list, concerning the newly available fs and lvm types. These are indeed
> very dangerous operations, and it should somehow be possible to lock
> them.
>
> Maybe you can catch up on that discussion and give your thoughts on
> how puppet should behave and how it would be possible to "lock" such
> operations.

I see this as being distinctly part of the provisioning portion of a server life-cycle. I haven't looked at the discussion on -dev, but I'm not sure these types really belong in core puppet. They're not unix-agnostic resources, for one (has that fundamental bit of philosophy changed?), and they're unlikely to change in a way that you want puppet to 'correct'. That being said, some people have 'bootstrap' envs, which would be a better place for these destructive resources than your production environment.

--
Joe McDonagh
AIM: YoosingYoonickz
IRC: joe-mac on freenode
L'ennui est contre-révolutionnaire
On Tue, Apr 6, 2010 at 10:52 AM, Kaspar Schiess <eule@space.ch> wrote:
> Creating zpools is a manual thing in every case, since one has to know
> the devices participating. The names of these tend to differ a little
> from one server to the next.

Wow. I've got a fairly large Solaris 10 server deployment and I've never used the zpool type, out of F.U.D. I've often thought I should, but at the same time thought I'd have to dig into the source to see what it would actually do.

So I'm surprised by the result - could you describe the situation a bit more? Every time I've used the zpool create command on a block device that was/is a member of a zpool, the command complains loudly that this is a potentially dangerous and destructive operation. If I'm sure, I then force the operation with zpool create -f.

Does the zpool type just blindly force creation and ignore these warnings?

In addition, I totally agree about the complexities surrounding zpool creation. I run with MPxIO, and the device path names change when the system first boots after MPxIO is activated. With the complexity of dealing with two device paths before puppet configures MPxIO and one totally new name after, I've just always dealt with it semi-manually, with some helper scripts that look at "before and after" outputs of format </dev/null and compare them to determine the actual path I'm interested in.

So I don't think this is an intractable problem; I think the type just needs to be smarter and follow the expectations people have about zpool create versus zpool create -f.

Is anyone interested in diving into it a bit more with me?

-Jeff
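Jeff's helper-script idea reduces to something like the sketch below; the file paths are invented and the nawk pattern is only a guess at the shape of the format output, but the before/after diff is the point:

    # before enabling MPxIO
    format </dev/null | nawk '/^ *[0-9]+\./ {print $2}' > /var/tmp/disks.before
    # ...enable MPxIO and reboot...
    format </dev/null | nawk '/^ *[0-9]+\./ {print $2}' > /var/tmp/disks.after
    diff /var/tmp/disks.before /var/tmp/disks.after   # changed lines are the new paths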
Hi

> So I'm surprised by the result - could you describe the situation a bit
> more? Every time I've used the zpool create command on a block device
> that was/is a member of a zpool, the command complains loudly that this
> is a potentially dangerous and destructive operation.
>
> If I'm sure, I then force the operation with zpool create -f.
>
> Does the zpool type just blindly force creation and ignore these warnings?

As far as I understood it, the zpool information was lost, hence puppet thought that there was no zpool anymore. I assume this means that the zpool tools didn't know about it anymore either, but it might have been recoverable with manual interaction. And looking at the code [1], it doesn't look like it uses -f.

> In addition, I totally agree about the complexities surrounding zpool
> creation. I run with MPxIO, and the device path names change when the
> system first boots after MPxIO is activated.

In general I see disk allocation as part of provisioning, which shouldn't go into puppet. But that's how I separate things, and others might have reasons to do it differently.

> So I don't think this is an intractable problem; I think the type just
> needs to be smarter and follow the expectations people have about
> zpool create versus zpool create -f.

It's a bit like the discussion in [2]: how can puppet determine that there was once something on a device, and that it shouldn't take the dangerous action, if the tools don't reveal it?

cheers pete

[1] http://github.com/reductivelabs/puppet/blob/master/lib/puppet/provider/zpool/solaris.rb
[2] http://groups.google.com/group/puppet-dev/browse_thread/thread/195faece1199ef88#d34b2b17b7bdac17
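In other words, the provider ends up building a plain zpool create command line, roughly as below (device names invented again); without -f, zpool's own label check is the only remaining safety net:

    zpool create data mirror c0t2d0 c0t3d0
    # if the disks still carry pool labels, zpool refuses with the
    # "is part of an exported or potentially active ZFS pool" error
    # that Martin quotes later in this thread; only -f would override it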
----- "Peter Meier" <peter.meier@immerda.ch> a écrit : | | as far as I understood was that the zpool information was lost, hence | puppet thought that there was no zpool anymore. I assume that this | means | that zpool-tools didn''t know about that anymore either, but it might | have been recoverable with manual interaction. Having faced a serious ZFS crash I dug into ZFS internals & design some time ago so I''ll permit to drop some thoughs here : the zpool metadata information may be corrupted and pool not to show up. This seems quite bad but there are now tools to rollback over uberblocks (that contain the core data of your pool). Loosing a pool forever is highly unlikely | and looking at the code [1] it doesn''t look like it uses -f. I do agree this | in general I see disk allocation as part of provisioning, which | shouldn''t go into puppet. but that''s how I seperate things and others | might have reasons to do differently. definitively a +1 Nico. -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
Hi

> As far as I understood it, the zpool information was lost, hence puppet
> thought that there was no zpool anymore. I assume this means that the
> zpool tools didn't know about it anymore either, but it might have been
> recoverable with manual interaction.

That's what happened. Actually, it wasn't lost so much as I removed it manually in failsafe mode (rm /etc/zfs/zpool.cache) to be able to boot into normal mode again.

It is correct that zfs normally won't allow you to recreate a zpool (it issues a warning about the device already being part of a zpool). Only, when your OS doesn't know about the pool anymore, you don't want puppet to create it on the next boot - you will want to recover it. But I guess I've driven that home.

So to reproduce this (for whatever it's worth), try deleting /etc/zfs/zpool.cache.

>> In addition, I totally agree about the complexities surrounding zpool
>> creation.

If zpool could create complex setups for me without knowing the device names beforehand, it would really be useful for provisioning. As it stands, it's just plain dangerous.

I'll try to see if I can add to the other thread you mention.

Greetings,
kaspar
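For the record, the reproduction amounts to the following - destructive, so only try it on a scratch machine (pool name invented):

    rm /etc/zfs/zpool.cache   # in failsafe mode, as described above
    reboot
    zpool list                # the data pool no longer shows up
    # the next puppet run sees the zpool resource as absent
    # and attempts zpool create on its devices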
On 4/7/2010 8:31 AM, Kaspar Schiess wrote:
> That's what happened. Actually, it wasn't lost so much as I removed it
> manually in failsafe mode (rm /etc/zfs/zpool.cache) to be able to boot
> into normal mode again.
>
> It is correct that zfs normally won't allow you to recreate a zpool
> (it issues a warning about the device already being part of a zpool).
> Only, when your OS doesn't know about the pool anymore, you don't want
> puppet to create it on the next boot - you will want to recover it.
> But I guess I've driven that home.

Use "puppetd --disable" the next time to keep your tools from stampeding over your manual recovery efforts.

Best Regards, David

--
dasz.at OG          Tel: +43 (0)664 2602670     Web: http://dasz.at
Klosterneuburg      UID: ATU64260999
FB-Nr.: FN 309285 g FB-Gericht: LG Korneuburg
> > Use "puppetd --disable" the next time to keep your tools from stampeding > over your manual recovery efforts.I am not sure I understand - I could only boot into failsafe mode at the time. And the first real boot came up with puppetd running first thing. I can''t think of anything to stop that, short of disabling the master. And I guess there will be no such next time, anyway ;) kaspar -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
On 4/7/2010 10:44 AM, Kaspar Schiess wrote:
>> Use "puppetd --disable" the next time to keep your tools from
>> stampeding over your manual recovery efforts.
>
> I am not sure I understand - I could only boot into failsafe mode at the
> time, and the first real boot came up with puppetd running first thing.
> I can't think of anything that would have stopped that, short of
> disabling the master.

Obviously I have no clue what failsafe mode under Solaris means, but "puppetd --disable" creates a lock file that keeps the daemon from starting puppet runs. It should be used whenever temporary local manual changes are made that potentially conflict with puppet's manifests.

Another alternative would be to disable the local puppet service altogether. I would be much surprised if there were any "failsafe" mode which forbids changes to services.

Best Regards, David

--
dasz.at OG          Tel: +43 (0)664 2602670     Web: http://dasz.at
Klosterneuburg      UID: ATU64260999
FB-Nr.: FN 309285 g FB-Gericht: LG Korneuburg
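Concretely, the workflow David suggests is the following; both flags exist in puppetd, while the exact service name in the last line is a guess, since the FMRI depends on how puppet was packaged:

    puppetd --disable        # writes a lock file; scheduled runs are skipped
    # ...do the manual recovery, e.g. zpool import...
    puppetd --enable         # removes the lock; normal runs resume

    svcadm disable puppetd   # alternative: stop the daemon outright (FMRI varies)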
Kaspar,

On Apr 7, 10:44 am, Kaspar Schiess <e...@space.ch> wrote:
>> Use "puppetd --disable" the next time to keep your tools from stampeding
>> over your manual recovery efforts.
>
> I am not sure I understand - I could only boot into failsafe mode at the
> time, and the first real boot came up with puppetd running first thing.
> I can't think of anything that would have stopped that, short of
> disabling the master.

You should have booted with milestone=none after the failsafe boot instead; then you have time to fix things before the services start.

cheers,
/Martin
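Spelled out (the SPARC boot prompt form is shown; on x86 the same option goes onto the kernel boot line):

    ok boot -m milestone=none     # only init and SMF itself come up, no services
    # ...recover the pool, disable puppetd or create its lock file...
    # then continue the boot to the normal milestone:
    svcadm milestone all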
Kaspar,

On Apr 7, 8:31 am, Kaspar Schiess <e...@space.ch> wrote:
> It is correct that zfs normally won't allow you to recreate a zpool
> (it issues a warning about the device already being part of a zpool).
> Only, when your OS doesn't know about the pool anymore, you don't want
> puppet to create it on the next boot - you will want to recover it.
> But I guess I've driven that home.

When I commissioned Andrew to create the zpool type, we tried to make it as failsafe as possible (e.g. not using "-f", aborting when unsure, etc.). This was one of the cases we didn't think of :) However, if you do this kind of operation, you should realize that the system will come back up without all of its previous zpools (and zfs file systems).

> So to reproduce this (for whatever it's worth), try deleting
> /etc/zfs/zpool.cache.

When I remove /etc/zfs/zpool.cache (in failsafe mode) and then reboot, the system comes back up without the data pool. This triggers zpool create to run, but it fails with: "<disk> is part of an exported or potentially active ZFS pool <pool>", and since that failed, all my zfs types depending on that pool failed too - no harm done.

So I don't get how you could have lost your pool, as zpool will refuse to overwrite an existing pool without "-f". All you would have had to do was run "zpool import <pool>" and you'd have been back to normal.

> If zpool could create complex setups for me without knowing the device
> names beforehand, it would really be useful for provisioning. As it
> stands, it's just plain dangerous.

For me it depends. I just deployed 40 identical systems; they all have 4 disks, two of which are used for the root pool (to boot from) and two as a data pool. I prefer to do the data pool creation in puppet over doing it in Jumpstart, as it allows me to control more features of the zpool.

cheers,
/Martin
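On genuinely identical hardware, Martin's data pool case might be written like the sketch below - the device names are invented and the dataset name is hypothetical; as he notes, if zpool create is refused, everything depending on the pool fails instead of doing damage:

    zpool { 'data':
      ensure => present,
      mirror => ['c1t2d0 c1t3d0'],   # identical device names on all 40 nodes
    }

    zfs { 'data/export':
      ensure  => present,
      require => Zpool['data'],      # fails safely if pool creation was refused
    }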
Joe McDonagh wrote:

[I'm re-arranging what Joe said a bit so I can keep replies to related issues together.]

> They're not unix-agnostic resources, for one (has that fundamental bit
> of philosophy changed?), and they're unlikely to change in a way that
> you want puppet to 'correct'.

Puppet has several types that are specific to some Unix(-clone) variant. There are three types for MacOS X, the 'yumrepo' type is for a subset of Linux distributions, and there are 'zone', 'zfs' and 'zpool' for Solaris. And there is talk about creating 'iptables' and 'lvm' types, which are Linux-specific. It's nice if types *can* be made generic, but I don't think Luke has any rule that core types must be so, nor do I think there needs to be such a rule.

> I see this as being distinctly part of the provisioning portion of a
> server life-cycle. I haven't looked at the discussion on -dev, but I'm
> not sure these types really belong in core puppet.

I have never quite understood the distinction between provisioning and other systems administration. Why is creating a file system provisioning, but installing a package not?

> That being said, some people have 'bootstrap' envs, which would be a
> better place for these destructive resources than your production
> environment.

What is potentially destructive is not so much *types* as specific resources. I can easily wreak havoc with an ill-advised file resource, or with an augeas stanza that isn't cautious enough. Likewise, re-creating a filesystem automatically doesn't *have* to be dangerous. I would not hesitate much to let Puppet automatically re-create filesystems containing pure caches, or the OS installation (which is just a short kickstart installation away from being re-created).

/Bellman
On 7 April 2010 11:56, Thomas Bellman <bellman@nsc.liu.se> wrote:
> I have never quite understood the distinction between provisioning and
> other systems administration. Why is creating a file system provisioning,
> but installing a package not?

Some things are usually - or best - done at install time. IMHO, that's provisioning. I've done some work with disk / volume management in Puppet (with LVM on Linux) in the past, and that was to work around severe limitations in the provisioning system we had. I'd never recommend doing this again; it was the wrong tool for the job.

--
Gary Law
Kaspar Schiess, 2010-Apr-07 12:59 UTC:
> So I don't get how you could have lost your pool, as zpool will refuse
> to overwrite an existing pool without "-f". All you would have had to
> do was run "zpool import <pool>" and you'd have been back to normal.

To be perfectly honest with you, I am a bit in the dark about that as well. I've done the same experiment in the meantime - with no success. I guess the pool was really messed up.

Note that I am not telling anyone not to use zpool - it just doesn't pay off in _my scenario_ anymore. We use large zpools (apart from the root pool) on the big data machines only - and I don't mind doing those manually.

Sure, I could have thought of not having puppet start on system start (as has been suggested elsewhere) - but I must admit that my primary concern at the time was fixing the server ASAP, not what puppet could do to me once the server was back. That's just something you don't think of there and then... Hence my post. Not wanting to step on anyone's toes.

best regards,
kaspar