Hello,

I've looked around Google and the zfs-discuss archives but have not been able to find a good answer to this question (and the related questions that follow it):

How well does ZFS handle unexpected power failures (e.g. environmental power failures, a power supply dying, etc.)?
Does it consistently recover gracefully?
Should having a UPS be considered a (strong) recommendation or a "don't even think about running without it" item?
Are there any communications/interfacing caveats to be aware of when choosing the UPS?

In this particular case, we're talking about a home file server running OpenSolaris 2009.06. Actual environmental power failures are generally < 1 per year. I know there are a few blog articles about this type of application, but I don't recall seeing any (or any detailed) discussion about power failures and UPSes as they relate to ZFS. I did see that the ZFS Evil Tuning Guide says cache flushes are done every 5 seconds.

Here is one post from about a year ago that didn't get any replies, written after someone had a power failure and then a UPS battery failure while copying data to a ZFS pool:
http://lists.macosforge.org/pipermail/zfs-discuss/2008-July/000670.html

Both theoretical answers and real-life experiences would be appreciated, as the former tells me where ZFS is needed while the latter tells me where it has been or is now.

Thanks,

-hk
I've seen enough people suffer from corrupted pools that a UPS is definitely good advice. However, I'm running a (very low usage) ZFS server at home and it's suffered through at least half a dozen power outages without any problems at all.

I do plan to buy a UPS as soon as I can, but it seems to be surviving very well so far.
--
This message posted from opensolaris.org
A related question: If you are on a UPS, is it OK to disable ZIL?

The evil tuning guide says "The ZIL is an essential part of ZFS and should never be disabled." However, if you have a UPS, what can go wrong that really requires ZIL?

Opinions?

Monish

----- Original Message -----
From: "Ross" <no-reply at opensolaris.org>
To: <zfs-discuss at opensolaris.org>
Sent: Tuesday, June 30, 2009 3:04 PM
Subject: Re: [zfs-discuss] ZFS, power failures, and UPSes

> I've seen enough people suffer from corrupted pools that a UPS is
> definitely good advice. However, I'm running a (very low usage) ZFS
> server at home and it's suffered through at least half a dozen power
> outages without any problems at all.
>
> I do plan to buy a UPS as soon as I can, but it seems to be surviving very
> well so far.
> --
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On Tue, 30 Jun 2009, Monish Shah wrote:
> The evil tuning guide says "The ZIL is an essential part of ZFS and should
> never be disabled." However, if you have a UPS, what can go wrong that
> really requires ZIL?

Without addressing a single ZFS-specific issue:

* panics
* crashes
* hardware failures
  - dead RAM
  - dead CPU
  - dead systemboard
  - dead something else
* natural disasters
* UPS failure
* UPS failure (must be said twice)
* Human error (what does this button do?)
* Cabling problems (say, where did my disks go?)
* Malicious actions (Fired? Let me turn their power off!)

That's just a warm-up; I'm sure people can add both the ZFS-specific reasons and also the fallacy that a UPS does anything more than mitigate one particular single point of failure.

Don't forget to buy two UPSes and split your machine across both. And don't forget to actually maintain the UPS. And check the batteries. And schedule a load test.

The single best way to learn about the joys of UPS behaviour is to sit down and have a drink with a facilities manager who has been doing the job for at least ten years. At least you'll hear some funny stories about the day a loose screw on one floor took out a house UPS and 100+ hosts and NEs with it.

Andre.

--
Andre van Eyssen.
mail: andre at purplecow.org
jabber: andre at interact.purplecow.org
purplecow.org: UNIX for the masses http://www2.purplecow.org
purplecow.org: PCOWpix http://pix.purplecow.org
Haudy Kazemi wrote:
> How well does ZFS handle unexpected power failures? (e.g.
> environmental power failures, power supply dying, etc.)
> Does it consistently gracefully recover?

Mostly. Unless you are unlucky. Backups are your friend in *any* environment though.

> Should having a UPS be considered a (strong) recommendation or a
> "don't even think about running without it" item?

There has been quite an interesting thread on this over the last few months. I won't repeat my comments, but they are there for digital posterity in the zfs-discuss archives. Certainly in a large environment with a lot of data being written, one should consider a UPS a mandatory requirement if you care about your data, particularly if there are many links in your storage chain that can cause data corruption on power failure.

> Are there any communications/interfacing caveats to be aware of when
> choosing the UPS?
>
> In this particular case, we're talking about a home file server
> running OpenSolaris 2009.06.

As far as a home server goes, particularly if it is not write intensive, you will 'most likely' be fine. I have a home one with a v120 running S10 u6 with a D1000 and 7 x 300 GB SCSI disks in a RAIDZ2 that has seen numerous power interruptions with no faults. This machine is a Samba server for my Macs and printing business. I also have a mail / web server on another v120 which experiences the same power faults and regularly bounces back without issues. But your mileage may vary. It all really depends on how much you care about the data.

I haven't used OpenSolaris specifically, however, as I prefer the generally better supported S10 releases. (Yes, I know you can get support for OS, but I tend to be conservative and standardize as much as possible. I do have millions of files stored on ZFS volumes for our Uni and I sleep well ;))

> Actual environment power failures are generally < 1 per year. I know
> there are a few blog articles about this type of application, but I
> don't recall seeing any (or any detailed) discussion about power
> failures and UPSes as they relate to ZFS. I did see that the ZFS Evil
> Tuning Guide says cache flushes are done every 5 seconds.

The flush time you mention is based on older versions of ZFS; newer ones can have a flush time as long as 30 seconds, I believe.
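A rough way to picture what the 5-second versus 30-second figure means for a power cut is to multiply it by the write rate. The following is only a back-of-the-envelope sketch: it assumes the quoted interval is the gap between commits to stable storage, that only asynchronous writes issued since the last commit are at risk, and it ignores write throttling and the time the commit itself takes. The function name and the 10 MiB/s write rate are made up for illustration.

```python
def worst_case_unsynced_bytes(write_rate_bytes_per_s: float,
                              commit_interval_s: float) -> float:
    """Upper bound on buffered data not yet committed when power is lost.

    Simplified model (an assumption, not ZFS internals): everything written
    since the previous commit is still only in memory.
    """
    if write_rate_bytes_per_s < 0 or commit_interval_s < 0:
        raise ValueError("rate and interval must be non-negative")
    return write_rate_bytes_per_s * commit_interval_s


MIB = 1024 * 1024

if __name__ == "__main__":
    # Hypothetical home server sustaining 10 MiB/s of asynchronous writes.
    for interval_s in (5, 30):
        at_risk = worst_case_unsynced_bytes(10 * MIB, interval_s)
        print(f"{interval_s} s interval: up to {at_risk / MIB:.0f} MiB not yet on disk")
```

For a lightly written home server the rate, and therefore the exposure, is far smaller, which fits the "most likely fine" experience described above.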
Doug Baker - Sun UK - Support Engineer
2009-Jun-30 10:16 UTC
[zfs-discuss] ZFS, power failures, and UPSes
Monish Shah wrote:
> A related question: If you are on a UPS, is it OK to disable ZIL?
>
> The evil tuning guide says "The ZIL is an essential part of ZFS and
> should never be disabled." However, if you have a UPS, what can go
> wrong that really requires ZIL?

The UPS.

--
Dr Doug Baker
Sun Microsystems Systems Support Engineer.
UK Mission Critical Solution Centre.
Tel : 0870 600 3222
Monish Shah wrote:
> A related question: If you are on a UPS, is it OK to disable ZIL?

I think the answer to this is no. UPSes do fail. If you have two redundant units, the answer *might* be maybe. But prudence says *no*.

I have seen numerous UPS failures over the years, as well as cascading UPS failures caused by poorly engineered (or, even worse, poorly managed and maintained) electrical systems supporting server environments.

One would have to weigh the risk against the gain, and that would be *very* specific to each environment. IMO the only time it would be acceptable is if the data is disposable and recreating your pool and data is not an issue (and all of the accompanying downtime is acceptable).

Really, no one should disable the ZIL; look into write-optimized SSDs for the ZIL instead, particularly if you are interested enough in performance that you are considering disabling your ZIL.
On 06/30/09 03:00 AM, Andre van Eyssen wrote:
> That's just a warm-up; I'm sure people can add both the ZFS-specific
> reasons and also the fallacy that a UPS does anything more than
> mitigate one particular single point of failure.

Actually, they do quite a bit more than that. They create jobs, generate revenue for battery manufacturers, and for the techs who change batteries and do PM maintenance on the large units. Let's not forget that they add significant revenue to the transportation industry, given their weight for shipping.

In the last 28 years of doing this stuff, I've found a few times that the UPS has actually worked and lasted as long as the outage. Many other times, the unit has failed (circuits), or the batteries are beyond their service life. But really, something approaching 40% of the time they actually work out OK.

So they also create repair and recycling jobs. :-)
On Tue, 30 Jun 2009, Neal Pollack wrote:
> Actually, they do quite a bit more than that. They create jobs,
> generate revenue for battery manufacturers, and tech's that change
> batteries and do PM maintenance on the large units. Let's not

It sounds like this is a responsibility which should be moved to the US federal government, since UPSs create jobs.

> In the last 28 years of doing this stuff, I've found a few times
> that the UPS has actually worked and lasted as long as the outage.

I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up, since the problem requires human attention.

A standby generator is needed for any long outages.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Bob Friesenhahn wrote:
> It sounds like this is a responsibility which should be moved to the
> US federal goverment since UPSs create jobs.

Actually, I think UPS already employs some 410,000+ people, making it the 3rd largest private employer in the USA. (5th overall, if you include the Federal Gov't and the US Postal Service.)

<wink>

> I have seen UPSs help quite a lot for short glitches lasting seconds,
> or a minute. Otherwise the outage is usually longer than the UPSs can
> stay up since the problem required human attention.
>
> A standby generator is needed for any long outages.

As someone who has spent enough time doing data center work, I can attest to the fact that UPSes are really useful only as extremely-short-interval solutions. A dozen or so minutes, at best.

The best design I've seen was for an old BBN (hey, remember them!) site just outside of Cambridge, MA. It took in utility power, ran it through a conditioner setup, and then through this nice switch thing. The switch took three inputs: utility, a local diesel generator, and a line of marine batteries. The switch itself was internally redundant (which isn't hard to do, it's '50s tech), so you could draw power from any one (or even all 3 at once). Nothing really fancy; it was simple, with no semiconductor stuff to fail - just all '50s-ish hardwired circuitry. I don't even think there was a transistor in the whole shebang. Lots of capacitors, though. :-)

The gist of the whole thing was that if utility power was out more than 5 minutes, there was no good predictor of how long it would remain out. I saw a nice little graph that showed no real good prediction of outage time based on existing outage length (i.e. if the power has been out X minutes, you can expect it to be restored in Y minutes...). I suspect it was something like 20 years of accumulated data or so...

The end of this is simple: UPSes should give you enough time to start the gen-pack. If you are having problems with your gen-pack, you'll never have enough UPS time to fix it (and it's not cost-effective to try to make it so), so FIX YOUR GEN PACK BEFORE the outage. Which means - TEST it, and TEST it, and TEST it again!

For home use, I set my UPS to immediately shut down anything attached to it for /any/ service outage. Batteries large enough to handle anything more than a couple of minutes are frankly a fire hazard for the home, not to mention a maintenance PITA.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
On Tue, Jun 30, 2009 at 1:36 PM, Erik Trimble<Erik.Trimble at sun.com> wrote:
> The end of this is simple: UPSes should give you enough time to start the
> gen-pack. If you are having problems with your gen-pack, you'll never have
> enough UPS time to fix it (and, it's not cost-effective to try to make it
> so), so FIX YOUR GEN PACK BEFORE the outage. Which means - TEST it, and
> TEST it, and TEST it again!

Slight corollary -- just because you have a generator and test it doesn't mean you can assume you can get fuel in a timely manner (so still be prepared to shut down if needed). I have seen places whose DR plans completely rely on the assumption that there will never be any problems refueling their generators. However, last year after Ike hit, one of AT&T's central offices lost power because it ran out of fuel (and couldn't get refilled in time).
>>>>> "ms" == Monish Shah <monish at indranetworks.com> writes:
>>>>> "sl" == Scott Lawson <Scott.Lawson at manukau.ac.nz> writes:
>>>>> "np" == Neal Pollack <Neal.Pollack at Sun.COM> writes:

    ms> If you are on a UPS, is it OK to disable ZIL?

    sl> I have seen numerous UPS' failures over the years,

yeah at my place in NYC we've had more problems with the UPS than with the service. At the very least a UPS needs to switch off for new batteries every two years, and the raw service does not go out that often for me. It starts to make more sense to use a UPS if you have dual power supplies, dual UPS's, bypass switches. Or crappy aboveground power.

anyway, typical machines panic because of bugs a lot more often than either UPS or line problems.

**BUT THIS IS ALL BESIDE THE POINT**! The ZIL is for implementing fsync() for databases and also the part of NFS that allows servers to reboot without client data loss. It has *NOTHING TO DO* with losing your entire pool. Disabling the ZIL does not make catastrophic pool loss more likely, not even a little bit! Unfortunately some software developer decided to write a bunch of DIRE WARNINGS to SCARE PEOPLE INTO ASSUMPTIONS leading them to use the maximum amount of code of which said developer is justly proud, regardless of whether they're using it for the right reason or not.

oddly, I don't think disabling ZIL will make catastrophic loss more likely for databases running above the ZFS, either, because unlike non-COW filesystems ZFS never recovers to a state where writes appear to have happened out-of-order prior to the crash. Yes, disabling the ZIL could break the 'D' in ACID for databases running above that ZFS, but in a way that rolls them back in time, not makes them become corrupt. Running without ZIL is as if a snapshot were taken at each TXG commit time, and on reboot after a crash you recover to the most recent TXG-snapshot that fully committed, thus databases will be ``crash-consistent'' even without the ZIL, unless I'm mistaken.

Adding an SSD *does* make catastrophic pool loss more likely, because if you break the SSD and then export the pool, you can never import it again. so, adding an SSD for the ZIL as a suggestive good-little-boy alternative to disabling the ZIL makes catastrophic loss of the entire pool more likely, not less.

The advantage of rolling with ZIL is, if you're using NFS you should be able to crash and reboot the server without the clients noticing. Also MTA's that accept messages, databases that confirm orders and bookings, won't lose anything they've accepted or confirmed in the crash (if everything else works).

I wish ZIL could be enabled and disabled per filesystem instead of per kernel.
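To make the fsync() point concrete, here is a minimal sketch of the kind of write an MTA or database performs before it acknowledges a message or order: it does not return until the operating system reports the data is on stable storage. The function name and file layout are hypothetical, and the sketch says nothing about how any particular filesystem satisfies the fsync() call internally.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append a record and return only after fsync() has succeeded.

    An application that acknowledges work to a client (mail accepted,
    order confirmed) relies on this call not returning early.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        offset = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while offset < len(record):
            offset += os.write(fd, record[offset:])
        # Ask the OS to push the data to stable storage before we return.
        os.fsync(fd)
    finally:
        os.close(fd)
```

If the synchronous guarantee behind fsync() is switched off, a call like this can return before the data is durable, which is the "rolls them back in time" effect described above: acknowledged records from the last few seconds may be missing after a crash, while the file system itself stays consistent.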
On Jun 30, 2009, at 14:08, Bob Friesenhahn wrote:
> I have seen UPSs help quite a lot for short glitches lasting
> seconds, or a minute. Otherwise the outage is usually longer than
> the UPSs can stay up since the problem required human attention.
>
> A standby generator is needed for any long outages.

Can't remember where I read the claim, but supposedly if power isn't restored within about ten minutes, then it will probably be out for a few hours. If this 'statistic' is true, it would mean that your UPS should last (say) fifteen minutes, and after that you really need a generator.

At $WORK we currently have about thirty minutes' worth of juice at full load, but as time drags on and we start shutting down less essential stuff we can increase that. The PBX and security system have their own UPSes in their own racks, so there are two layers of battery there.
David Magda wrote:
> Can't remember where I read the claim, but supposedly if power isn't
> restored within about ten minutes, then it will probably be out for a
> few hours. If this 'statistic' is true, it would mean that your UPS
> should last (say) fifteen minutes, and after that you really need a
> generator.

Most UPSes from any vendor are designed to run for around 12 minutes at full load. So that would appear to back that claim up, and from my experience that is pretty much on the money...

> At $WORK we currently have about thirty minutes worth of juice at full
> load, but as time drags on and we start shutting down less essential
> stuff we can increase that. The PBX and security system have their own
> UPSes in their own racks, so there are two layers of battery there.

The problem comes when the power cut happens in the middle of the night and you aren't there. Then you need either an automated shutdown system triggered by traps from the UPS (shutting things down in the correct order) or a generator. About here the generator becomes a very good option. The no-generator scenario above needs to be tested regularly to maintain its validity, which is a royal pain in the neck. Gen sets are worth their weight in gold. I can't even think how many times in the last few years they have saved our bacon (through both planned and unplanned outages).
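The two decisions an automated shutdown system has to make, when to pull the trigger and in what order to stop hosts, can be sketched roughly as follows. This is an illustrative model only: the function names, parameters, the 60-second safety margin, and the example host names are all assumptions, and nothing here talks to a real UPS or reads real traps. It needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def should_shut_down(seconds_on_battery: float,
                     runtime_left_s: float,
                     shutdown_needs_s: float,
                     grace_s: float = 0.0,
                     safety_margin_s: float = 60.0) -> bool:
    """Decide whether to begin an orderly shutdown.

    grace_s=0 models the "shut down on /any/ outage" home policy;
    a larger grace_s rides out short glitches first.
    """
    if seconds_on_battery <= 0:
        return False  # still on utility power
    if runtime_left_s <= shutdown_needs_s + safety_margin_s:
        return True   # battery barely covers the shutdown itself
    return seconds_on_battery >= grace_s


def shutdown_order(depends_on: dict) -> list:
    """Return hosts in the order they should be stopped.

    depends_on maps each host to the set of hosts it needs while running.
    Dependencies start first, so they must stop last: reverse the startup order.
    """
    startup = list(TopologicalSorter(depends_on).static_order())
    return startup[::-1]


if __name__ == "__main__":
    # Hypothetical site: clients mount the file server, which needs its disk shelf.
    deps = {"nfs_client": {"file_server"}, "file_server": {"disk_shelf"}}
    if should_shut_down(seconds_on_battery=30, runtime_left_s=150, shutdown_needs_s=120):
        print(" -> ".join(shutdown_order(deps)))
```

As noted above, a scheme like this only stays valid if it is tested regularly, including the ordering and the time the shutdown actually takes.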
David Magda wrote:
> Can't remember where I read the claim, but supposedly if power isn't
> restored within about ten minutes, then it will probably be out for a
> few hours. If this 'statistic' is true, it would mean that your UPS
> should last (say) fifteen minutes, and after that you really need a
> generator.

Or run your systems off DC and get as much backup as you have room (and budget!) for batteries. I once visited a central exchange with 48 hours of battery capacity...

--
Ian.
Haudy Kazemi
2009-Jul-01 12:11 UTC
[zfs-discuss] ZFS, power failures, and UPSes (and ZFS recovery guide links)
Ian Collins wrote:
> Or run your systems of DC and get as much backup as you have room (and
> budget!) for batteries. I once visited a central exchange with 48
> hours of battery capacity...

The way Google handles UPSes is to have a small 12 V battery integrated with each PC power supply. When the machine is on, the battery has its charge maintained. It is not unlike a laptop in that it has a built-in battery backup, but it uses an inexpensive sealed lead-acid battery instead of lithium ion. Here is info along with photos of the Google server internals:
http://news.cnet.com/8301-1001_3-10209580-92.html
http://willysr.blogspot.com/2009/04/googles-server-design.html

(IIRC there have been power supply UPSes with an internal battery since at least the late 1980s. Either that or they were UPSes that fit inside the standard PC (AT) compatible desktop case, making the power protection system entirely internal to the computer. I think I saw these models one time while browsing late 1980s or early 1990s issues of PC Magazine that reviewed UPSes. They still exist...one company selling them is http://www.globtek.com/html/ups.html . A Google search for 'power supply built in UPS' would likely find more.)

I also did additional searches in the zfs-discuss archives and found a thread from mid-February, which led me to other threads.
It looks like there are still scattered instances where ZFS has not recovered gracefully from power failures or other failures, where it became necessary to perform a manual transaction group (txg) rollback. Here is a consolidated list of links related to manual uberblock transaction group (txg) rollback and similar ZFS data recovery guides, including undeleting:

Section 1: Nathan Hand's guide and related thread

Nathan Hand's guide to invalidating uberblocks (Dec 2008 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=85794
or http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg22153.html

Section 2: Victor Latushkin's guide and related threads

Thread: zpool unimportable (corrupt zpool metadata??) but no zdb -l device problems (Oct 2008 to Feb 2009 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=76960
or http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg19839.html

Repair report: Re: Solved - a big THANKS to Victor Latushkin @ Sun / Moscow
http://www.opensolaris.org/jive/message.jspa?messageID=289537#289537

Some recovery discussion by Victor: "zdb -bv alone took several hours to walk the block tree"
http://www.opensolaris.org/jive/message.jspa?messageID=292991#292991
or http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/022365.html
or http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg20095.html

Victor Latushkin's guide: "Thanks to COW nature of ZFS it was possible to successfully recover pool state which was only 5 seconds older than last unopenable one."
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/022331.html
or http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg20061.html

Section 3: reliability debates, recovery tool planning, uberblock info

Thread: Availability: ZFS needs to handle disk removal / driver failure better (August 2008 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=70811
or http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg19057.html

Thread: ZFS: unreliable for professional usage? (Feb 2009 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=91426
or http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg23833.html

Richard Elling's post that "uberblocks are kept in an 128-entry circular queue which is 4x redundant with 2 copies each at the beginning and end of the vdev. Other metadata, by default, is 2x redundant and spatially diverse."
http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg24145.html

Jeff Bonwick's post about Bug ID 6667683
http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg23961.html

Bug ID 6667683: need a way to rollback to an uberblock from a previous txg
Description: If we are unable to open the pool based on the most recent uberblock then it might be useful to try an older txg uberblock as it might provide a better view of the world. Having a utility to reset the uberblock to a previous txg might provide a nice recovery mechanism.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6667683

Uberblock information
http://blogs.sun.com/blogfinger/entry/zfs_and_the_uberblock
http://blogs.sun.com/blogfinger/entry/zfs_and_the_uberblock_part

Section 4: undeleting

Recovering removed file on zfs disk using a modified mdb and zdb (i.e. undelete)
http://mbruning.blogspot.com/2008/08/recovering-removed-file-on-zfs-disk.html

Re: [zfs-discuss] Forensic analysis [was: more ZFS recovery] (listed because forensic analysis tools often overlap with undeletion tools/data recovery tools)
http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg18557.html
http://opensolaris.org/os/project/forensics/ZFS-Forensics/

Thanks everyone for the input you've given so far.

-hk
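To make the rollback idea in the Bug ID 6667683 description concrete, here is a minimal toy sketch in Python. It is not ZFS code and does not reflect the on-disk format. The 128-entry ring size comes from the Richard Elling quote; the slot rule (txg modulo ring size), the `Uberblock` fields, and every function name are assumptions made up for illustration. The sketch tries the newest checksum-valid uberblock first and falls back to older txgs if the pool cannot be opened from it.

```python
from dataclasses import dataclass
from typing import List, Optional

RING_SLOTS = 128  # ring size taken from the Richard Elling quote


@dataclass
class Uberblock:
    txg: int           # transaction group this uberblock belongs to
    checksum_ok: bool  # hypothetical stand-in for checksum verification
    pool_opens: bool   # hypothetical stand-in for "pool is openable from here"


def write_uberblock(ring: List[Optional[Uberblock]], ub: Uberblock) -> None:
    """Store ub in its ring slot (assumed: txg modulo ring size)."""
    ring[ub.txg % RING_SLOTS] = ub


def newest_first(ring: List[Optional[Uberblock]]) -> List[Uberblock]:
    """Return the checksum-valid uberblocks, highest txg first."""
    valid = [ub for ub in ring if ub is not None and ub.checksum_ok]
    return sorted(valid, key=lambda ub: ub.txg, reverse=True)


def pick_with_rollback(ring: List[Optional[Uberblock]]) -> Optional[Uberblock]:
    """Try the newest uberblock first, then fall back to older txgs."""
    for ub in newest_first(ring):
        if ub.pool_opens:
            return ub
    return None  # nothing in the ring leads to an openable pool


def demo_ring() -> List[Optional[Uberblock]]:
    """Build a ring in which only the most recent txg is unopenable."""
    ring: List[Optional[Uberblock]] = [None] * RING_SLOTS
    for txg in range(100, 105):
        write_uberblock(ring, Uberblock(txg, checksum_ok=True, pool_opens=True))
    write_uberblock(ring, Uberblock(105, checksum_ok=True, pool_opens=False))
    return ring


if __name__ == "__main__":
    chosen = pick_with_rollback(demo_ring())
    print("opened pool at txg", chosen.txg if chosen else None)
```

In the demo ring, the newest uberblock is deliberately marked unopenable, so the selection steps back one txg. This mirrors the situation in Victor Latushkin's recovery report, where the recovered state was only a few seconds older than the unopenable one. The manual procedures in the linked guides are considerably more involved than this.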
On 07/ 1/09 05:11 AM, Haudy Kazemi wrote:
> Ian Collins wrote:
>> Or run your systems off DC and get as much backup as you have room
>> (and budget!) for batteries. I once visited a central exchange with
>> 48 hours of battery capacity...
>>
> The way Google handles UPSes is to have a small 12v battery integrated
> with each PC power supply. When the machine is on, the battery has
> its charge maintained. Not unlike a laptop in that it has a built in
> battery backup, but using an inexpensive sealed lead acid battery
> instead of lithium ion. Here is info along with photos of the Google
> server internals:
> http://news.cnet.com/8301-1001_3-10209580-92.html
> http://willysr.blogspot.com/2009/04/googles-server-design.html

which is of course why people claim that Google is less green than Detroit :-)

Each sealed lead-acid battery is good for about 2 years in those power supplies.
Google uses more than 10,000 servers, many more.
Do the math. That's many many tons of lead and acid in the dump every 24 months.....

>
> (IIRC there have been power supply UPSes since at least the late 1980s
> which had an internal battery. Either that or they were UPSes that
> fit inside the standard PC (AT) compatible desktop case, making the
> power protection system entirely internal to the computer. I think I
> saw these models one time while browsing late 1980s or early 1990s
> issues of PC Magazine that reviewed UPSes. They still exist...one
> company selling them is http://www.globtek.com/html/ups.html . A
> Google search for 'power supply built in UPS' would likely find more.)
>
> I also did additional searches in the zfs-discuss archives and found a
> thread from mid-February, which led me to other threads. It looks
> like there are still scattered instances where ZFS has not recovered
> gracefully from power failures or other failures, where it became
> necessary to perform a manual transaction group (txg) rollback.
Neal Pollack wrote:
> On 07/ 1/09 05:11 AM, Haudy Kazemi wrote:
>> Ian Collins wrote:
>>> Or run your systems off DC and get as much backup as you have room
>>> (and budget!) for batteries. I once visited a central exchange with
>>> 48 hours of battery capacity...
>>>
>> The way Google handles UPSes is to have a small 12v battery
>> integrated with each PC power supply. When the machine is on, the
>> battery has its charge maintained. Not unlike a laptop in that it
>> has a built in battery backup, but using an inexpensive sealed lead
>> acid battery instead of lithium ion. Here is info along with photos
>> of the Google server internals:
>> http://news.cnet.com/8301-1001_3-10209580-92.html
>> http://willysr.blogspot.com/2009/04/googles-server-design.html
>
> which is of course why people claim that Google is less green than
> Detroit :-)
>
> Each sealed lead-acid battery is good for about 2 years in those power
> supplies.
> Google uses more than 10,000 servers, many more.
> Do the math. That's many many tons of lead and acid in the dump every
> 24 months.....
>

Yes, but...

Lead acid batteries are one of (if not _the_) most-recycled items in the world. Something like 99.99% of all lead-acid batteries get fully recycled.

Personally, I don't like Google's solution. That's waaaay too many small batteries in everything. I'd be more in favor of something like a double marine battery every 2 racks. Lots more power, and those things are far easier to recondition and reuse, and much less labor intensive to install than 1 battery in each of 80+ servers.

All this said, I certainly do agree that the proper thing to do is move to full 12V DC inputs for all computers intended for data center use. Eliminating the need for non-12V rails on the internal components (i.e. getting rid of all the stuff that wants 5V) is really needed to make this efficient; that way, all you need in the way of a power supply is something that takes 48VDC input and breaks up the leads into 12V outputs. Really cheap, really efficient. Having a nice 48VDC bus for the rack (like Telco) is much more energy efficient and far easier to hook something like a small UPS to...

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Erik Trimble wrote:
> Neal Pollack wrote:
>> On 07/ 1/09 05:11 AM, Haudy Kazemi wrote:
>>> Ian Collins wrote:
>>>> Or run your systems off DC and get as much backup as you have room
>>>> (and budget!) for batteries. I once visited a central exchange
>>>> with 48 hours of battery capacity...
>>>>
>>> The way Google handles UPSes is to have a small 12v battery
>>> integrated with each PC power supply. When the machine is on, the
>>> battery has its charge maintained. Not unlike a laptop in that it
>>> has a built in battery backup, but using an inexpensive sealed lead
>>> acid battery instead of lithium ion. Here is info along with photos
>>> of the Google server internals:
>>> http://news.cnet.com/8301-1001_3-10209580-92.html
>>> http://willysr.blogspot.com/2009/04/googles-server-design.html
>>
>> which is of course why people claim that Google is less green than
>> Detroit :-)
>>
>> Each sealed lead-acid battery is good for about 2 years in those
>> power supplies.
>> Google uses more than 10,000 servers, many more.
>> Do the math. That's many many tons of lead and acid in the dump
>> every 24 months.....
>>
> Yes, but...
>
> Lead acid batteries are one of (if not _the_) most-recycled items
> in the world. Something like 99.99% of all lead-acid batteries get
> fully recycled.

Lead acid batteries are one of the most recycled items, both because it makes economic sense and because it is a legal requirement. According to Google's published results, they also have some of the most power efficient systems out there, with 90%+ efficient 12v power supplies and great Power Usage Effectiveness (PUE) numbers:
http://www.greenm3.com/2009/04/insights-into-googles-pue-a-laptop-approach-to-power-supplies-and-ups-for-servers-achieves-999-efficient-ups-system.html

I'm not convinced by the argument that Google is less green than Detroit, and from the smiley it appears this statement was meant as tongue-in-cheek humor.

> Personally, I don't like Google's solution. That's waaaay too many
> small batteries in everything. I'd be more in favor of something like
> a double marine battery every 2 racks. Lots more power, and those
> things are far easier to recondition and reuse - and much less labor
> intensive to install than 1 battery in 80+ servers.

With a good quality lead acid battery and an appropriate charge management system, the battery can last the business life of the server without replacement (e.g. 4 years). In that case the batteries could be considered 'hands off' and would be replaced as a single unit along with the server. Google has talked about using commodity hardware vs. traditional server equipment, and here it looks like they have similar-to-commodity hardware optimized for efficiency by leveraging their purchasing power (i.e. custom power supplies and OEM Gigabyte motherboards).

The experience people have with lead acid UPS batteries (and lithium phone and laptop batteries for that matter) dying in 2 years is primarily a function of poor quality batteries and/or poorly designed chargers that trickle charge the batteries to death. (Margins on official replacement batteries for UPSes, laptops and phones are high, leaving room in the market for refilled batteries and third party equivalents. There isn't much of an incentive to design in a good charging system.) The electric vehicle community knows this well and makes sure to use good charging and balancing systems to get their batteries to last for hundreds to thousands of cycles over several years (UPS systems don't need to cycle very often, but they do need deep cycle discharge capability). Some DIY electric vehicle enthusiasts successfully use revived batteries that served in UPSes in a former life.

More on lead acid charging and care:
Charging Basics: http://www.evdl.org/pages/hartcharge.html
Care Basics: http://www.evdl.org/pages/hartbatt.html

> All this said, I certainly do agree that the proper thing to do is
> move to full 12V DC inputs for all computers intended for data center
> use. Eliminating the need for non-12V (i.e. get rid of all the stuff
> that wants 5V) on the internal components is really needed to make this
> efficient; that way, all you need in the way of a power supply is
> something that takes 48VDC input, and breaks up the leads into 12V
> outputs. Really cheap, really efficient. Having a nice 48VDC bus for
> the rack (like Telco) is much more energy efficient and far easier to
> hook something like a small UPS to...

I think it will be hard for 48v in, 12v out DC/DC converters to compete in price and efficiency with a 240v AC in, 12v DC out power supply that is 90%+ efficient (a quick Google search for 'power supply 95% efficient' finds such models as well). 48v DC buses and batteries still need to be fed from a power supply of their own. Google's approach seems reasonable, assuming they have integrated a good battery charger/maintainer and are running off 240v AC.
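The efficiency comparison above comes down to multiplying the efficiencies of every conversion stage between the wall and the 12v rail. The small Python sketch below shows that arithmetic. Only the 90% figure for the single AC supply comes from the post; the rectifier and DC/DC converter efficiencies, the 300 W load, and the function names are hypothetical values chosen for illustration, not measurements.

```python
from math import prod
from typing import Sequence


def chain_efficiency(stages: Sequence[float]) -> float:
    """Overall efficiency of conversion stages in series (each 0..1)."""
    return prod(stages)


def wall_power(load_watts: float, stages: Sequence[float]) -> float:
    """Input power needed at the wall to deliver load_watts at the output."""
    return load_watts / chain_efficiency(stages)


# Single 240v AC -> 12v DC supply, using the "90%+" figure from the post.
AC_DIRECT = [0.90]

# 48v DC bus: assumed AC -> 48v rectifier, then assumed 48v -> 12v converter.
# Both numbers are hypothetical placeholders.
DC_BUS = [0.92, 0.95]

if __name__ == "__main__":
    load = 300.0  # hypothetical server load in watts
    for name, stages in (("AC direct", AC_DIRECT), ("48v DC bus", DC_BUS)):
        print(f"{name}: efficiency {chain_efficiency(stages):.3f}, "
              f"wall power {wall_power(load, stages):.1f} W")
```

With these placeholder numbers the two-stage chain ends up slightly less efficient than the single 90% supply, because the bus still has to be fed from a supply of its own. With better assumed stage efficiencies the result can go the other way, so the comparison depends on real figures for the specific hardware.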
On Thu 02/07/09 10:50 , Haudy Kazemi kaze0010 at umn.edu sent:

[getting way OT!]

> With a good quality lead acid battery and appropriate charge management
> system, the battery can last the business life of the server without
> replacement (e.g. 4 years).

5+ is typical for telco use.

> The experience people have with lead acid UPS batteries (and lithium
> phone and laptop batteries for that matter) dying in 2 years is
> primarily a function of poor quality batteries and/or poorly designed
> chargers that trickle charge the batteries to death.

In the case of a UPS, rippled to death!

--
Ian
On Thu, 2 Jul 2009, Ian Collins wrote:
> 5+ is typical for telco use.

Aah, but we start getting into rooms full of giant 2V wet lead acid cells and giant busbars the size of railway tracks.

--
Andre van Eyssen.
mail: andre at purplecow.org
jabber: andre at interact.purplecow.org
purplecow.org: UNIX for the masses http://www2.purplecow.org
purplecow.org: PCOWpix http://pix.purplecow.org
On Jul 1, 2009, at 11:45 AM, Neal Pollack <Neal.Pollack at Sun.COM> wrote:
> many more.
> Do the math. That's many many tons of lead and acid in the dump
> every 24 months.....

Why do you believe they aren't recycled?

Lead acid batteries are usually recycled very effectively.

khbkhb at gmail.com | keith.bierman at quantum.com
Sent from my iPod